
ITERATED FUNCTION SYSTEMS, THE CHAOS GAME AND INVARIANT MEASURES

FREDRIK STRÖMBERG

Contents

1. Intro
2. Preliminaries
2.1. General topology
2.2. Basic theory of metric and normed spaces
2.3. Functional analysis
2.4. Measure theory
2.5. Two important metric spaces
3. Development of the IFS theory
3.1. Basic notions of IFSs
3.2. Invariants of an IFS
3.3. Invariant measure and the chaos game
3.4. Universal IFS
4. Fractal dimensions
4.1. Introducing the Hausdorff and Packing measures
4.2. Hausdorff, packing and box-counting dimensions
4.3. IFS and Fractal dimensions
References

1. Intro

This paper is intended as an introduction to the theory of Iterated Function Systems (IFSs), and especially to the probabilistic approach, which leads to the notion of the invariant measure for an IFS.

By an Iterated Function System we will mean here a complete metric space X together with a finite set of continuous maps on X, S = {S_i}_{1≤i≤N}, N < ∞. There are some variations of the requirements made on the metric space X found in the literature; some authors prefer the stronger requirement of compactness, but here we shall try as far as we can to consider X to be just complete. This gives some idea about the 'Function System'; it is called an Iterated Function System because we will use it to iterate the maps of S in two ways. Either we form the function

W(A) = ⋃_{i=1,...,N} S_i(A), (A a compact subset of X)

and iterate it (the so-called deterministic approach), or we apply the maps S_i in a somewhat random manner (the probabilistic approach). These things will be explained later.
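As a rough illustration of the two approaches, here is a minimal Python sketch for a hypothetical contractive IFS on R² (the three affine maps below generate a Sierpinski-type set and are chosen purely for illustration; they are not prescribed by the text). The first function applies the set map W to a finite set of points; the second plays the random-iteration game that is treated in detail later.

    import random

    # Hypothetical example IFS: three contractions of R^2 with ratio 1/2
    # (maps chosen only for illustration).
    MAPS = [
        lambda p: (0.5 * p[0],        0.5 * p[1]),
        lambda p: (0.5 * p[0] + 0.5,  0.5 * p[1]),
        lambda p: (0.5 * p[0] + 0.25, 0.5 * p[1] + 0.5),
    ]

    def deterministic_step(points):
        """One application of W(A) = S_1(A) ∪ ... ∪ S_N(A) to a finite set A."""
        return {S(p) for p in points for S in MAPS}

    def chaos_game(n_iter=10000, x0=(0.0, 0.0)):
        """Apply randomly chosen maps S_i to a single point (probabilistic approach)."""
        x, orbit = x0, []
        for _ in range(n_iter):
            x = random.choice(MAPS)(x)
            orbit.append(x)
        return orbit

    A = {(0.0, 0.0)}
    for _ in range(8):          # a few deterministic iterations of W
        A = deterministic_step(A)
    orbit = chaos_game()        # points approximating the same attractor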

Date: June 3, 2001.

The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is defined to be a set with a Hausdorff dimension (to be defined later) different from its topological dimension. This definition is due to Mandelbrot, a pioneer of the theory of fractals.

The self-similarity of a set can loosely be defined as meaning just that the set looks similar at different scales; that is, if you look at a totally self-similar set with or without a magnifying glass you should not see any difference in the texture. This notion is also going to be defined more precisely later.

The theory of fractals can be used, for example, to model various physical phenomena, which was the main application of fractal theory in the early days. Today the main application is probably in information theory, mostly image analysis and image compression, and it is people in these fields that have developed most of the computational aspects of the theory.

The advantage of IFSs as a method of image compression/decompression is the ease with which one renders the picture from the IFS, using either the deterministic, iterative, approach or the probabilistic one. We will devote most of our study to the latter, the random walk, or as Barnsley [2] calls it, 'The Chaos Game'. These methods will be explained further down.

To help us translate an image into an IFS we have the so-called 'Collage Theorem', which tells us how good our approximation of the original image by the image generated by the IFS is.

Remark 1 (The structure of this document).
After the introduction we have the section 'Preliminaries', which contains elements of higher mathematics that some people interested in reading this paper might not have seen before. I only describe the material we need, and there are references to where more information can be found. Then we come to the sections that present the basic theory of IFSs. The Definitions are numbered within each section (e.g. Definition 2.1 is the first definition in Section 2, etc.), but otherwise the numbering is independent of the section number (e.g. Theorem 1 is the first theorem in the paper, etc.).

The reader experienced with the basic concepts of topology, functional analysis and measure theory can certainly skip most of the preliminaries and move to Section 3, where the main IFS theory is developed, and to Section 4, where we discuss fractal dimensions, including the Hausdorff, packing and box-counting dimensions. That section also contains an introduction to the Hausdorff measures.

2. Preliminaries

2.1. General topology

I will start by defining one of the most general types of space that one encounters in mathematics, the so-called topological space. If this seems too abstract, look at the next subsection for some examples of more concrete spaces (metric and normed). For a more comprehensive introduction to the area of topology see [21] or [1].

Definition 2.1 (Topology).
Let X be a non-empty set, and let T be a family of subsets of X. T is called a topology on X if

(a) ∅, X ∈ T,
(b) Uα ∈ T ∀ α ∈ A =⇒ ⋃_{α∈A} Uα ∈ T,
(c) U1, U2, . . . , Un ∈ T (a finite family of elements from T) =⇒ ⋂_{i=1}^{n} Ui ∈ T.

Notation 1. If T is a topology on X, then (X, T) is called a topological space. Sometimes one also says that X is a topological space, without an explicit reference to a topology, but it is then understood that there is indeed a topology on X.

Example 1.
If X is a set then Ttriv = {∅, X} is a topology on X; this topology is called the trivial topology on X.

Example 2.
If X is a set then Tdisc = {all subsets of X} (sometimes denoted by 2^X) is a topology on X. This topology is called the discrete topology on X.

Notation 2. If (X, T) is a topological space, then a set O ⊆ X is said to be open if O ∈ T, and a set B ⊆ X is said to be closed if the complement of B (i.e. X \ B) is open.

Notation 3. If (X, T) is a topological space and x ∈ X, then an open set O ∈ T is called a neighbourhood (or open neighbourhood) of x if x ∈ O.

Remark 2.
It is clear that if {Oα}α∈A is a collection of open sets, then the set O = ⋃_{α∈A} Oα is also open (in the same topology), and likewise if {Cβ}β∈B is a collection of closed sets, then the set C = ⋂_{β∈B} Cβ is also closed (in the same topology).

Definition 2.2.
If (X, T) is a topological space and A ⊆ X, then the closure of A, denoted by Ā, is defined by

Ā = ⋂ {B : A ⊆ B, B closed}.

This set is the smallest closed set containing A.

Definition 2.3.
If X is a topological space, and Y is a subset of X such that Ȳ = X, then Y is said to be a dense subset of X.

We have different means to define a topology on a given space X; most common is to use the concepts of an open base and an open subbase of a topology, which will be introduced now.

Definition 2.4.
Let (X, T) be a topological space. Then a collection U of open subsets of X is said to be an open base (or simply base) of the topology T if for all sets T ∈ T, T ≠ ∅, and all points x ∈ T, there exists a set U ∈ U s.t. x ∈ U ⊆ T. The sets in U are called basic open sets.

Definition 2.5.
Let (X, T) be a topological space. Then a collection S of open subsets of X is called an open subbase of the topology T if the set

{C ⊆ X : C is a finite intersection of sets from S, C = ⋂_{i=1}^{N} Si, Si ∈ S}

is an open base for the topology T.

Definition 2.6.
Let X be a set, and let T1 and T2 be two topologies on X. If T1 ⊆ T2, then T1 is said to be weaker than T2.

One of the things we achieve when working with topological spaces is the notion of continuous functions. In calculus the ordinary ε and δ definition of continuity gives the fact that if f is a continuous function on R, then for all open subsets U of R, the set f⁻¹(U) is open. This gives a motivation for the following definition.

Definition 2.7 (continuous function).
Let (X, T) and (Y, D) be topological spaces, and let f : X → Y be a function from X to Y. If for all D ∈ D we have f⁻¹(D) ∈ T, then f is said to be continuous.

Remark 3.
This implies that for the same spaces X and Y we may have different sets of continuous maps from X to Y, depending on the topologies T and D.

Example 3.
If (X, T) and (Y, D) are two topological spaces, let C(X, Y) be the set of continuous functions from X to Y; then the following is easily verified:

(1) if T = Tdisc = 2^X, or
(2) if D = Ttriv = {∅, Y},

then C(X, Y) is equal to the set of all functions from X to Y.

One can also do this the other way around. Given two sets X and Y and a family F = {fα}α∈A of functions from X to Y, what topologies can we have on X and Y that make all functions in F continuous?

Definition 2.8.
Let X be a set, and let (Y, D) be a topological space. Let F = {fα}α∈A be a family of functions from X to Y.

Then there is a weakest topology TF on X s.t. all functions in F, fα : X → Y, α ∈ A, are continuous. This topology is called the weak topology generated by F.

The separation properties of a topological space tell us if, for example, points can be separated from points by open sets, etc. This subject is an extensive part of topology, and I refer to the literature ([21] and [22]) for an introduction to this subject. Here I will present only one definition, namely that of the so-called Hausdorff spaces.

Definition 2.9.
A topological space X is a Hausdorff space if

∀ x ≠ y ∈ X, ∃ U, V ⊂ X s.t. U and V are open in X, x ∈ U, y ∈ V and U ∩ V = ∅.

Remark 4.
The only thing we need to know about Hausdorff spaces is that all metric spaces (see Def 2.10) are Hausdorff spaces.

2.2. Basic theory of metric and normed spaces

In this subsection we will consider some of the topological definitions and facts about metric spaces that we need for the development of the theory. We will also define normed spaces in this section, but the theorems specific to these spaces fit better in Section 2.3 about functional analysis.

The exposition of metric spaces will be self-contained, but for a wider introduction I refer the reader to [20], and for the more advanced theory plus an introduction to general topology take a look in [21].

Since most of the spaces we shall use are metric, some of the topological statements below are only formulated for the special case of metric spaces, even though the concept of topological spaces that was introduced in the previous subsection would generalize them.

First we recall the definition of a metric space and a metric (distance function).

Definition 2.10 (Metric space).
Let X be a space, and let d : X × X → R be a function that satisfies

(1) d(x, y) ≥ 0 ∀ x, y ∈ X, and d(x, y) = 0 ⇐⇒ x = y,
(2) d(x, y) = d(y, x) ∀ x, y ∈ X (symmetry),
(3) d(x, z) ≤ d(x, y) + d(y, z) ∀ x, y, z ∈ X (triangle inequality).

Then d is called a metric on X, and (X, d) is called a metric space.

We have the notion of open and closed sets in a metric space.

Definition 2.11.
Let (X, d) be a metric space, and let O ⊂ X. We define the open ball with centre at x ∈ X and radius r > 0 by

B(x, r) = {y ∈ X | d(y, x) < r}.

Then the set O ⊂ X is said to be open in the metric space (X, d) if for every point x ∈ O there exists a real number r > 0 s.t. the open ball B(x, r) is contained in O.

A set A ⊂ X is said to be closed in the metric space (X, d) if the complement of A, X \ A, is open.

Then we shall define normed spaces. They differ significantly from the metric spaces in that they are vector spaces. This means that we can add two points of a space and multiply points by scalars. But later we will also see that every normed space has a well-defined metric.

Definition 2.12 (Normed space).
A normed space N is a vector space with a norm defined on it. The norm is a real valued function x → ‖x‖, defined for all x ∈ N, with the following properties

(1) ‖x‖ ≥ 0,
(2) ‖x‖ = 0 ⇔ x = 0,
(3) ‖αx‖ = |α| ‖x‖,
(4) ‖x + y‖ ≤ ‖x‖ + ‖y‖,

for all x, y ∈ N and all α ∈ R.

Remark 5.
A norm on a space X defines a metric d on X given by

(1) d(x, y) = ‖x − y‖, x, y ∈ X.

Many of the metric spaces we encounter are also normed spaces (cf. Rn, where the norm is the usual absolute value |x|, x ∈ Rn, and the metric is the Euclidean metric, defined by d(x, y) = |x − y|, x, y ∈ Rn).

Remark 6.
Note that all metric spaces have a natural topology defined by the metric, i.e. let Td = {O ⊆ X | O is open in (X, d)}; then Td is a topology on X, usually called the topology on X induced by d.

Also note that all normed spaces have a natural topology defined by the metric induced by the norm.

Remark 7.
I repeat the following fact from the last subsection: a metric space X is a Hausdorff space (see Def 2.9) under the metric topology.

Here come some rather technical looking definitions that are provided for later use.

Definition 2.13 (Diameter of a set).
Let (X, d) be a metric space and let A ⊆ X, A ≠ ∅. Then we define the diameter of A, denoted by diam(A), as

diam(A) = sup{d(x, y) | x, y ∈ A}.

(Some authors use the notation |A| for the diameter of A, cf. [10].)

Definition 2.14 (δ-cover).
Let (X, d) be a metric space, δ > 0, and let A ⊆ X. Then we say that {Ai}_{i=1}^∞ is a δ-cover of A if

Ai ⊆ X, A ⊆ ⋃_{i=1}^∞ Ai, and diam(Ai) < δ.

Definition 2.15 (ε-net).
Let (X, d) be a metric space, B(y, r) = {x ∈ X | d(x, y) < r} (an open ball with center y and radius r), and A ⊆ X. If

(i) A is finite, and
(ii) X = ⋃_{a∈A} B(a, ε),

then we say that A is an ε-net.
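For a finite point set a simple greedy procedure produces an ε-net; the Python sketch below is only an illustration under that finiteness assumption (the point set, ε, and the use of the Euclidean metric are hypothetical choices, not taken from the text).

    import math

    def euclid(p, q):
        return math.dist(p, q)

    def greedy_eps_net(points, eps):
        """Greedily pick centres until every point lies within eps of some centre."""
        net = []
        for p in points:
            if all(euclid(p, a) >= eps for a in net):
                net.append(p)          # p is not yet covered, make it a centre
        return net                     # finite A with points ⊆ ⋃_{a∈A} B(a, eps)

    X = [(i / 10.0, j / 10.0) for i in range(11) for j in range(11)]
    A = greedy_eps_net(X, eps=0.25)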

Definition 2.16 (Completeness).
Let (X, d) be a metric space. If for every sequence {x_i}_{i=1}^∞ with the property

∀ ε > 0 ∃ Mε > 0 s.t. ∀ m, n > Mε we have d(xm, xn) < ε

(i.e. for every Cauchy sequence) there exists an x ∈ X such that lim_{i→∞} x_i = x (in the metric d, that is, d(x_i, x) → 0 as i → ∞), then we say that the metric space (X, d) is complete.

Notation 4. A complete normed space is called a Banach space.

Another useful notion of a topological space is compactness.

Definition 2.17 (Compactness).
Let X be a topological space, and let A ⊆ X.

If for every collection {Aγ}γ∈Γ (Γ any index set) s.t.

Aγ is an open subset of X ∀ γ ∈ Γ, and ⋃_{γ∈Γ} Aγ ⊇ A,

there exists a finite subset {γ1, . . . , γk} ⊆ Γ, k < ∞, s.t.

⋃_{i=1}^{k} Aγi ⊇ A

(that is, if every open cover of A has a finite subcover), then we say that A is compact.

Example 4.
Consider a compact metric space (X, d). We fix r > 0, and take as a cover the sets {Ax : x ∈ X}, where Ax = B(x, r) = {y ∈ X | d(y, x) < r}. That is, we take the set of open balls with a fixed radius and centers at all points of X. Clearly this collection covers X, since ∀ x ∈ X, x ∈ Ax. The definition now tells us that there exists N(r) < ∞ s.t. we can cover our space X with only N(r) balls.

From [21] we get the following theorem, about equivalent definitions of a compact metric space.

Theorem 1.
The following are equivalent:

(a) X is compact,
(b) X is sequentially compact, that is, every sequence in X has a convergent subsequence,
(c) X has the Bolzano–Weierstrass property, that is, every infinite subset of X has a limit point.

Remark 8.
From (b) above we deduce that every compact metric space is complete, since if a subsequence of a Cauchy sequence converges, the sequence itself must converge.
Proof:
Let {x_i}_{i=1}^∞ be a Cauchy sequence and let {x_{i_k}}_{k=1}^∞ be a subsequence converging to x ∈ X. Take ε > 0; the convergence implies ∃ Mε > 0 s.t. if k > Mε, then d(x_{i_k}, x) < ε/2. Since {x_i}_{i=1}^∞ is Cauchy, ∃ Nε > 0 s.t. if m, n > Nε then d(xm, xn) < ε/2. Thus if we take n, k > max(Nε, Mε) we get d(xn, x) ≤ d(xn, x_{i_k}) + d(x_{i_k}, x) < ε/2 + ε/2 = ε, which means that xn converges to x.

Definition 2.18.
A topological space X is said to be locally compact if every point x ∈ X has an open neighbourhood with compact closure.

Example 5.
Rn is locally compact, since all closed balls B̄(x, r) = {y ∈ Rn | |y − x| ≤ r} are compact.

Example 6.
Here come some examples of complete normed and metric spaces that are very fundamental; the proofs that these are complete can be found in any elementary book on functional analysis (e.g. [16]).

(1) Rn (or Cn), with the norm ‖x‖ = |x| (or the corresponding induced metric).
(2) C[a, b], the space of real (or complex) valued continuous functions defined on the closed interval [a, b], together with the so-called supremum norm

‖f‖ = ‖f‖∞ = sup_{x∈[a,b]} |f(x)|.

Note that convergence in this norm is the usual uniform convergence of continuous functions, and as we have learnt in calculus, the uniform limit of a sequence of continuous functions is continuous.

Definition 2.19 (Contraction).
Let (X, d) be a metric space. A mapping T : X → X is called a contraction on X if there is a positive real number α < 1 s.t. for all x, y ∈ X

d(Tx, Ty) ≤ α · d(x, y).

The number α is called the contractivity factor of the map T.

Theorem 2 (Banach Fixed Point Theorem).
Let (X, d) be a complete metric space, X ≠ ∅. Let T : X → X be a contraction on X.

Then T has a unique fixed point xf in X (i.e. ∃! xf ∈ X s.t. Txf = xf). Furthermore, define T¹x = Tx and T^k x = T(T^{k−1} x), k = 2, . . .; then for any x ∈ X we have lim_{n→∞} T^n x = xf in the metric d.
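As a quick numerical illustration of the iteration T^n x → xf, here is a small Python sketch; the contraction T(x) = x/2 + 1 on R (contractivity factor 1/2, fixed point 2) is a made-up example, not one taken from the text.

    def fixed_point(T, x0, tol=1e-12, max_iter=10_000):
        """Iterate a contraction T from x0 until successive iterates differ by < tol."""
        x = x0
        for _ in range(max_iter):
            x_next = T(x)
            if abs(x_next - x) < tol:
                return x_next
            x = x_next
        return x

    T = lambda x: 0.5 * x + 1.0      # contraction on R with fixed point x_f = 2
    print(fixed_point(T, x0=100.0))  # ≈ 2.0, independently of the starting point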

For a proof of this theorem see [16] or [2]. Another fixed point theorem is the following, from [21].

Theorem 3 (The Schauder Fixed Point Theorem).
If X is a convex and compact subset of a Banach space B, then every continuous map from X into X has a fixed point.

There are other important types of maps in metric spaces, for example similitudes, Lipschitz and bi-Lipschitz mappings.

Definition 2.20.
Let (X, d) be a metric space, and let S : X → X be a map. If ∃ r ∈ R (r > 0) fixed s.t. ∀ x, y ∈ X, d(S(x), S(y)) = r · d(x, y), then S is said to be a similitude. (The number r is called the scaling factor of the similitude, for obvious reasons.)

Definition 2.21.
Let (X, d) be a metric space, and let f : X → X be a map. If ∃ c ∈ R (c ≥ 0) fixed s.t. ∀ x, y ∈ X, d(f(x), f(y)) ≤ c · d(x, y), then f is said to be a Lipschitz mapping, and the constant c is said to be a Lipschitz constant of f.

Notation 5. Sometimes the smallest Lipschitz constant of f is denoted by Lip(f).

Definition 2.22.
Let (X, d) be a metric space, and let f : X → X be a map. If ∃ c1, c2 ∈ R (0 ≤ c1 ≤ c2) fixed s.t. ∀ x, y ∈ X, c1 d(x, y) ≤ d(f(x), f(y)) ≤ c2 d(x, y), then f is said to be a bi-Lipschitz mapping, and the constants c1 and c2 are sometimes called a lower and an upper Lipschitz constant of f.

Notation 6. Let µr : Rn → Rn be the homothety µr(x) = rx (r ≥ 0), and let τb : Rn → Rn be the translation τb(x) = x − b.

That O : Rn → Rn is an orthonormal transformation means that the matrix that can be associated with it is orthogonal (with determinant +1 or −1). I.e. O is a rotation, proper (a usual rotation, with det = +1) or improper (involving a reflection or space inversion component, with det = −1).

In [13] the following proposition about similitudes is proved.

Proposition 1.
S : Rn → Rn is a similitude if and only if S = µr ∘ τb ∘ O, for some homothety µr, translation τb, and orthonormal transformation O.
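A small numpy sketch of this decomposition (the specific r, b and O below are arbitrary choices made only for illustration): composing a homothety, a translation and an orthogonal map as in Proposition 1 gives a map that scales all distances by exactly r.

    import numpy as np

    r = 0.5                                   # scaling factor (hypothetical)
    b = np.array([1.0, -2.0])                 # translation vector (hypothetical)
    theta = np.pi / 3                         # a proper rotation, det(O) = +1
    O = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    def S(x):
        """S = mu_r ∘ tau_b ∘ O, with tau_b(x) = x - b as in Notation 6."""
        return r * (O @ x - b)

    x, y = np.array([0.3, 0.7]), np.array([-1.2, 2.5])
    print(np.linalg.norm(S(x) - S(y)), r * np.linalg.norm(x - y))  # both ≈ r·|x − y|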

2.3. Functional analysis

In this section we are going to introduce a few concepts we need from the area of functional analysis. To get a better overview of this large and interesting subject, see [16] or [14].

Definition 2.23 (The dual space).
Let N be a normed space. Then the set of all bounded linear functionals on N constitutes a normed space with the norm given by

‖f‖ = sup_{x∈N, x≠0} |f(x)| / ‖x‖ = sup_{x∈N, ‖x‖=1} |f(x)|.

This space is called the dual space of N, and is usually denoted by N′.

About the dual space we know the following (stated and proved in [16]).

Theorem 4 (Dual space).
Let N be a normed space. Then N′ is a complete normed space, that is, a Banach space.

Remark 9.
Let N be a normed space; then N ⊆ N′′. This is easily seen, since if f ∈ N′, then φx(f) = f(x) clearly defines a bounded linear functional on N′.

It was previously stated (Subsection 2.2) that a normed space can be given a topology generated by the metric induced by the norm, but there are other topologies that are also natural to use in different applications. The idea is to use the relation of duality.

Definition 2.24 (weak topology).
Let N be a normed space. Then the family F = N′ (where N′ is the dual of N) generates a topology on N as in Def. 2.8. This topology is called the weak topology on N.

Definition 2.25.
Let N′ be the dual of a normed space N. Then the family of functions F = {φx : x ∈ N} generates a topology on N′. This topology is called the weak*-topology on N′.

A proof of the following theorem can be found in [19].

Theorem 5 (The Banach-Alaoglu Theorem).
Let B be a Banach space. Then the unit ball {f ∈ B′ | ‖f‖ ≤ 1} in B′ is compact in the weak*-topology.

2.4. Measure theory

In order to understand what fractals really are and what structures they inhabit, we will find that these questions are best dealt with if we introduce the concept of a measure. In this chapter the general theory of measures will be introduced, and later we will see how to associate certain measures with IFSs.

For a solid (and formal) introduction to the full theory of measures and integration (plus more advanced material) see [4]. Those who do not want the whole theory but more examples could consult Barnsley's introduction to measures in Chapter IX of [2].

We will start with some basic definitions of fields and σ-fields. Then we will define what a measure is, and state some results.

2.4.1. Fields and Sigma-Fields

A field is a collection of sets that we will use to define a measure on. That is, we must have some sets that the measure can work on.

Definition 2.26.
Let X be a space. Let F be a nonempty class of subsets of X s.t.

(1) A, B ∈ F =⇒ A ∪ B ∈ F,
(2) A ∈ F =⇒ X \ A ∈ F,
(3) X ∈ F.

Then F is called a field.

Clearly (1) and (2) imply (3), which also is equivalent to the requirement that ∅ ∈ F, but I included it in the definition anyway (see also [4]).

Remark 10.
What is here called a field is also sometimes called an algebra (cf. [4]), and the same goes for the terms σ-field and σ-algebra, defined below.

Now we will define a useful class of fields related to sets of subsets of a space. For a proof of the following theorem see [2].

Theorem 6.
Let X be a non-empty set, and let G be a nonempty set of subsets of X. Let FG be the set of subsets of X which can be built up from finitely many sets in G using the operations of union, intersection and complement with respect to X. Then FG is a field.

Definition 2.27.
The field FG above is called the field generated by G.

We see from the definition above that a field is a set of subsets that is closed under the operation of taking the union of two sets, and the requirement that the complement is also in the field gives (by the De Morgan rules, i.e. X \ (A ∪ B) = (X \ A) ∩ (X \ B) and X \ (A ∩ B) = (X \ A) ∪ (X \ B)) that the intersection of two elements of the field is in the field. By induction we can deduce that all finite intersections and unions of elements of a field belong to the field.

In practical situations we might want to take the union of a countable number of elements in a field; to make sure that we can do this we restrict ourselves to the fields where this can be done, the so-called σ-fields.

Definition 2.28 (σ-field).
Let F be a field such that

Ai ∈ F for i = 1, 2, . . . =⇒ ⋃_{i=1}^∞ Ai ∈ F.

Then F is called a σ-field (or sigma-field).

Remark 11.
Note that given any field, there is always a minimal, or smallest, σ-field which contains it.

Definition 2.29.
Let G be a set of subsets of a non-empty set X. The minimal σ-field that contains the field generated by G defined in Def 2.27 is called the σ-field generated by G.

Definition 2.30.
Let (X, d) be a metric space. Let B denote the σ-field generated by the open subsets of X. B is called the Borel field associated with the metric space. An element of B is called a Borel subset of X.

For the Borel field on a compact metric space we have the following theorem, which is stated and proved in [2].

Theorem 7.
Let (X, d) be a compact metric space. Then the associated Borel field B is generated by a countable set of balls.

In the case of R (and analogously Rn) we have the following proposition about which sets generate B(R). The proposition is stated and proved in [4].

Proposition 2.
The σ-field B(R) of Borel subsets of R is generated by each of the following collections of sets:

(a) the collection of all closed subsets of R;
(b) the collection of all subintervals of R of the form (−∞, b];
(c) the collection of all subintervals of R of the form (a, b],

where −∞ ≤ a < b ≤ ∞.

2.4.2. Measures

Let X be a set, and let F be a σ-field on X. Let µ : F → [0,+∞]. We now have the following definitions.

Definition 2.31.
The function µ is said to be countably additive if it satisfies

µ(⋃_{i=1}^∞ Ai) = ∑_{i=1}^∞ µ(Ai)

for each infinite sequence {Ai}_{i≥1} ⊂ F of disjoint sets.

Definition 2.32 (measure).
If µ is countably additive and also satisfies µ(∅) = 0, then µ is said to be a measure (or a countably additive measure) on F.

Notation 7. If X is a set and F is a σ-field on X, then the pair (X, F) is called a measurable space, and if µ is a measure on F, then the triple (X, F, µ) is often called a measure space.

Notation 8. A measure on B(Rn) is called a Borel measure.

Example 7.
Let X be an arbitrary set, and let F be a σ-field on X. Define a function µ : F → [0,+∞] by letting µ(A) = n if A is a finite set with n elements, and letting µ(A) = +∞ if A is an infinite set. Then µ is a measure, usually called the counting measure on (X, F).

Definition 2.33.
Let X be a set, and let F be a σ-field on X. Then ν : F → [−∞,+∞] is called a signed measure (or real measure) if ν is countably additive in the same sense as above. A signed measure is finite if neither +∞ nor −∞ occurs among its values.

The only thing that differs between measures and signed measures is the requirement that measures only attain non-negative values. Therefore one sometimes calls measures positive measures.

The connection between signed and positive measures is given by the Jordan Decomposition Theorem, stated and proved in [4].

Theorem 8 (Jordan Decomposition Theorem).
Every signed measure is the difference of two positive measures, at least one of which is finite.

Remark 12.
One usually writes the so-called Jordan decomposition of a measure µ as µ = µ⁺ − µ⁻, where µ⁺ and µ⁻ are positive measures, called the positive part and the negative part of µ.

Definition 2.34.
The variation of the signed measure µ is the positive measure |µ| defined by |µ| = µ⁺ + µ⁻.

Definition 2.35.
If (X, F, µ) is a measure space, and A ⊆ X is s.t. µ(A) = 0, then A is said to be µ-negligible or of measure zero (with respect to µ).

Definition 2.36.
If (X, F, µ) is a measure space, then a property that holds for the points of a set B ⊂ X is said to hold µ-almost everywhere, or µ-a.e., if the set X \ B is µ-negligible.

Definition 2.37 (mass).
Let µ be a measure on X; then we define the mass of µ as µ(X).

Definition 2.38.
If µ is a measure on a space X, then we say that µ is finite if µ(X) < +∞.

Definition 2.39.
The total variation ‖µ‖ of the signed measure µ is defined by ‖µ‖ = |µ|(X).

From [4] we get the following proposition.

Proposition 3.
Suppose that (X, F) is a measurable space. Let M(X, F, R) be the collection of all finite signed measures on (X, F). Then M(X, F, R) is a vector space over R, and the total variation gives a norm on this space. Furthermore, this normed space is complete.

Notation 9. We will denote by B(X, F, R) the vector space of all bounded real-valued F-measurable functions on X.

The most important measure on Rn is the so-called Lebesgue measure, which we will construct now, in the same way as in [4]. A standard technique for constructing measures is to start with outer measures, which are like measures, except that they are defined on all subsets.

Definition 2.40 (outer measure).
Let X be a set, and let 2^X denote the set of all subsets of X. An outer measure on X is a function µ* : 2^X → [0,+∞] such that

(1) µ*(∅) = 0,
(2) if A ⊂ B ⊂ X, then µ*(A) ≤ µ*(B), and
(3) if {An}n≥1 is an infinite sequence of subsets of X, then

µ*(⋃_{n≥1} An) ≤ ∑_{n≥1} µ*(An).

Note that a measure can fail to be an outer measure.

Definition 2.41.
Let X be a set, and let µ* be an outer measure on X. A subset B of X is called µ*-measurable (or measurable with respect to µ*) if

µ*(A) = µ*(A ∩ B) + µ*(A ∩ (X \ B))

holds for each subset A of X.

In [4] we also have the following proposition stated and proved, which tells us on which subsets we can expect an outer measure to be a measure.

Proposition 4.
Let X be a set, let µ* be an outer measure on X, and let Mµ* be the collection of µ*-measurable subsets of X; then

(1) Mµ* is a σ-field, and
(2) the restriction of µ* to Mµ* is a measure on Mµ*.

Now we can first define the Lebesgue outer measure, which then will be restricted to the Lebesgue measure.

The construction is done in Rn, and so we first need to know what an n-dimensional interval (or cube) is.

Definition 2.42.
The set A ⊆ Rn is an n-dimensional interval if A is of the form A = I1 × · · · × In, where I1, . . . , In are intervals in R, finite or infinite, closed, open or neither closed nor open. The volume of the n-dimensional interval I1 × · · · × In is, as expected, the product of the lengths of the intervals I1, . . . , In, and will be denoted by vol(I1 × · · · × In).

Definition 2.43.
Let A ⊂ Rn, and let CA be the set of all sequences {Ri}i≥1 of bounded and open n-dimensional intervals for which A ⊂ ⋃_{i≥1} Ri. Then the Lebesgue outer measure on Rn, denoted by λ* (or λ*n), is defined by

λ*(A) = inf { ∑_{i=1}^∞ vol(Ri) : {Ri}i≥1 ∈ CA }.

We have the following proposition, proved in [4].

Proposition 5.
The Lebesgue outer measure λ* on Rn is an outer measure, and it assigns to each n-dimensional interval its volume.

Notation 10. We will denote the collection of Lebesgue measurable subsets by Mλ*.

Definition 2.44 (Lebesgue measure).
The restriction of the Lebesgue outer measure on Rn to the set Mλ* is called the Lebesgue measure on Rn, and is denoted by λ (or λn).

The following propositions, proved in [4], give us some knowledge of the Lebesgue measure on Rn.

Proposition 6.
Every Borel subset of Rn is Lebesgue measurable. In particular, the restriction λn|B(Rn) is a Borel measure.

Proposition 7.
Lebesgue measure is the only measure on (Rn, B(Rn)) that assigns to each n-dimensional interval its volume.

Definition 2.45 (regular).
Let F be a σ-field on Rn that includes B(Rn). A measure µ on (Rn, F) is regular if

(1) each compact subset K of Rn satisfies µ(K) < +∞,
(2) each set A in F satisfies µ(A) = inf{µ(U) : A ⊂ U and U is open}, and
(3) each open subset U of Rn satisfies µ(U) = sup{µ(K) : K ⊂ U and K is compact}.

Now another proposition from [4] about λ.

Proposition 8.
The Lebesgue measure on (Rn, Mλ*) is regular.

We can also define the support of a regular measure, after the following proposition, which is stated and proved for Hausdorff spaces (see Def 2.9) in [4].

Proposition 9.
Let X be a locally compact metric space, let F be a σ-field on X that includes B(X), and let µ be a regular measure on (X, F). Then the union of all the open subsets of X that have measure zero under µ is itself an open set that has measure zero under µ.

Definition 2.46.
Let X, µ and F be as in Prop. 9, and let O = ⋃ {U : U an open subset of X with µ(U) = 0}. Then the support of µ, denoted by supp(µ), is defined by

(2) supp(µ) = X \ O.

2.4.3. Integration

In this subsection we will learn how to integrate functions with respect to measures. Most of the propositions and definitions are taken from [4].

Proposition 10.
Let (X, F) be a measurable space (see Notation 7), and let A ∈ F. For a function f : A → [−∞,+∞] the following conditions are equivalent:

(1) ∀ t ∈ R, the set {x ∈ A : f(x) ≤ t} ∈ F,
(2) ∀ t ∈ R, the set {x ∈ A : f(x) < t} ∈ F,
(3) ∀ t ∈ R, the set {x ∈ A : f(x) ≥ t} ∈ F,
(4) ∀ t ∈ R, the set {x ∈ A : f(x) > t} ∈ F.

Definition 2.47.
Let (X, F) be a measurable space, and let A ∈ F. A function f : A → [−∞,+∞] is measurable with respect to F if it satisfies one, and hence all, of the conditions of Prop 10. If F = B(Rn), then the function is called Borel measurable, or a Borel function.

Definition 2.48.
A function is called simple if it has only finitely many values. If (X, F) is a measurable space, then we denote the space of all real-valued simple F-measurable functions by S, and all non-negative functions in S by S⁺.

Here are some examples of measurable functions.

Example 8.
(1) If f : Rn → R is continuous, then f is Borel measurable.
(2) The characteristic function χA is measurable with respect to F ⇐⇒ A ∈ F.
(3) If I is a subinterval of R, and f : I → R is non-decreasing, then f is Borel measurable.
(4) Let (X, F) be a measurable space, let f : X → [−∞,+∞] be simple, and let α1, . . . , αn be the values of f. Then f is F-measurable if and only if {x ∈ X : f(x) = αi} ∈ F holds for i = 1, . . . , n.

The following proposition (from [4]) is also useful.

Proposition 11.
Let (X, F) be a measurable space, let A ∈ F, and let {fn}n≥1 be a sequence of [−∞,+∞]-valued measurable functions on A. Then

(a) the functions lim sup_{n→∞} fn and lim inf_{n→∞} fn are measurable,
(b) the functions sup_{n∈N} fn and inf_{n∈N} fn are measurable,
(c) the function lim_{n→∞} fn is measurable.

Now we can start defining the integral for simple functions.

Definition 2.49.
Let (X, F, µ) be a measure space. If f ∈ S⁺ is given by f = ∑_{i=1}^{n} ai χ_{Ai}, where a1, . . . , an are non-negative real numbers, and A1, . . . , An are disjoint subsets of X that belong to F, then ∫ f dµ, the integral of f with respect to µ, is defined by

(3) ∫ f dµ = ∑_{i=1}^{n} ai µ(Ai).
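As a concrete illustration of formula (3), the following Python sketch integrates a simple function against a measure given by its values on finitely many disjoint sets; the measure and the sets are invented for the example.

    # A hypothetical measure on disjoint sets A_1, A_2, A_3, given by its values mu(A_i).
    mu = {"A1": 0.5, "A2": 1.5, "A3": 2.0}

    # The simple function f = 3*chi_{A1} + 1*chi_{A2} + 4*chi_{A3}, as (a_i, A_i) pairs.
    f = [(3.0, "A1"), (1.0, "A2"), (4.0, "A3")]

    def integral_simple(f, mu):
        """Formula (3): the integral of a simple function is sum_i a_i * mu(A_i)."""
        return sum(a * mu[A] for a, A in f)

    print(integral_simple(f, mu))   # 3*0.5 + 1*1.5 + 4*2.0 = 11.0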

Definition 2.50.
Let (X, F, µ) be a measure space, and let f be an arbitrary [0,+∞]-valued F-measurable function on X. We define the integral of f by

∫ f dµ = sup { ∫ g dµ : g ∈ S⁺ and g ≤ f }.

Now we have a proposition that tells us how to integrate non-negative measurable functions.

Proposition 12.
Let (X, F, µ) be a measure space, let f be a [0,+∞]-valued F-measurable function on X, and let {fn}_{n=1}^∞ be a non-decreasing sequence of functions in S⁺ for which f(x) = lim_{n→∞} fn(x) holds at each x ∈ X. Then

(4) ∫ f dµ = lim_{n→∞} ∫ fn dµ.

2.4.4. Some relations between spaces of functions and spaces of measures

Notation 11. If (X, F, µ) is a measure space, and f is an F-measurable function, we will write

µ(f) = ∫_X f dµ.

Remark 13.
We clearly have that if µ ∈ M(X, F, R), then the formula f → µ(f) defines a linear functional on B(X, F, R), and if f ∈ B(X, F, R) the formula µ → µ(f) defines a linear functional on M(X, F, R).

Notation 12. Let X be a locally compact Hausdorff space. We will denote by Mr(X, R) the set of all finite signed regular Borel measures on X. Clearly Mr(X, R) is a closed subspace of M(X, B, R), so Mr(X, R) is also a Banach space under the total variation norm.

Definition 2.51.
Let X be a locally compact Hausdorff space, and let f be a continuous real-valued function on X. Then f is said to vanish at infinity if for each ε > 0 there is a compact subset K of X such that |f(x)| < ε ∀ x ∉ K.

Notation 13. Let CB(X) denote the space of all bounded continuous real-valued functions on X.

Notation 14. Let CV(X) be the space of all continuous real-valued functions on X that vanish at infinity.

Definition 2.52.
Let X be any space, and let f : X → R be a function. The support of f, denoted by supp(f), is defined as the closure of the set {x ∈ X | f(x) ≠ 0}.

Notation 15. Let X be a topological space and let C0(X) be the space of all continuous real-valued functions on X that have compact support (i.e. supp(f) is compact in X).

Now we clearly see that C0(X) ⊆ CV(X) ⊆ CB(X).

Remark 14.
We equip the vector spaces CB(X), CV(X) and C0(X) with the so-called supremum norm,

‖f‖ = ‖f‖∞ = sup_{x∈X} |f(x)|, for f in any of the above spaces,

and whenever we consider any of these spaces, or subspaces of them, we will assume that they are normed spaces with this norm.

Furthermore, the following proposition, proved in [4], tells us how we can approximate functions that vanish at infinity.

Proposition 13.
Let X be a locally compact Hausdorff space. Then C0(X) is a dense subspace of CV(X).

[4] also proves the following proposition.

Proposition 14.
Let X be a locally compact Hausdorff space. Then CV(X) is a Banach space.

Now comes the theorem that most of this section has been aiming at, about the connection between certain functions and certain measures. This theorem is given and proved as Thm 7.3.5 in [4].

Theorem 9 (The Riesz Representation Theorem).
Let X be a locally compact Hausdorff space. Then the map that takes µ ∈ Mr(X, R) to the functional f → ∫ f dµ is an isometric isomorphism of the Banach space Mr(X, R) onto the dual of CV(X), the space of all real-valued continuous functions on X that vanish at infinity.

Remark 15.
This theorem thus says that the space of finite regular signed Borel measures on X, Mr(X, R), and the dual of the space of functions on X that vanish at infinity, CV(X)′, are isometrically isomorphic, which means that they are in some sense the same space.

Notation 16. If N is a normed space with dual space N′, and if N′ is isometrically isomorphic to the space M, then it is usual to say that M is the dual of N. Hence we will say that Mr(X, R) is the dual of CV(X).

2.4.5. Probability measures

Definition 2.53.
Let (X, d) be a separable metric space. A probability measure on X is a finite regular Borel measure µ on X s.t. µ(X) = 1. The probability measures we will consider here must satisfy a condition that is not too restrictive in practice, but that still is necessary for the proofs.

If a is any fixed point in X we define a subset of the probability measures by

P(X) = {µ | ∫_X d(x, a) dµ(x) < ∞}.

Remark 16.
Note that in the case X = R, a = 0 this means that the random variables defined by the measure have finite first moment.

Remark 17.
The explicit condition put on the probability measures above is used in for example [23], and in a more general case, where the integral is not of d(x, a) but of some other function c, in [17].

Hutchinson requires that the probability measures have bounded support, in which case the condition above is of course satisfied. Barnsley in [2] does not require that the probability measures have bounded support, but he is working in a compact metric space, so the supports are bounded automatically.

If (X, d) is a separable metric space we can turn P(X) into a metric space with the so-called Hutchinson metric dH.

Definition 2.54.
Let (X, d) be a separable metric space, and let µ, ν ∈ P(X); then define dH by

(5) dH(µ, ν) = sup{µ(f) − ν(f) : f ∈ Lip1(X)},

where

LipL(X) = {f : X → R continuous, with |f(x) − f(y)| ≤ L · d(x, y) ∀ x, y ∈ X}

is the set of all Lipschitz-continuous functions from X to R with Lipschitz constant L.

Remark 18.
What we call the Hutchinson metric here is also called the (Vasserstein) Wasserstein metric, or the Kantorovich metric (in [17]). There are a lot of other metrics that can be introduced on this space (the Kolmogorov and Levy metrics, etc.), but the Hutchinson metric is the best suited for the IFS theory. For further theory of probability metrics, see [17].

Remark 19.
If µ, ν ∈ P(R) have distribution functions Fµ, Fν then

dH(µ, ν) = ∫_{−∞}^{∞} |Fµ(x) − Fν(x)| dx.

This is used as a definition in [17].
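The formula in Remark 19 is easy to evaluate numerically for discrete measures on R. The Python sketch below computes dH for two finitely supported probability measures by integrating |Fµ − Fν| exactly between the jump points (the two example measures are hypothetical).

    def hutchinson_distance_1d(mu, nu):
        """dH(mu, nu) = ∫ |F_mu(x) - F_nu(x)| dx for finitely supported measures on R.
        mu and nu are dicts {point: probability}."""
        points = sorted(set(mu) | set(nu))
        total, F_mu, F_nu = 0.0, 0.0, 0.0
        for left, right in zip(points[:-1], points[1:]):
            F_mu += mu.get(left, 0.0)          # the CDFs are constant on [left, right)
            F_nu += nu.get(left, 0.0)
            total += abs(F_mu - F_nu) * (right - left)
        return total

    mu = {0.0: 0.5, 1.0: 0.5}                  # hypothetical two-point measure
    nu = {0.0: 0.25, 2.0: 0.75}                # another hypothetical measure
    print(hutchinson_distance_1d(mu, nu))      # 0.25*1 + 0.75*1 = 1.0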

In order to prove that P(X), together with the Hutchinson metric, is a metric space we need some results concerning duality between P(X) and the Lipschitz functions. Except that we need to consider a narrower space of measures, we will arrive at results similar to Theorem 9.

The remark below is a simplification of Section 5.3, pp. 108-114, in [17], and the details can be found there.

Remark 20.
Let (X, d) be a separable metric space.

(1) First we must know which functions we consider. For f : X → R define

‖f‖d = sup_{x≠y} |f(x) − f(y)| / d(x, y),

and set L = {f : ‖f‖d < ∞}. Then ‖ · ‖d is a seminorm on L, and ‖f‖d = 0 if and only if f is a constant function. Define L0 to be the quotient space of L modulo the constant functions, i.e. L0 = L/∼, where ∼ is the equivalence relation f ∼ g ⇐⇒ f − g is constant. The elements of L0 are equivalence classes (denoted by [f]), but calculations will be performed on representatives, and we will write f instead of [f]. Note that ‖f‖d coincides with Lip(f) if f is Lipschitz continuous, and specifically, Lip1(X) can be seen as the unit ball in L0.

(2) Now we must find a measure space to match the function space above. Let a ∈ X be fixed and define M0 = M0(X) ⊂ Mr(X, R) as the linear space of finite signed Borel regular measures m on X that satisfy

m(X) = 0, and ∫_X d(a, x) d|m| < ∞.

For each m ∈ M0 define B(m) as the set of all finite measures b on X × X s.t.

b(A × X) − b(X × A) = m(A), for all Borel sets A ⊆ X.

Define a function m ↦ ‖m‖w on M0 by

‖m‖w = inf { ∫_{X×X} d(x, y) b(dx, dy) | b ∈ B(m) }.

Then this is a seminorm on M0, and via the dual space of M0 it will be shown that this norm gives us the Hutchinson metric.

Now for f ∈ L and m ∈ M0 we see that

|f(x)| ≤ |f(x) − f(a)| + |f(a)| ≤ ‖f‖d d(x, a) + |f(a)| = K1 · d(x, a) + K2,

so f is |m|-integrable, and induces a linear map φf : M0 → R by

φf(m) = ∫ f dm.

Since m(X) = 0 we see that if f ∼ g then φf = φg.

Rachev shows that the linear map D : (L0, ‖ · ‖d) → (M0′, ‖ · ‖′w) defined by D(f) = φf is an isometric isomorphism.

Then consider the dual operator D′ of D, D′ : (M0′′, ‖ · ‖′′w) → (L0′, ‖ · ‖′d), which is also an isometric isomorphism, and the natural isometric isomorphism T from (M0, ‖ · ‖w) into its second dual (M0′′, ‖ · ‖′′w). So the composition D′ ∘ T is an isometry of (M0, ‖ · ‖w) onto (L0′, ‖ · ‖′d), and from the definition of the dual operator we have D′m(f) = m(Df) = φf(m) = ∫_X f dm; hence the norm in M0 is given by

‖m‖w = ‖D′ ∘ T(m)‖′d = ‖D′m‖′d = sup{D′m(f) | ‖f‖d ≤ 1} = sup{ ∫ f dm | ‖f‖d ≤ 1 }.

Lemma 1. Let m ∈ M0; then

‖m‖w = sup { ∫ f dm | ‖f‖d ≤ 1 },

and there exists f ∈ L with ‖f‖d = 1 s.t. ‖m‖w = ∫ f dm.

Proof.
The Hahn-Banach Theorem (see [21]) tells us that there exists a linear functional φ ∈ M0′ s.t. φ(m) = ‖m‖w and ‖φ‖′w = 1, and since (M0′, ‖ · ‖′w) is isometrically isomorphic with (L0, ‖ · ‖d), φ = φf for some f ∈ L with ‖f‖d = ‖φ‖′w = 1.

The following Lemma is similar to a result in [12].

Lemma 2. Let (X, d) be a separable metric space, and let µ be a finite regular Borel measure. Then for any B ∈ B(X) and all ε > 0 there exists a function f ∈ L s.t.

|µ(f) − µ(B)| = |∫_X f dµ − µ(B)| < ε.

Proof.
We know that µ is regular (by definition), and this implies that there exists an open set U, B ⊂ U ⊂ X, s.t. |µ(B) − µ(U)| < ε/2. Now define fk : X → R by

fk(x) = min(k · d(x, X \ U), 1), x ∈ X.

Then fk ∈ L, fk(x) ≤ fk+1(x) ∀ k, and fk(x) → χU(x) as k → ∞, the indicator function of U. The monotone convergence theorem (cf. [4]) implies that

lim_{k→∞} ∫_X fk dµ = ∫_X χU dµ = µ(U),

and hence there exists N > 0 s.t. |∫_X fk dµ − µ(U)| < ε/2 for k ≥ N. Now fN ∈ L, and

|∫_X fN dµ − µ(B)| ≤ |µ(fN) − µ(U)| + |µ(U) − µ(B)| < ε.

Proposition 15.
If µ, ν are finite regular Borel measures and µ(f) = ν(f) for all f ∈ Lip(X), then µ = ν.

Proof.
Let m = µ − ν and apply the preceding Lemma to m. Let B be any Borel set and let ε > 0 be arbitrary; then there exists f ∈ L s.t.

|m(B) − m(f)| < ε,

and if f ∈ L, then g = f/‖f‖d ∈ Lip(X), so |m(f)| = ‖f‖d |m(g)| = 0 =⇒ |m(B)| < ε, and since ε > 0 was arbitrary, m(B) = 0, hence µ(B) = ν(B). So µ and ν agree on all Borel sets in X, and hence they are equal.

Now we are ready to prove that the Hutchinson metric really gives us a metric.

Theorem 10.
Let (X, d) be a separable metric space. Then (P(X), dH) is also a metric space.

Proof. (1) Clearly f ≡ 1 ∈ Lip1(X), and µ(1) − ν(1) = µ(X) − ν(X) = 0, so dH(µ, ν) ≥ 0.

(2) Now we must prove that if dH(µ, ν) = 0, then µ = ν. Assume dH(µ, ν) = 0. This implies that µ(f) = ν(f) for all f ∈ Lip(X). The preceding Proposition 15 thus implies that µ = ν.

(3) The next thing to prove is that dH is symmetric, i.e. dH(µ, ν) = dH(ν, µ).

Observe that f ∈ Lip1(X) ⇔ −f ∈ Lip1(X), so we get

dH(µ, ν) = sup{µ(f) − ν(f) : f ∈ Lip1(X)}
         = sup{µ(−f) − ν(−f) : f ∈ Lip1(X)}
         = sup{ν(f) − µ(f) : f ∈ Lip1(X)}
         = dH(ν, µ).

(4) Finally, dH(µ, ν) < ∞, since the measures in P(X) are required to satisfy, for any a ∈ X,

∫_X d(x, a) dµ(x) < ∞, and ∫_X d(x, a) dν(x) < ∞.

If φ ∈ Lip1(X) we have

µ(φ) − ν(φ) = µ(φ − φ(a) + φ(a)) − ν(φ − φ(a) + φ(a))
            = µ(φ − φ(a)) − ν(φ − φ(a))
              (since µ(φ(a)) = φ(a)µ(X) = φ(a) = φ(a)ν(X) = ν(φ(a)))
            = ∫_X (φ(x) − φ(a)) dµ(x) − ∫_X (φ(x) − φ(a)) dν(x)
            ≤ ∫_X d(x, a) dµ(x) + ∫_X d(x, a) dν(x) < ∞.

Definition 2.55.
Instead of just looking at P(X) as a metric space with the metric topology, we can introduce another topology on P(X), which is sometimes more useful. The usual topology to consider here is the weak*-topology, generated by the subbase (cf. Def 2.5) consisting of all sets of the form Ua,b,φ = {µ : a < µ(φ) < b}, for a, b ∈ R and φ ∈ CB(X). When we talk about the weak*-topology on P(X) it is this topology that we mean.

Remark 21.
Convergence of probability measures in the weak*-topology is indicated by a w* over the arrow, and we have the following equivalence:

µn →w* µ0 as n → ∞ ⇐⇒ |µn(f) − µ0(f)| → 0 as n → ∞, ∀ f ∈ CB(X).

Under some circumstances it is possible to prove that the weak*-topology and the metric topology induced by dH are equivalent on some subspaces of P(X).

Proposition 16.
If X is a complete and locally compact metric space, then P(X) is compact in the weak*-topology generated by CB(X).

Proof.
If X is any complete and locally compact metric space, we have that CB(X) is a normed space with the supremum norm,

‖f‖ = sup_{x∈X} |f(x)|, f ∈ CB(X).

Then the dual CB(X)′ is a Banach space (see [14]), with norm given by

‖µ‖ = sup_{‖f‖≤1} µ(f) = |µ|(X).

This is the total variation norm introduced in Prop. 3. Theorem 9 implies that CV(X)′ is isometrically isomorphic with Mr(X, R), and CV(X) ⊂ CB(X) =⇒ CB(X)′ ⊂ CV(X)′, so CB(X)′ is a complete subspace of Mr(X, R). It is also clear that the functional on CB(X) defined by f → ∫_X f dµ, where µ ∈ P(X), belongs to CB(X)′. Thus P(X) ⊂ CB(X)′, and

P(X) ⊆ {µ ∈ Mr(X, R) : ‖µ‖ = 1} ⊂ {µ ∈ Mr(X, R) : ‖µ‖ ≤ 1}.

So P(X) is a closed subset of the closed unit ball in Mr(X, R). Now the Banach-Alaoglu Theorem (Theorem 5) tells us that the closed unit ball in Mr(X, R) is compact in the weak*-topology; hence P(X) is also compact in the weak*-topology.

The following theorem gives us an explicit example of when the weak*-topology and the Hutchinson metric topology are equivalent.

Remark 22.
Note that if (X, d) is a separable metric space and µ, ν ∈ P(X), then m = µ − ν ∈ M0, since

(1) m(X) = µ(X) − ν(X) = 1 − 1 = 0, and
(2) if a ∈ X is fixed, then

∫_X d(x, a) d|m| ≤ ∫_X d(x, a) d(µ + ν) < ∞,

by the requirements on P(X).

Also note that dH(µ, ν) = ‖µ − ν‖w, where we now know that there exists a function f ∈ Lip1 s.t. dH(µ, ν) = ∫_X f d(µ − ν).

All this will be used in the following theorem.

Theorem 11.
Let (X, d) be a compact metric space. Then, on the space of probability measures P(X), the weak*-topology is equivalent with the metric topology generated by the Hutchinson metric, i.e. the Hutchinson metric metrizes the weak*-topology.

Proof.

Claim 11.1.
The weak*-topology on P(X) is metrizable.

Proof of Claim 11.1.
From Doob [5] we get that on P(X) the weak*-topology is equivalent with the metric topology given by the metric dM defined by

dM(µ, ν) = ∑_{n=1}^∞ (1/2^n) · |µ(fn) − ν(fn)| / (1 + |µ(fn) − ν(fn)|), µ, ν ∈ P(X),

where {fn}_{n=1}^∞ is a countable dense subset of CB(X) (the space CB(X) is first shown to be separable).

Claim 11.2.
If ν, µ ∈ P(X) then there exists f0 ∈ Lip1(X) s.t.

dH(µ, ν) = sup{µ(f) − ν(f) : f ∈ Lip1(X)} = µ(f0) − ν(f0).

Proof of Claim 11.2.
See Remark 22.

Claim 11.3.
If µn, µ0 ∈ P(X) for n = 1, 2, . . ., then

µn → µ0 in the weak*-topology ⇐⇒ dH(µn, µ0) → 0 as n → ∞.

Proof of Claim 11.3.
Define mn = µn − µ0; then mn ∈ M0, and by the remark above dH(µn, µ0) → 0 as n → ∞ if and only if ‖mn‖w → 0. And to say that µn converges in the weak-* topology to µ0 is equivalent to saying that mn converges to 0 in the weak-* topology.

But the weak-* topology on M0 is easily recognized when we identify M0 with L0′ via the map D′ ∘ T from the remark. Since M0 is now considered as the dual of L0, the weak-* topology is generated by the elements of L0. We now identify the equivalence classes in L0 with their representatives in L, since m(f) is constant on these equivalence classes. Since convergence for all f ∈ L implies convergence for those f with ‖f‖d ≤ 1, weak-* convergence implies convergence in the Hutchinson metric. That is,

mn → 0 in the weak-* topology as n → ∞
⇐⇒ |mn(f)| → 0 as n → ∞, ∀ f ∈ L
=⇒ |mn(f)| → 0 as n → ∞, ∀ f ∈ Lip1(X).

Conversely,

‖mn‖w → 0 as n → ∞
⇐⇒ |mn(f)| → 0 as n → ∞, ∀ f ∈ Lip1(X),

and if g is any function in L then, with f = g/‖g‖d ∈ Lip1(X), we get by the linearity of the integral that

|mn(g)| = ‖g‖d |mn(f)| → 0 as n → ∞,

so convergence in the Hutchinson metric implies convergence in the weak-* topology.

In a metric topology a set is closed if and only if it is sequentially closed, that is, all convergent sequences of points in the set converge to a point in the set. From Claim 11.1 we know that the weak-* topology is metrizable, so Claim 11.3 implies that the sets which are closed in the Hutchinson metric topology are also closed in the weak-* topology. And since a topology is determined by its closed sets, the two topologies must be equal.

2.5. Two important metric spaces

There are two metric spaces that have a deep connection with our theory of IFSs. The first is the space of compact subsets of X, which is important because it is the space to which the fractals belong. The second is the space of infinite sequences of integers from the set {1, . . . , N}. Why this space is important will take a few sections to see.

2.5.1. The space H(X) of compact non-empty subsets of X

Above all, there is one space besides X that is of great importance in the IFS theory, namely the space of non-empty compact subsets of X, H(X). Formally we have the following definition.

Definition 2.56 (H(X)). The space H(X) is defined by

(6) H(X) = {A ⊆ X | A is compact and A ≠ ∅}.

Now we can use the fact that the underlying space X is equipped with a metric d (see Def. 2.10) to define a metric h (the Hausdorff metric) on H(X).

Definition 2.57 (Hausdorff metric). We define the Hausdorff metric on H(X) by

(7) h(A, B) = max{ max_{x∈A} d_p(x, B) , max_{y∈B} d_p(y, A) },   for all A, B ∈ H(X),

where d_p(w, C) = min_{z∈C} d(w, z) (the distance from the point w to the set C).
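For finite point sets the minima and maxima in (7) can be evaluated directly. The following is a small illustrative sketch of my own (not taken from the cited references), assuming the Euclidean metric on R² and that numpy is available:

    import numpy as np

    def hausdorff_distance(A, B):
        """Hausdorff distance h(A, B) between two finite point sets in R^2,
        following Definition 2.57 with d the Euclidean metric."""
        A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
        # d_p(x, B) = min_{z in B} d(x, z) for every x in A, and symmetrically
        d_A_to_B = np.array([np.min(np.linalg.norm(B - a, axis=1)) for a in A])
        d_B_to_A = np.array([np.min(np.linalg.norm(A - b, axis=1)) for b in B])
        return max(d_A_to_B.max(), d_B_to_A.max())

    # Example: hausdorff_distance([(0, 0), (1, 0)], [(0, 0), (1, 1)]) returns 1.0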

Now we have a metric space (H(X), h), and in [2] the following proposition is proved.

Proposition 17.

(i) (X, d) complete ⇒ (H(X), h) is complete;
(ii) (X, d) compact ⇒ (H(X), h) is compact.

2.5.2. The space Ω(N)

Definition 2.58. Let N ∈ N, N ≥ 2, be fixed. Then we define the space Ω(N) by

Ω(N) = ∏_{p=1}^∞ {1, . . . , N}


= { ω = (i_1, i_2, . . . ) | i_j ∈ {1, . . . , N} for j = 1, 2, . . . },

and we call this space the code space on N symbols.

We will also have use for the set of finite sequences of length k obtained by cutting off a sequence from Ω(N), so we give this set a name.

Definition 2.59. Let N ∈ N, N ≥ 2, be fixed. Then we define the space Ω_k(N) by

Ω_k(N) = { (i_1, . . . , i_k) | i_j ∈ {1, . . . , N} }.

Remark 23. Ω(N) is given a lot of different names in the literature, but we follow Barnsley [2] and call it the code space on N symbols. Hutchinson [13] calls it the Cantor set on N symbols, which is also appropriate since, for example, Ω(2) is isomorphic to the classical 'middle third' Cantor set (that is, there is a one-to-one correspondence between points in Ω(2) and the classical Cantor set), and likewise for other N.

Notation 17. When N is clear from the context, as when we are working with an IFS with N maps, we write Ω and Ω_k for Ω(N) and Ω_k(N) respectively.

Now we are going to make Ω(N) a metric space.

Theorem 12. Let N ≥ 2, let ω = (x_1, x_2, . . . ), ω′ = (y_1, y_2, . . . ) ∈ Ω(N), and define

(8) d_Ω(ω, ω′) = ∑_{i=1}^∞ |x_i − y_i| / (N + 1)^i.

Then (Ω(N), d_Ω) is a compact (and therefore complete) metric space.

For a proof see [3].
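In practice a finite truncation of the series (8) is all one ever computes, since the tail after k terms is bounded by (N + 1)^{-k}. A tiny illustrative sketch (names and truncation length are my own choices):

    def d_omega(omega1, omega2, N, terms=50):
        """Truncated code-space metric of Eq. (8): sum |x_i - y_i| / (N+1)^i."""
        return sum(abs(x - y) / (N + 1) ** i
                   for i, (x, y) in enumerate(zip(omega1[:terms], omega2[:terms]), start=1))

    # d_omega([1, 1, 1, 1], [1, 2, 2, 2], N=2) = 1/9 + 1/27 + 1/81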

3. Development of the IFS theory

3.1. Basic notions of IFSs

First we will define what an IFS is, and some variations thereof.

Definition 3.1 (IFS). Let (X, d) be a complete metric space, and let S = {S_i}_{1≤i≤N}, N < ∞, be a finite collection of continuous maps on X.

Then we call (X,S) an Iterated Function System, or an IFS.

Definition 3.2 (Contractive IFS). Let (X, d) be a complete metric space, and let S = {S_i}_{1≤i≤N}, N < ∞, be a set of contractions on X, i.e. for each 1 ≤ i ≤ N we have d(S_i x, S_i y) ≤ s_i d(x, y) for all x, y ∈ X and some s_i ∈ [0, 1).

Then we call (X, S) a contractive IFS (or a hyperbolic IFS), and we say that this IFS has contractivity factor s = max{s_1, . . . , s_N}.


Notation 18. Throughout this text, unless anything else is stated, we will let the letter N stand for the number of maps in the IFS (X, S) under consideration.

Definition 3.3 (Probabilistic IFS). Let (X, S) be an IFS, and let ~p = (p_1, . . . , p_N) be an N-dimensional probability vector, that is, p_i ∈ [0, 1] for all i and ∑_{1≤i≤N} p_i = 1.

Then we call (X, S, ~p) a probabilistic IFS (or an IFS with probabilities).

The above terminology is standard (cf. [15] and [2]). In practice one almost always works with contractive IFSs.

Definition 3.4 (Hutchinson operator). We define the Hutchinson operator W on H(X) by

(9) W(A) = ⋃_{i=1}^N S_i(A),   for A ∈ H(X).
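On a computer, W can only be applied to finite approximations of a compact set. The following minimal sketch is my own illustration (the maps S_i are assumed to be given as Python callables); it can be iterated as in Notation 19 below, and the number of points grows like N^k with the number k of iterations:

    def hutchinson(maps, points):
        """One application of W(A) = S_1(A) ∪ ... ∪ S_N(A) to a finite set of points."""
        return [S(p) for S in maps for p in points]

    def iterate_W(maps, points, k):
        """W^k(A): iterate the Hutchinson operator k times on a point-cloud approximation."""
        for _ in range(k):
            points = hutchinson(maps, points)
        return points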

Notation 19. Since we will later iterate the Hutchinson operator W we will usethe following notation

W^0(A) = A,  W^1(A) = W(A),  and  W^k(A) = W(W^{k−1}(A)) for k ≥ 2.

We will sometimes use the notation

S_{i_1,i_2,...,i_p} = S_{i_1} ∘ S_{i_2} ∘ · · · ∘ S_{i_p}, for S_{i_j} ∈ S, and A_{i_1,i_2,...,i_p} = S_{i_1,i_2,...,i_p}(A), for A ⊆ X.

Remark 24. In connection with the notation above we can see (cf. [13]) that

W^p(A) = ⋃_{i_1,i_2,...,i_p ∈ {1,...,N}} A_{i_1,i_2,...,i_p}.

3.2. Invariants of an IFS

In this section we will find out what the connection really is between the IFS theory and the fractals.

3.2.1. Invariant set

Definition 3.5 (Invariant set). We say that a set A ⊆ X is invariant with respect to the IFS (X, S) if W(A) = A, where W is the Hutchinson operator defined above. So to say that a set is invariant with respect to an IFS is equivalent to saying that it is a fixed point of the corresponding Hutchinson operator.

The following theorem gives us a hint of when an invariant set can be expected, and is one of the two basic results in the deterministic IFS theory (the other one is the Collage Theorem). Of course one might notice that it is just a special case of a basic property of contractions on complete metric spaces, namely the existence


of an attractive fixed point. So one can say that this is a corollary to the Banach fixed point theorem (Thm. 2).

Theorem 13 (Existence of invariant sets [3], [13]). Let (X, S) be a contractive IFS.

Then the operator W given by Equation (9) is a contraction on the metric space (H(X), h), and there exists a unique invariant set A_S ∈ H(X). Furthermore, for all A ∈ H(X) we have

lim_{n→∞} W^n(A) = A_S.

Proof. To prove this theorem one first shows that W is a contraction with respect to the Hausdorff metric, and then the rest follows from the Banach fixed point theorem, see Thm. 2.

Remark 25. If E ∈ H(X) is such that S_i(E) ⊂ E for i = 1, . . . , N, we see that W^k(E) ⊂ W^{k−1}(E), so the sequence W^k(E) of non-empty compact sets is decreasing, and the intersection ⋂_{k=1}^∞ W^k(E) is thus compact and non-empty. Furthermore, since the sequence is decreasing,

W( ⋂_{k=1}^∞ W^k(E) ) = ⋂_{k=1}^∞ W^{k+1}(E) = ⋂_{k=1}^∞ W^k(E).

Thus the intersection is invariant under W, and by uniqueness we have

A_S = ⋂_{k=1}^∞ W^k(E)

for any set E ∈ H(X) s.t. S_i(E) ⊂ E, i = 1, . . . , N.

This invariant set also has the important property of being self-similar in the sense that

A_S = ⋃_{i=1}^N S_i(A_S),

so A_S can be covered by smaller images of itself. Hutchinson (in [13]) takes this concept further and obtains a way to 'address' each point in A_S by a point in the code space on N symbols (see 2.5.2). That is,

(10) x ∈ A_S =⇒ ∃ω ∈ Ω(N), ω = (ω_1, ω_2, . . . ), with x = ⋂_{p=1}^∞ (A_S)_{ω_1,...,ω_p}.

Furthermore, this addressing is done in a continuous way, so points with nearby addresses (in the metric defined by Eq. (8)) appear close in A_S. Moreover, all points in code space correspond to points in A_S. For a proof of these statements see 2.5.2. This strongly suggests that there is a link between the code space on N symbols and any IFS with N maps. This is developed later in 3.4.


A basic property of contractions in complete metric spaces is the following theorem, which is stated as a lemma in [2] and which is placed in this section because of the corollary that follows.

Theorem 14. Let (Y, d) be a complete metric space, let f : Y → Y be a contraction mapping with contractivity factor 0 ≤ s < 1, and let the fixed point asserted by Theorem 2 be y_f ∈ Y. Then

d(y, y_f) ≤ (1 − s)^{−1} · d(y, f(y)),   for all y ∈ Y.

Proof. Since the function d(a, ·) is continuous for fixed a ∈ Y,

d(y, y_f) = d( y, lim_{n→∞} f^n(y) ) = lim_{n→∞} d(y, f^n(y))

(by the triangle inequality)

≤ lim_{n→∞} ∑_{m=1}^n d(f^{m−1}(y), f^m(y))

(since f^{m−1} is clearly a contraction with contractivity factor s^{m−1})

≤ lim_{n→∞} d(y, f(y)) (1 + s + · · · + s^{n−1})

(the geometric series expansion)

≤ (1 − s)^{−1} d(y, f(y)).

A special but interesting case of the above theorem is when X is a complete metric space, (Y, d) = (H(X), h), and the contraction map is f = W, the Hutchinson operator. This is the version used in connection with deterministic IFSs, and it is then called the Collage Theorem (cf. Barnsley [2]). Since we are interested in IFS theory here, I state this special case as a corollary of the above theorem.

Corollary 15 (The Collage Theorem). Let (X, d) be a complete metric space, let L ∈ H(X), and let ε ≥ 0 be given. Choose a hyperbolic IFS (X, S) with contractivity factor 0 ≤ s < 1 so that

h( L, ⋃_{n=1}^N S_n(L) ) ≤ ε,

where h is the Hausdorff metric. Then

h(L, A) ≤ ε / (1 − s),

where A is the attractor of the IFS. Equivalently, we have

h(L, A) ≤ (1 − s)^{−1} h( L, ⋃_{n=1}^N S_n(L) ),   for all L ∈ H(X).
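As a quick numerical illustration of the corollary (the numbers are chosen freely for the example): if we find a hyperbolic IFS with contractivity factor s = 1/2 whose collage ⋃_{n=1}^N S_n(L) lies within Hausdorff distance ε = 0.01 of the target set L, then the attractor A satisfies h(L, A) ≤ 0.01 / (1 − 1/2) = 0.02; halving the collage error halves this bound.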


Hutchinson [13] points out that different IFSs can have the same invariant set (see Example 11), and this indicates that we need a finer characteristic of the IFS than the invariant set.

This is one of the motivations for introducing a measure on the invariant set. There is also a practical use for this measure, since it is this measure that in some sense makes the so called chaos game work, and as we will see the chaos game is a very efficient way to generate pictures of the invariant sets.

3.3. Invariant measure and the chaos game

Definition 3.6. Assume that (X, S, ~p) is a probabilistic IFS. Then the probabilities p_i, i = 1, . . . , N, can be used to determine a random walk in X by choosing a starting point x_0 ∈ X and then choosing integers i_n from {1, . . . , N} with probabilities Prob(i_n = j) = p_j. Then set x_n = S_{i_n}(x_{n−1}) for n = 1, 2, . . . . This procedure is called the Chaos Game (after Barnsley [2]).

Notation 20. The set {x_n}_{n≥0} is called the orbit of x_0 (under (S, ~p)).
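The definition translates almost line by line into code. The following is a minimal illustrative sketch of my own (the maps S_i are assumed to be Python callables and ~p a list of weights):

    import random

    def chaos_game(maps, probs, x0, n_points):
        """Run the chaos game of Definition 3.6: at each step pick S_i with
        probability p_i and apply it to the previous point. Returns the orbit."""
        orbit = [x0]
        x = x0
        for _ in range(n_points):
            S = random.choices(maps, weights=probs, k=1)[0]
            x = S(x)
            orbit.append(x)
        return orbit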

This random walk can be treated in a formal way as a Markov process on X; exactly what this means is not important here, but if Z_k is a random variable on X representing the possible locations of x_k in X, with the associated probabilities, then the way that (Z_k)_{k≥0} changes from Z_n to Z_{n+1} is independent of the previously traced out path x_0, x_1, . . . , x_n, and is determined only by

Prob(x_{n+1} = y | x_0, . . . , x_n) = ∑_{i=1}^N p_i Prob(S_i(x_n) = y) = Prob(x_{n+1} = y | x_n).

More generally, the process is determined by the following transition probability from a point x ∈ X to a set B ⊆ X:

P(x, B) = ∑_{i=1}^N p_i χ_B(S_i x),

where χ_B(x) = 1 if x ∈ B and χ_B(x) = 0 if x ∉ B is the characteristic function of the set B.

To study probabilistic IFSs in terms of measures on X we will need an operator M (a Markov operator) on these measures, which will be as useful as the Hutchinson operator W was for deterministic IFSs. There are three equivalent, but complementary, ways of defining and viewing M:

Definition 3.7. (1) Using the probabilities ~p = (p_1, . . . , p_N) and the maps S = {S_1, . . . , S_N}, define M on the space M_r(X, R) of finite regular signed Borel measures on X by

(11) Mν(B) = ∑_{i=1}^N p_i ν(S_i^{−1}(B)),


where B ∈ B(X). If P(X) is the space of probability measures on X, then clearly M(P(X)) ⊂ P(X), since

Mν(X) = ∑_{i=1}^N p_i ν(S_i^{−1}(X)) = ∑_{i=1}^N p_i ν(X) = ν(X) ∑_{i=1}^N p_i = 1.

(2) We can define an operator T on C_B(X), the space of bounded continuous functions on X, by

(12) Tf(x) = ∑_{i=1}^N p_i f(S_i(x)).

Then M is the dual operator of T. Also, M maps P(X) into itself weak *-continuously (see Def. 2.25). That T is the dual of M is seen by

ν(Tf) = ∫_X Tf(x) dν(x) = ∑_{i=1}^N p_i ∫_X f ∘ S_i(x) dν(x)

(let y = S_i(x), remember S_i^{−1}(X) = X, and dν(x) = dν(S_i^{−1}(y)))

= ∑_{i=1}^N p_i ∫_X f(y) dν(S_i^{−1}(y)) = ∫_X f(y) d( ∑_{i=1}^N p_i ν ∘ S_i^{−1} )(y) = ∫_X f(y) dMν(y) = Mν(f),

which is what we wanted.

To show that M is weak *-continuous we consider the subbasic set U = U_{a,b,φ} = {µ : a < µ(φ) < b} of the weak *-topology, where a < b and φ ∈ C_B(X). We shall show that the inverse image of this set under M is another subbasic element.

M^{−1}(U_{a,b,φ}) = { ν : a < Mν(φ) < b }
= { ν : a < ∑_{i=1}^N p_i ∫_X φ(x) d(ν ∘ S_i^{−1}) < b }
= { ν : a < ∑_{i=1}^N p_i ∫_X φ ∘ S_i(x) dν < b }


(now, if φ ∈ C_B(X) and all S_i ∈ C(X), then clearly Tφ = ∑_{i=1}^N p_i φ ∘ S_i ∈ C_B(X))

= { ν : a < ∫_X ( ∑_{i=1}^N p_i φ ∘ S_i )(x) dν < b }
= { ν : a < ν( ∑_{i=1}^N p_i φ ∘ S_i ) < b }
= { ν : a < ν(Tφ) < b } = U_{a,b,Tφ},

and we are done.

(3) We can also view M as the integral of the transition probability. If ν is an initial distribution of a random variable on X, then ν is a probability measure and we define

(13) Mν(B) = ∫_X P(x, B) dν(x).

If ν_0 is the distribution of the random variable Z_0, then ν_n = M^n ν_0 is the distribution of Z_n on X. This definition of M is equivalent to the previous one, since

Mν(B) = ∫_X P(x, B) dν = ∫_X ( ∑_{i=1}^N p_i χ_B(S_i(x)) ) dν = ∑_{i=1}^N p_i ∫_X χ_B(S_i(x)) dν = ∑_{i=1}^N p_i ∫_{S_i^{−1}(B)} 1 · dν = ∑_{i=1}^N p_i ν(S_i^{−1}(B)),

which agrees with the first definition.
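For a purely atomic measure ν = ∑_j w_j δ_{x_j}, definition (11) reads Mν = ∑_i ∑_j p_i w_j δ_{S_i(x_j)}: the atom at x_j is split into N atoms at the points S_i(x_j). A small illustrative sketch of my own, representing a measure as a list of (point, weight) pairs:

    def markov_step(maps, probs, measure):
        """One application of M to a discrete measure given as (point, weight) pairs."""
        return [(S(x), p * w) for S, p in zip(maps, probs) for (x, w) in measure]

    # Total mass is preserved, since the p_i sum to 1
    # (this is the computation M nu(X) = nu(X) carried out above).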

Now we are in a position to define an invariant measure of a hyperbolic IFS.

Definition 3.8. If ν ∈ P(X), then ν is said to be invariant with respect to the probabilistic IFS (X, S, ~p) if

Mν = ν,

where M is defined above in Definition 3.7.

Remark 26.In [3] an invariant measure is called a ~p-balanced measure.

The existence and uniqueness of an invariant measure for a hyperbolic IFS with probabilities follows from the next theorem.

Theorem 16. Let (X, d) be a complete and locally compact metric space, let (X, S, ~p) be a hyperbolic IFS with probabilities, and let 0 < s < 1 be the contractivity factor of S. Then


(1) M : P(X) → P(X) is a contraction map in the Hutchinson metric d_H, and the contractivity factor of M is s.

(2) There exists a unique µ ∈ P(X) s.t. Mµ = µ, and if ν ∈ P(X), then

d_H(µ, M^n ν) → 0 as n → ∞.

(3) Moreover, the support of µ is the invariant set of the IFS, i.e. supp(µ) = A_S.

Proof. (1) Let µ, ν ∈ P(X), and let f be any map in Lip_1(X). Then

|Mµ(f) − Mν(f)|
= | ∫_X f(x) ∑_{i=1}^N p_i dµ(S_i^{−1}(x)) − ∫_X f(x) ∑_{i=1}^N p_i dν(S_i^{−1}(x)) |
= | ∑_{i=1}^N p_i ( ∫_X f(x) dµ(S_i^{−1}(x)) − ∫_X f(x) dν(S_i^{−1}(x)) ) |

(change variables to y = S_i^{−1}(x) in the integrals; S_i^{−1}(X) = X)

= | ∑_{i=1}^N p_i ( ∫_X f ∘ S_i(y) dµ(y) − ∫_X f ∘ S_i(y) dν(y) ) |
= | ∑_{i=1}^N p_i ( µ(f ∘ S_i) − ν(f ∘ S_i) ) |
= | ∑_{i=1}^N p_i s ( µ((1/s) f ∘ S_i) − ν((1/s) f ∘ S_i) ) |

(notice that for any x, y ∈ X we have |f ∘ S_i(x) − f ∘ S_i(y)| ≤ d(S_i(x), S_i(y)) ≤ s_i d(x, y) ≤ s · d(x, y), so the map f ∘ S_i is a contraction with contractivity factor s, and the map (1/s) f ∘ S_i is in Lip_1(X) for all i = 1, . . . , N)

≤ s · ∑_{i=1}^N p_i sup{ µ(h) − ν(h) : h ∈ Lip_1(X) } = s · ∑_{i=1}^N p_i d_H(µ, ν) = s · d_H(µ, ν),

and since f was any map in Lip_1(X) we can take the supremum over all maps in Lip_1(X) to obtain d_H(Mµ, Mν) ≤ s · d_H(µ, ν); thus M is a contraction on P(X) in the metric d_H.

(2) This part of the proof can be found, in less detail, in [3]. Let T : C_B(X) → C_B(X) be defined as in Def. 3.7. It is clear that since all maps S_i ∈ S are continuous, T is continuous. Then the dual operator T* = M : P(X) → P(X) is weak *-continuous. Prop. 16 implies that P(X) is weak *-compact,


and if α ∈ [0, 1] and µ, ν ∈ P(X) then

‖αµ + (1 − α)ν‖ = αµ(X) + (1 − α)ν(X) = α · 1 + (1 − α) · 1 = 1.

Thus αµ + (1 − α)ν ∈ P(X) for all µ, ν ∈ P(X) and all α ∈ [0, 1], which means that P(X) is convex. Now the Schauder fixed point theorem (Thm. 3) implies that M has a fixed point µ ∈ P(X).

If we assume that µ, ν ∈ P(X) are two fixed points of M, Mµ = µ and Mν = ν, then we get

d_H(µ, ν) = d_H(Mµ, Mν) ≤ s · d_H(µ, ν),

and since 0 < s < 1 this is impossible unless d_H(µ, ν) = 0. Thus d_H(µ, ν) = 0 and µ = ν; this proves that the invariant measure is unique.

Let ν ∈ P(X), and let µ be the invariant measure, Mµ = µ. Then

d_H(M^n ν, µ) = d_H(M^n ν, Mµ) ≤ s · d_H(M^{n−1} ν, µ) ≤ s² · d_H(M^{n−2} ν, µ) ≤ . . . ≤ s^n · d_H(ν, µ),

and since 0 < s < 1, it is clear that d_H(M^n ν, µ) → 0 as n → ∞.

(3) It is clear that if µ is an invariant measure, then

µ(A) = ∑_{i=1}^N p_i µ(S_i^{−1}(A))

for all Borel sets A. It follows that supp(µ) = ⋃_{i=1}^N S_i(supp(µ)), and hence the support of µ is the unique invariant set of the IFS.

Remark 27. Another way to prove the existence of an invariant measure, if (X, d) is a separable metric space, is to use the fact that the metric topology and the weak *-topology are equivalent on P(X) (cf. Thm. 11). The fact that P(X) is compact in the weak *-topology then implies that it is compact in the metric topology as well, and hence complete in the metric topology. Then the Banach fixed point theorem can be used instead of the Schauder theorem.

Remark 28. Later we will see that the existence of, and an explicit formula for, the invariant measure can be obtained by other means as well (see Thm. 20 below). That proof is applicable also when X is not locally compact; the local compactness of X above is only needed for the duality arguments.

Notation 21. The unique measure invariant with respect to the hyperbolic IFS (X, S, ~p) will often be denoted by µ_(S,~p) (the space X is implicit).


Definition 3.9. Let (X, S, ~p) be a hyperbolic IFS with probabilities, and let µ_(S,~p) be the unique invariant measure of the IFS. Let T be defined as in Definition 3.7. A closed subset G of X is called recurrent if

for every non-empty open O ⊆ G and every x ∈ G there exists n ∈ N s.t. (T^n χ_O)(x) > 0.

Here χ_O(x) = 1 if x ∈ O and 0 otherwise (so χ_O is the characteristic function of the set O).

Theorem 17. Let µ = µ_(S,~p) be the unique invariant measure of the probabilistic IFS (X, S, ~p), with ~p > 0. If G ⊆ X is recurrent for T and µ(G) > 0, then every non-empty open subset O of G has strictly positive measure.

For a proof see [3]

3.4. Universal IFS

In this section we will investigate results concerning the so called universal IFS, which is defined on the complete space Ω(N). The motivation for studying this IFS is the comment following Theorem 13.

Definition 3.10. Let Σ = {σ_i : σ_i(ω) = i · ω, i = 1, . . . , N}, where i · ω denotes concatenation (ω = (ω_1, ω_2, . . . ) ⟹ i · ω = (i, ω_1, ω_2, . . . )). Then the IFS (Ω, Σ) is called the universal IFS, and it is easy to see that this is a contractive IFS.

The reason for calling this IFS 'universal' is that, via the addresses of points in the invariant set of a contractive IFS, we can define a continuous map from Ω onto A_S, and in certain situations this map is also one-to-one. Assume that (X, S) is a contractive IFS. Then the map from Ω onto A_S is defined as follows. Let ω ∈ Ω and x ∈ X, and set

(14) φ_n(ω, x) = S_{ω_1} ∘ · · · ∘ S_{ω_n}(x),   n ∈ {1, 2, . . .}.

Now we can let n → ∞ and set

(15) φ(ω, x) = lim_{n→∞} φ_n(ω, x).
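For a finite address (ω_1, . . . , ω_n), the point φ_n(ω, x) of Equation (14) is obtained by composing the maps from the inside out; when the S_i are contractions this converges to φ(ω) as n grows. A small sketch of my own (addresses are 1-based lists of symbols, the maps Python callables):

    def phi_n(maps, omega, x):
        """phi_n(omega, x) = S_{omega_1} o ... o S_{omega_n} (x)."""
        for symbol in reversed(omega):   # apply S_{omega_n} first, S_{omega_1} last
            x = maps[symbol - 1](x)
        return x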

This map has properties given in the following theorem.

Theorem 18. Let (X, d) be a complete and locally compact metric space, and let (X, S) be a hyperbolic IFS with unique invariant set A_S. Then for each x ∈ X and each ω ∈ Ω the limit y = φ(ω, x) ∈ A_S exists and is independent of x (we may thus write just φ(ω)). Moreover, φ is a continuous map from Ω onto A_S, i.e. φ(Ω) = A_S.

For a proof see [2].

We are now going to define a Borel measure on the space Ω.

Definition 3.11.If ~p is a probability vector with N elements, and Ω = Ω(N), then the Borel field


B(Ω) is generated by the cylinders

{ω ∈ Ω | ω_l = i_l, n ≤ l < n + k},

where each i_l ∈ {1, . . . , N}. We define the Borel measure ρ by

ρ( {ω ∈ Ω | ω_l = i_l, n ≤ l < n + k} ) = ∏_{l=n}^{n+k−1} p_{i_l}.

This measure is called a Bernoulli measure. Note that Ω = ⋃_{i=1}^N {ω ∈ Ω | ω_1 = i}, a disjoint union, and thus ρ(Ω) = ∑ p_i = 1, so ρ is a probability measure.
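The measure of a cylinder is simply the product of the probabilities of the prescribed symbols; as a tiny illustrative sketch of my own:

    def cylinder_measure(p, symbols):
        """rho of the cylinder prescribing the given symbols (1-based), p the probability vector."""
        result = 1.0
        for i in symbols:
            result *= p[i - 1]
        return result

    # cylinder_measure([0.5, 0.5], [1, 2, 2]) returns 0.125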

If we define M_Ω = M and T_Ω = T as in Definition 3.7 for the IFS (Ω, Σ, ~p), we have the following theorem, proved in [3].

Theorem 19. The probabilistic IFS (Ω, Σ, ~p) (with ~p > 0), together with the Bernoulli measure ρ, has the following properties:

(1) (Ω, Σ) is a hyperbolic IFS, with attractor A_Σ = Ω;
(2) ρ is the unique invariant measure for the probabilistic IFS (Ω, Σ, ~p);
(3) Ω is recurrent for Σ; in particular the support of ρ is Ω, independent of ~p > 0;
(4) for all B ∈ B(Ω), ρ(σ_i(B)) = p_i ρ(B), i = 1, . . . , N;
(5) for all g ∈ C(Ω), ‖T_Ω^n g − ∫_Ω g dρ‖_∞ → 0 as n → ∞;
(6) T_Ω is mixing with respect to ρ, in the sense that for all f, g ∈ C(Ω),

∫_Ω (T_Ω^n g) f dρ → ∫_Ω g dρ · ∫_Ω f dρ   as n → ∞.

To connect what we know about the universal IFS with the other IFSs, the following theorem is proved in [3].

Theorem 20. Let (X, S, ~p) be a hyperbolic IFS with probabilities, ~p > 0. Then there is a unique invariant measure µ, given by µ(E) = ρ(φ^{−1}(E)) for E ∈ B(X), where ρ is the Bernoulli measure and φ is as in Theorem 18. Furthermore,

(1) µ is attractive for any ν ∈ P(X) (in the metric d_H),
(2) the attractor A_S of the IFS (without probabilities) (X, S) is recurrent for the map T, and
(3) the support of the invariant measure µ is supp(µ) = A_S.

Remark 29. From the definition of a recurrent set (Def. 3.9) we see that if x_0 ∈ A_S and {x_0, x_1, . . . } is the orbit of x_0, then for every open, non-empty subset O of A_S there exists n ∈ N s.t. x_n ∈ O. That is, the orbit {x_0, x_1, . . . }, where x_n = S_i(x_{n−1}) with probability p_i, is dense in the invariant set A_S.


This is exactly how/why the chaos game works! In Example 9 at the end of this section I will give an example of how to apply the chaos game and also compare it with the deterministic method.

Definition 3.12. Let (X, S) be a hyperbolic IFS with attractor A, and let Ω(N) be the associated code space. Let φ : Ω(N) → A be the continuous function constructed in Theorem 18. An address of a point a ∈ A is any member of the set

φ^{−1}(a) = {ω ∈ Ω(N) | φ(ω) = a}.

This set is called the set of addresses of a ∈ A.

(1) The IFS is said to be totally disconnected if each point a ∈ A possesses a unique address, i.e. for each a ∈ A there is a unique ω_a ∈ Ω(N) s.t. φ^{−1}(a) = {ω_a}.

(2) The IFS is said to be just-touching if it is not totally disconnected, but the attractor A contains an open set O s.t.

(i) S_i(O) ∩ S_j(O) = ∅ for all i ≠ j ∈ {1, . . . , N};
(ii) ⋃_{i=1}^N S_i(O) ⊂ O.

An IFS whose attractor contains an open set O satisfying (i) and (ii) above is said to satisfy the open set condition.

If the IFS is neither just-touching nor totally disconnected, then it is said to be overlapping.

From [2] we also get the following theorem describing which IFSs are totally disconnected.

Theorem 21. Let (X, S) be a hyperbolic IFS with attractor A. The IFS is totally disconnected if and only if

S_i(A) ∩ S_j(A) = ∅   for all i ≠ j ∈ {1, 2, . . . , N}.

Example 9. Assume that we have an IFS (X, S) and we want to draw a computer picture of the corresponding invariant set. How can we do that?

As presented in this text, there are two possible ways to do this: either the deterministic way, involving iteration of the Hutchinson operator, or the probabilistic way with the chaos game. In most cases the chaos game is better.

I will illustrate this by applying both methods in a specific case and then comparing the results in terms of graphical quality and the CPU time needed.

One of the earliest and most famous examples of a self-similar fractal is the Lévy dragon, studied in 1938 by P. Lévy, who among other things observed that the plane can be tiled by copies of this fractal. In [7] the specific form of this tiling is described, and we also learn that the dragon is the invariant set of the following


IFS:

(R², S = {S_1, S_2}), where S_1 and S_2 are the (contractive) similitudes given by

S_1(x, y) = ( (1/2)x − (1/2)y , (1/2)x + (1/2)y ),
S_2(x, y) = ( (1/2)x + (1/2)y + 1/2 , −(1/2)x + (1/2)y + 1/2 ).

The Lévy dragon will be denoted by D from here on.

(1) We start with the deterministic method. First we need to determine a compact subset of R² that we shall show on the screen, since for obvious reasons we cannot show the whole of R².

I decide that we restrict ourselves to the square Y = [−1, 1] × [−1, 1], and we will accordingly show only a part of the whole dragon. Then we need another compact set A ∈ H(X) that will be the starting point of the iteration. Here we can take A = Y, the whole square.

Now we have all we need in theory to apply Thm. 13, and with W denoting the Hutchinson operator of Def. 3.4 we should easily get D as D = lim_{n→∞} W^n(A). But what seems so straightforward in theory need not be so in practice.

We must translate all this into a computable algorithm, and the way I do it is almost the same as Barnsley in [2]. Points on the screen are represented by elements of a matrix (named screen), and the resolution of the resulting picture is therefore determined from the beginning by the size of this matrix.

Then I have an indicator matrix s of the same size, which has the property that at step n of the iteration s(k, l) = 1 if the point at location (k, l) belongs to the set A_n = W^n(A), and s(k, l) = 0 otherwise. This indicator matrix starts out being all 1s, and after each iteration the number of 1s decreases. Assuming we know which points belong to A_n (i.e. we have iterated W n times), we get the points of A_{n+1} via a loop over all k and l with s(k, l) = 1, determining which indices k′, l′ and k′′, l′′ represent the two points screen(k′, l′) = S_1(screen(k, l)) and screen(k′′, l′′) = S_2(screen(k, l)). Then we use a temporary matrix t in which we set t(k′, l′) = t(k′′, l′′) = 1 (t is a zero matrix to start with).

After this loop the matrix t holds the information about which points are in the set A_{n+1}, and for the next step we assign s = t and t = 0.

After a suitable number of iterations we stop and plot the resulting picture. The results are shown below in connection with the conclusions, and a code sketch of this scheme is given right after this example.

(2) Now we shall see how much easier it is to use the chaos game. The first thing to do is to assign probabilities to the hyperbolic IFS defined above; since we just want the invariant set we can take equal probabilities, p_1 = p_2 = 1/2. We now have the probabilistic IFS (R², S, ~p = (1/2, 1/2)).

The only thing we need to determine now is a starting point z_0, and how many points in the orbit of z_0 we want to use in the picture. We can easily see that S_1(0, 0) = (0, 0), so the point (0, 0) belongs to the invariant set. Thus a good starting point is z_0 = (0, 0), and we let M denote the number of points we shall calculate in the orbit.


Since we found an initial point belonging to the invariant set D, we know from Theorem 20 and Remark 29 that the orbit will be dense in the set D, that is, for any open set O in D there will be some point z_n ∈ O.

The procedure for calculating the orbit {z_0, z_1, . . . , z_M} is straightforward: I use some predefined function to generate a random vector prob of length M with elements prob_k ∈ [0, 1]. Then I just loop over k from 1 to M and in each step assign

z_k = S_1(z_{k−1}) if prob_k ∈ [0, 1/2),
z_k = S_2(z_{k−1}) if prob_k ∈ [1/2, 1].

This is it! After M steps we have M points spread over D, and when we plot them we get a good picture of the set. The result is shown below together with the comparison of the two methods (a code sketch of this procedure is also given right after this example).

(3) Even without looking at any pictures or figures at all, we can see that the deterministic method is much harder to translate into a computable algorithm, involving more complicated steps.

The actual implementations of these algorithms were done in MATLAB and run on a Pentium 166 MMX processor.

In Figure 1 the sets A_3, A_7, A_11 and A_15 are plotted. The number of points representing the square is 200 × 200.

It is hard to see any fine structure at all in the deterministic approximations; the approximations do not even seem to improve after number 13, and there seems to be something missing in the pictures compared to the ones from the chaos game.

Now look at Figure 2, where the two upper deterministic pictures are to be compared with the lower pictures generated by the chaos game. The CPU time needed to generate the respective pictures is roughly equal. First I made the pictures by the chaos game with 50000 points; it took 56 s of CPU time to generate them. Then I tried to get the best deterministic picture with the same amount of CPU time. With the resolution set to 100 pixels and performing 15 iterations I obtained the picture in the figure, and it took 58 s of CPU time. This is the best I can get in this time. If I increase the resolution to 200 and make 15 iterations it takes 300 s.

In both cases the pictures to the right are obtained from the ones to the left by rescaling the axes. What we see here is that while the deterministic picture loses information and becomes more 'pixelized' when zoomed in, the probabilistic picture looks almost the same at different scales. Of course this is the behaviour we expect for a self-similar fractal, and in this sense the chaos game gives us at least a statistically better self-similarity under magnification.

One can easily imagine how much easier it became to study fractals experimentally when the chaos game was invented.
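As hedged illustrations of the two procedures described in (1) and (2) above: the original implementation was in MATLAB, and the Python sketches below are my own, with grid size, iteration count and variable names chosen freely. The deterministic indicator-matrix scheme of (1) can be written as:

    import numpy as np

    def deterministic_dragon(resolution=200, iterations=15):
        """Iterate the Hutchinson operator of the Levy-dragon IFS on an indicator
        matrix over the square Y = [-1, 1] x [-1, 1]."""
        xs = np.linspace(-1.0, 1.0, resolution)
        ys = np.linspace(-1.0, 1.0, resolution)

        def S1(x, y):
            return 0.5 * x - 0.5 * y, 0.5 * x + 0.5 * y

        def S2(x, y):
            return 0.5 * x + 0.5 * y + 0.5, -0.5 * x + 0.5 * y + 0.5

        def to_index(x, y):
            # nearest grid cell of a point, or None if it has left the square Y
            if -1.0 <= x <= 1.0 and -1.0 <= y <= 1.0:
                k = int(round((x + 1.0) / 2.0 * (resolution - 1)))
                l = int(round((y + 1.0) / 2.0 * (resolution - 1)))
                return k, l
            return None

        s = np.ones((resolution, resolution), dtype=bool)  # indicator of A_0 = Y
        for _ in range(iterations):
            t = np.zeros_like(s)                           # indicator of the next set
            for k in range(resolution):
                for l in range(resolution):
                    if s[k, l]:
                        for S in (S1, S2):
                            idx = to_index(*S(xs[k], ys[l]))
                            if idx is not None:
                                t[idx] = True
            s = t
        return s  # approximates the indicator of the last iterate on the grid

The chaos game of (2) for the same IFS reduces to a few lines (plotting the returned points with any scatter-plot routine reproduces pictures of the kind compared in Figures 1 and 2):

    import random

    def chaos_game_dragon(M=50000):
        """Chaos game for the Levy dragon: start at the fixed point (0, 0) of S1
        and apply S1 or S2 with probability 1/2 each."""
        x, y = 0.0, 0.0
        points = []
        for _ in range(M):
            if random.random() < 0.5:
                x, y = 0.5 * x - 0.5 * y, 0.5 * x + 0.5 * y                 # S1
            else:
                x, y = 0.5 * x + 0.5 * y + 0.5, -0.5 * x + 0.5 * y + 0.5    # S2
            points.append((x, y))
        return points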


4. Fractal dimensions

4.1. Introducing the Hausdorff and Packing measures

4.1.1. Hausdorff measures

The theory of the Hausdorff measures is presented in [8] and [18], from which I have taken the definition and the theorems in this section, and to which I refer the reader who wants to know more about this subject.

Definition 4.1 (Hausdorff s-dimensional measure). Let s, δ ∈ R, s ≥ 0, δ > 0, and A ⊆ X. We first define the δ-approximate s-dimensional Hausdorff measure by

h^s_δ(A) = inf { ∑_{i=1}^∞ α(s) (diam(A_i)/2)^s | {A_i}_{i=1}^∞ is a δ-cover of A },

where

α(s) = π^{s/2} / Γ(s/2 + 1)   (for integer s, the volume of the unit ball B(0, 1) in R^s).

Here Γ(s) = ∫_0^∞ e^{−x} x^{s−1} dx is the usual Gamma function (we have for example Γ(n) = (n − 1)! for n = 1, 2, . . . , Γ(z + 1) = zΓ(z), and Γ(1/2) = √π).

Then we define, for A and s as above,

H^s(A) = lim_{δ→0} h^s_δ(A) = sup_{δ>0} h^s_δ(A).

H^s is called the Hausdorff s-dimensional measure on X.

Theorem 22. H^s is a Borel regular measure on R^n (0 ≤ s < ∞).

Proof.See [8].

One of the most useful properties of the Hausdorff measures is that, if we consider the metric space R^n with the usual metric, then the Hausdorff n-dimensional measure is equal to the n-dimensional Lebesgue measure. That is, n-dimensional objects have their n-dimensional Hausdorff measure equal to their ordinary n-dimensional "volume".

More precisely, we have the following theorem from [8].


Theorem 23.

(i) H^0 is the counting measure.
(ii) H^1 = L^1 on R^1.
(iii) H^s ≡ 0 on R^n for all s > n.
(iv) H^s(λA) = λ^s H^s(A) for all λ > 0, A ⊆ R^n.
(v) H^s(L(A)) = H^s(A) for all isometries L : R^n → R^n.
(vi) H^n = L^n on R^n.

Proof.See [8].

Remark 30. In some books (for example [10]) the approximate Hausdorff measures are defined without the normalization constants α(s)/2^s, leaving only ∑_{i=1}^∞ (diam(A_i))^s, and then we instead have H^n = (2^n / α(n)) L^n on R^n.

The following properties of the Hausdorff measures regarding Lipschitz maps, bi-Lipschitz maps and similitudes are basic but important. They are stated in, for example, [10] and [18].

Proposition 18. Let (X, d) and (Y, h) be metric spaces, and let E ⊂ X.

(i) If f : E → Y is a Lipschitz mapping (cf. Def. 2.21), i.e. h(f(x), f(y)) ≤ c · d(x, y) for all x, y ∈ E, then

H^s(f(E)) ≤ c^s H^s(E).

(ii) Similarly, if f : E → Y is a bi-Lipschitz mapping (cf. Def. 2.22), i.e. for some 0 < c_1 ≤ c_2 we have c_1 d(x, y) ≤ h(f(x), f(y)) ≤ c_2 d(x, y) for all x, y ∈ E, then

c_1^s H^s(E) ≤ H^s(f(E)) ≤ c_2^s H^s(E).

(iii) A special case of (ii) is when f is a similitude with scaling factor r, i.e. h(f(x), f(y)) = r · d(x, y) for all x, y ∈ E (cf. Def. 2.20); then

H^s(f(E)) = r^s H^s(E).

Proof.

(i) Let δ > 0 and let E ⊂ ⋃_{i=1}^∞ C_i with diam(C_i) ≤ δ. Then


diam(f(C_i)) ≤ Lip(f) · diam(C_i) ≤ Lip(f) · δ, and f(E) ⊂ ⋃_{i=1}^∞ f(C_i),

so

h^s_{Lip(f)·δ}(f(E)) ≤ ∑_{i=1}^∞ α(s) (diam(f(C_i))/2)^s ≤ (Lip(f))^s ∑_{i=1}^∞ α(s) (diam(C_i)/2)^s.

If we take the infimum over all δ-covers {C_i} we get

h^s_{Lip(f)·δ}(f(E)) ≤ (Lip(f))^s h^s_δ(E),

and when δ → 0 we get

H^s(f(E)) ≤ (Lip(f))^s H^s(E).

(ii) The same proof as above with the inequalities reversed shows that

c_1 d(x, y) ≤ h(f(x), f(y)) for all x, y ∈ E  =⇒  H^s(f(E)) ≥ c_1^s H^s(E).

This together with (i) implies that

c_1^s H^s(E) ≤ H^s(f(E)) ≤ c_2^s H^s(E)

if f satisfies the condition c_1 d(x, y) ≤ h(f(x), f(y)) ≤ c_2 d(x, y) for all x, y ∈ E.

(iii) This follows from (ii) with c_1 = c_2 = r.

4.1.2. Packing measure

Now, parallel to the definition of Hausdorff measure, we can define packing measures by using a 'δ-packing' instead of a δ-cover.

Definition 4.2. A δ-packing of a set A is a finite or countable family of disjoint balls {B_i} of radii at most δ and with centres in A.

For δ > 0 and A ⊆ X, we define

p^s_δ(A) = sup { ∑_{i=1}^∞ α(s) (diam(B_i)/2)^s | {B_i}_{i=1}^∞ is a δ-packing of A },

where α(s) = π^{s/2} / Γ(s/2 + 1). Now we can take the limit

p^s_0(A) = lim_{δ→0} p^s_δ(A).


Now, p^s_0 need not be countably subadditive, and hence need not be a measure. But we overcome this by instead defining

P^s(A) = inf { ∑_{i=1}^∞ p^s_0(A_i) | A ⊂ ⋃_{i=1}^∞ A_i },

which is a Borel measure, called the s-dimensional packing measure of A.

Packing measures behave in the same way as Hausdorff measures with respect to Lipschitz mappings, so Proposition 18 holds with H^s replaced by P^s. It can also be shown that for all sets A we have H^s(A) ≤ P^s(A). The primary reference for all facts concerning the packing measures is [10].

4.2. Hausdorff, packing and box-counting dimensions

We said in the introduction that a fractal is a set with Hausdorff dimension different from the topological dimension, and now we shall make that precise. We shall also define other notions of dimensionality that are related to fractals and IFSs.

We begin by considering a theorem concerning the s-dimensional Hausdorff measure defined in 4.1. For the proof we need the following lemma (from [8] in the case X = R^n).

Lemma 3. Let (X, d) be a metric space, and let A ⊂ X and 0 ≤ s < t < ∞.

(1) If H^s(A) < ∞, then H^t(A) = 0.
(2) If H^t(A) > 0, then H^s(A) = +∞.

Proof. Let H^s(A) < ∞ and δ > 0. By the definition of h^s_δ and the meaning of inf, there must exist a δ-cover {C_j}_{j=1}^∞ of A s.t.

∑_{j=1}^∞ α(s) (diam(C_j)/2)^s ≤ h^s_δ(A) + 1 ≤ H^s(A) + 1.

Then we get

h^t_δ(A) ≤ ∑_{j=1}^∞ α(t) (diam(C_j)/2)^t
= (α(t)/α(s)) 2^{s−t} ∑_{j=1}^∞ α(s) (diam(C_j)/2)^s (diam(C_j))^{t−s}
≤ (α(t)/α(s)) 2^{s−t} δ^{t−s} (H^s(A) + 1).

Now s < t, so the right-hand side tends to 0 as δ → 0, and accordingly h^t_δ(A) → 0 as δ → 0, which implies that H^t(A) = 0. This proves (1).

Now if H^t(A) > 0, it follows from (1) that H^s(A) cannot be finite, thus H^s(A) = +∞, and (2) is proved.

We first define the Hausdorff dimension, and then prove a theorem concerning the uniqueness of the number defined.


Definition 4.3 (Hausdorff Dimension). Let A ⊆ X; then we define the real number D_H(A) by

D_H(A) = inf{ 0 ≤ s < ∞ | H^s(A) = 0 }.

This number is called the Hausdorff dimension of A, or the Hausdorff-Besicovitch dimension of A.

Remark 31. If the set A is infinite, or if we define sup ∅ = 0 (it is usually defined as −∞), then we also have D_H(A) = sup{ 0 ≤ s < ∞ | H^s(A) = +∞ }.

It is clear that we have

H^s(A) = ∞ if s < D_H(A), and H^s(A) = 0 if s > D_H(A)   (s ∈ [0, ∞)).

Indeed, if s < D_H(A) and H^s(A) < ∞, part (1) of the lemma above tells us that H^t(A) = 0 for all t ∈ (s, D_H(A)); this would force D_H(A) ≤ s, a contradiction, so H^s(A) = ∞ when s < D_H(A). Similarly, if s > D_H(A) and H^s(A) > 0, part (2) of the lemma implies that H^t(A) = ∞ for all t ∈ (D_H(A), s), which contradicts the definition of D_H(A) as an infimum; thus H^s(A) = 0 when s > D_H(A).

The above definition leaves the value H^{D_H(A)}(A) ∈ [0, ∞] undetermined, but D_H(A) is still unique for each set A ⊆ X. The number D_H is also written D_H(A) to stress that it depends on the set A. Formally we now have the following definition.

Definition 4.4 (Packing Dimension). As for the Hausdorff dimension, there is a number D_P(A), called the packing dimension of A, s.t.

(16) P^s(A) = ∞ if s < D_P(A), and P^s(A) = 0 if s > D_P(A)   (s ∈ [0, ∞)).

Thus

D_P(A) = inf{ s : P^s(A) = 0 } = sup{ s : P^s(A) = ∞ }.

Sometimes one speaks about the 'dimension' of a fractal set, meaning either the above Hausdorff dimension, which is usually determined theoretically, or the so called box-counting dimension, which has the advantage of also being practical to determine experimentally, and can therefore be calculated for any picture in R², even one not described by any function.

In [2] the box-counting (or upper box-counting) dimension is referred to simply as the fractal dimension.

Definition 4.5 (Box-counting dimension). Let (X, d) be a complete metric space, let A ∈ H(X), and let N(A, ε) denote the minimum number of sets of diameter ε needed to cover A. The upper and lower


box-counting (or box) dimensions are defined as

(17) D̄_B(A) = lim sup_{ε→0} ln N(A, ε) / ln(1/ε),

(18) D̲_B(A) = lim inf_{ε→0} ln N(A, ε) / ln(1/ε).

If these are equal, their common value is called the box-counting (or box) dimension of A:

(19) D_B(A) = lim_{ε→0} ln N(A, ε) / ln(1/ε).

Remark 32. When we want to calculate the box-counting (or upper, or lower box-counting) dimension it might be useful to know that, for A ⊆ R^n, the number N(A, ε) can be defined in other, maybe more convenient, ways than above. [10] tells us that the value of the limits in (17)-(19) is unaltered if N(A, ε) is defined as any of the following:

(1) the smallest number of sets of diameter at most ε that cover A,
(2) the smallest number of closed balls of radius ε that cover A,
(3) the smallest number of cubes of side ε that cover A,
(4) the largest number of disjoint balls of radius ε with centres in A,
(5) the number of ε-mesh cubes that intersect A (hence the name 'box counting').

(An ε-mesh cube in R^n is a cube of the form

[m_1 ε, (m_1 + 1)ε) × · · · × [m_n ε, (m_n + 1)ε),

where m_1, . . . , m_n are integers.)

Even though we now have five different definitions of the box-counting dimensions of A, all these definitions are very similar in that they all concern some kind of partition of the set A into smaller and smaller pieces. The only difference is the shape of the pieces, and whether or not they are to be contained completely inside A.

There is yet another, conceptually different but still equivalent, definition of the box-counting dimensions in R^n.

Definition 4.6. Let A ⊆ R^n, and let the ε-neighbourhood A_ε of A be defined as

A_ε = { x ∈ R^n : ∃y ∈ A s.t. |x − y| ≤ ε }.

Then we define

(20) D̄_B(A) = n − lim inf_{ε→0} ln L^n(A_ε) / ln ε,

(21) D̲_B(A) = n − lim sup_{ε→0} ln L^n(A_ε) / ln ε,

and

D_B(A) = n − lim_{ε→0} ln L^n(A_ε) / ln ε


if the limit exists. When defined in this way, the box-counting dimension is sometimes referred to as the Minkowski dimension.

Remark 33. From [9] we get the following expression for the packing dimension in terms of the upper box dimension:

D_P(A) = inf { sup_{1≤i<∞} D̄_B(A_i) : A ⊆ ⋃_{i=1}^∞ A_i }.

The infimum is taken over all countable covers {A_i}_{1≤i<∞} of A.

Remark 34. To determine the box dimension of a picture A we note that A is bounded and closed, so A ∈ H(R²). Then, for some N > 0, we take a set of small numbers ε_i > 0, i = 1, . . . , N, and compute the number of balls of radius ε_i needed to cover A. Then we calculate the numbers ln N(A, ε_i) / ln(1/ε_i) and hope that they tend to some particular value as i → N; if not, we take a larger N and try again. If it exists, this limit will be our approximation of the fractal dimension D(A).
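Using the ε-mesh-cube count (5) of Remark 32, a rough numerical estimate of the box dimension of a finite point cloud A ⊂ R² can be obtained as below. This is an illustrative sketch of my own (the mesh sizes are arbitrary choices); the slope of ln N(A, ε) against ln(1/ε) plays the role of the limit in (19):

    import numpy as np

    def box_counting_dimension(points, epsilons):
        """Estimate the box dimension of a finite point cloud in R^2 by counting
        the eps-mesh cubes that meet it and fitting a line to ln N vs ln(1/eps)."""
        pts = np.asarray(points, dtype=float)
        counts = []
        for eps in epsilons:
            cells = set(map(tuple, np.floor(pts / eps).astype(int)))
            counts.append(len(cells))
        slope, _ = np.polyfit(np.log(1.0 / np.asarray(epsilons)), np.log(counts), 1)
        return slope

    # Example: feed it the orbit produced by a chaos game, e.g.
    # box_counting_dimension(chaos_game_dragon(100000), epsilons=[0.1, 0.05, 0.02, 0.01])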

The box-counting dimensions are treated only in [10], [9], and, under the name of fractal dimension, in [2]. Barnsley proves (in [2]) the following theorem for D_B in the special but useful case X = R^n, n > 0.

Theorem 24 (Existence and bound of the fractal dimension). Let n be a positive integer, and consider the metric space (R^n, d_e), where d_e is the usual Euclidean distance, d_e(x, y) = |x − y| for x, y ∈ R^n. (This space is complete, so Def. 4.5 makes sense.) If A, B ∈ H(R^n) are such that A ⊆ B, then

D_B(A) ≤ D_B(B), and in particular 0 ≤ D_B(A) ≤ n.

The following theorem, which is stated and proved in [2], helps us calculate the fractal dimension of the union of two sets in R^n.

Theorem 25. Let n be a positive integer and consider the metric space (R^n, d_e). Let A, B ∈ H(R^n), and let A be such that its fractal dimension exists, i.e.

D̲_B(A) = D̄_B(A) = D_B(A) = lim_{ε→0} ln N(A, ε) / ln(1/ε).

Suppose that D_B(B) < D_B(A). Then

D_B(A ∪ B) = D_B(A).

So the total fractal dimension does not change if we add a set with a lower fractal dimension.


All dimensions that we have defined have some properties in common. In [10] the following properties are stated, where D(A) can be any of D_H(A), D_P(A), D_B(A), D̲_B(A) or D̄_B(A):

Proposition 19.

(1) If A_1 ⊆ A_2, then D(A_1) ≤ D(A_2). (Monotonicity)
(2) If A is finite, then D(A) = 0. (Finite sets)
(3) If A is a non-empty open subset of R^n, then D(A) = n. (Open sets)
(4) If A is a smooth m-dimensional manifold in R^n, then D(A) = m. (Smooth manifolds)
(5) If f : A → R^n is a Lipschitz mapping, then D(f(A)) ≤ D(A). (Lipschitz mappings)
(6) If f : A → f(A) is bi-Lipschitz, then D(f(A)) = D(A). (Bi-Lipschitz invariance)
(7) If f is a similarity or affine transformation, then D(f(A)) = D(A). (Geometric invariance)

It is also stated in [10] that the Hausdorff, packing and upper box-counting dimensions are finitely stable, i.e.

D( ⋃_{i=1}^k E_i ) = max_{1≤i≤k} D(E_i),

where D stands for any of the Hausdorff, packing or upper box-counting dimensions. (The lower box-counting dimension is not finitely stable.)

Hausdorff and packing dimensions are even better than this: they are countably stable, i.e.

D( ⋃_{i=1}^∞ E_i ) = sup_{1≤i<∞} D(E_i),

where D stands for either the Hausdorff or the packing dimension.

Some other properties of dimensions, stated in [10] without proof, are the following inequalities.

Proposition 20. Let E ⊆ X be any non-empty set, and let F ⊆ X be any non-empty bounded set. Then

D_H(E) ≤ D_P(E),
D_H(F) ≤ D_P(F) ≤ D̄_B(F), and
D_H(F) ≤ D̲_B(F) ≤ D̄_B(F),

and in fact almost all common definitions of dimension take values between the Hausdorff and upper box dimensions.

The above imply that if we can prove that D_H(A) = D̄_B(A), then both the lower box-counting and the packing dimension also take this common value.

Now we have defined some different dimensions that we want to work with, and we have also stated some basic properties and inequalities for them.

The main concern now is to calculate dimensions, but in general it is very hard to obtain an exact answer. It is easier to find estimates for the dimensions, and it is usually easier to get upper estimates than lower estimates. We also notice that if we get a lower bound s_1 for the Hausdorff dimension, or an upper bound s_2 for the


upper box-counting dimension, then s_1 ≤ D ≤ s_2 for any common dimension D. Therefore we are usually satisfied if we find bounds for the Hausdorff and upper box dimensions.

Almost all common ways of finding the dimension of a set involve measures that are supported in the set.

The following proposition can be found in [10] (in the case X = R^n) and gives us a lower estimate for the Hausdorff dimension via the so called mass distribution principle.

Proposition 21 (Mass Distribution Principle). Let X be a complete metric space, let E ⊆ X and let µ be a finite measure with µ(E) > 0. Suppose that there are numbers s ≥ 0, c > 0 and δ_0 > 0 such that

µ(U) ≤ c (diam(U))^s

for all sets U with diam(U) ≤ δ_0. Then H^s(E) ≥ µ(E)/c and

s ≤ D_H(E) ≤ D̲_B(E) ≤ D̄_B(E).

Definition 4.7. We define the upper and lower local dimensions of µ at x ∈ X by

D̄_loc µ(x) = lim sup_{r→0} ln µ(B(x, r)) / ln r,

D̲_loc µ(x) = lim inf_{r→0} ln µ(B(x, r)) / ln r.

From [10] we also get the following two propositions, with proofs in [9].

Proposition 22. Let E ⊆ R^n be a non-empty Borel set and let µ be a finite measure.

(a) If D̲_loc µ(x) ≥ s for all x ∈ E and µ(E) > 0, then D_H(E) ≥ s.
(b) If D̲_loc µ(x) ≤ s for all x ∈ E, then D_H(E) ≤ s.
(c) If D̄_loc µ(x) ≥ s for all x ∈ E and µ(E) > 0, then D_P(E) ≥ s.
(d) If D̄_loc µ(x) ≤ s for all x ∈ E, then D_P(E) ≤ s.

This proposition tells us why the local dimensions of measures are so interesting. If we want to obtain a lower or upper bound for the Hausdorff or packing dimension of a Borel set E, we just have to find one finite measure that has a corresponding bound for either the lower (Hausdorff) or the upper (packing) local dimension throughout E.

The second proposition is a partial converse of the first: for a Borel set E of a given dimension there exists a measure giving E positive measure and having a corresponding local dimension.

Proposition 23. Let E ⊂ R^n be a non-empty Borel set.

(a) If D_H(E) > s, there exists a measure µ with 0 < µ(E) < ∞ and D̲_loc µ(x) ≥ s for all x ∈ E.
(b) If D_H(E) < s, there exists a measure µ with 0 < µ(E) < ∞ and D̲_loc µ(x) ≤ s for all x ∈ E.
(c) If D_P(E) > s, there exists a measure µ with 0 < µ(E) < ∞ and D̄_loc µ(x) ≥ s for µ-almost all x ∈ E.


(d) If D_P(E) < s, there exists a measure µ with 0 < µ(E) < ∞ and D̄_loc µ(x) ≤ s for all x ∈ E.

The following two theorems, of a more geometrical nature, are stated and proved in [10] and give us some knowledge about dimensions without explicitly calculating them. We will use them in Section 4.3 to prove the important Theorem 28, and there we will see how to apply these theorems in practice. In [10] both theorems are stated for X = R^n, but the same proofs work for general complete metric spaces.

Theorem 26. Let (X, d) be a complete metric space, and let E ∈ H(X), a > 0, r_0 > 0. Suppose that for every set U that intersects E, with diam(U) < r_0, there is a mapping g : E ∩ U → E satisfying

a (diam(U))^{−1} d(x, y) ≤ d(g(x), g(y))   (x, y ∈ E ∩ U).

Then, writing s = D_H(E), we have H^s(E) ≥ α(s) a^s > 0 and D̲_B(E) = D̄_B(E) = s, where α(s) is as in Def. 4.1.

In this theorem we require that any small enough part of the set can be mapped into a larger part of the set.

Theorem 27. Let (X, d) be a complete metric space, and let E ∈ H(X), a > 0, r_0 > 0. Suppose that for every closed ball B with centre in E and radius r < r_0 there is a mapping g : E → E ∩ B satisfying

a · r · d(x, y) ≤ d(g(x), g(y))   (x, y ∈ E).

Then, writing s = D_H(E), we have H^s(E) ≤ α(s) 4^s a^{−s} < ∞ and D̲_B(E) = D̄_B(E) = s, where α(s) is as in Def. 4.1.

Here the requirement differs from that of the previous theorem in that we want every small neighbourhood to contain an image of the whole set that is not too contracted.

4.3. IFS and Fractal dimensions

Previously, in Subsection 4.2, the only way we could really calculate any fractal dimension of a set was directly from the definitions.

In this subsection we will learn easier ways of obtaining dimensions for some special sets. If the set whose fractal dimension we want to determine can be described as the invariant set of a hyperbolic IFS in which all maps are similitudes, we can use Theorem 28 below, which is a combination of Thm. 2.7 in [10], Thm. 8 in [3] and Thm. 5.3.1 in [13], where the proofs of the different parts can be found.

In the proof of this theorem I will make use of the following definition, from [10].

Definition 4.8. If X is a metric space and A, B ⊂ X, we define the distance between A and B as

dist(A, B) = inf{ d(x, y) | x ∈ A, y ∈ B }.


Clearly this need not in general be a metric, but what we will use it for is the obvious fact that dist(A, B) > 0 if and only if A and B are disjoint, whenever A and B are compact. (The implication "⇒" is always valid.)

We also need a geometric lemma. This lemma is stated and proved in [10].

Lemma 4. Let (X, d) be a finite dimensional Euclidean metric space (i.e. X = E^m). Let r, a_1, a_2 > 0 be real numbers and let {V_i} be a collection of disjoint open subsets of X such that each V_i contains a ball of radius a_1 r and is contained in a ball of radius a_2 r. Then any ball B of radius r intersects at most (1 + 2a_2)^s a_1^{−s} of the closures V̄_i, where s is the Hausdorff dimension of X (which is also the Hausdorff dimension of each of the V_i).

Proof. If V̄_i meets B, then V_i is contained in the ball concentric with B of radius (1 + 2a_2)r. Suppose that q of the sets V̄_i intersect B. Then, summing the s-dimensional Hausdorff measures of the interior balls of radii a_1 r, we get

q · H^s('open ball of radius a_1 r') ≤ H^s('open ball of radius (1 + 2a_2) r').

By the scaling properties of the Hausdorff measures we get

q (a_1 r)^s H^s('open ball of radius 1') ≤ ((1 + 2a_2) r)^s H^s('open ball of radius 1').

Since s is the Hausdorff dimension of the space X, we have

0 < H^s('open ball of radius 1') < ∞,

so dividing by r^s and by the measure of an open unit ball we get what we wanted:

q ≤ (1 + 2a_2)^s (a_1)^{−s}.

Theorem 28. Let (X, S) be a hyperbolic IFS (X a complete metric space) in which the maps S_i ∈ S are similitudes with scaling factors s_i ∈ (0, 1), i = 1, . . . , N, and let A denote the unique attractor of this IFS.

(a) If either
(i) X is a complete metric space and the IFS is totally disconnected, or
(ii) X is a finite dimensional Euclidean space (so that the above lemma applies) and the IFS is just-touching,
then we have

D_H(A) = D_P(A) = D̲_B(A) = D̄_B(A) = D, and moreover 0 < H^D(A) ≤ P^D(A) < ∞,

where D is the so called similarity dimension of A, defined as the unique positive solution of

∑_{i=1}^N s_i^D = 1.

(b) If X is a complete metric space and the IFS is overlapping or just-touching, then we have the estimate

D_H(A) = D_P(A) = D_B(A) ≤ D,

where D is defined as in (a). Furthermore we have the estimate

H^D(A) < ∞.
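Before turning to the proof, note that the similarity dimension D is easy to compute numerically: the map D ↦ ∑_i s_i^D is strictly decreasing from N (at D = 0) towards 0, so the unique root can be bracketed and bisected. The following is a small illustrative sketch of my own (not from the cited sources):

    def similarity_dimension(scaling_factors, tol=1e-12):
        """Solve sum_i s_i^D = 1 for D by bisection; each s_i must lie in (0, 1)."""
        f = lambda D: sum(s ** D for s in scaling_factors) - 1.0
        lo, hi = 0.0, 1.0
        while f(hi) > 0.0:          # enlarge the bracket until f(hi) <= 0
            hi *= 2.0
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if f(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # For the Levy dragon of Example 9, s_1 = s_2 = 1/sqrt(2), and this returns D = 2.

Whether the full strength of part (a) applies to a given IFS depends on its separation properties; for an overlapping IFS, part (b) only gives the upper bound D_H(A) ≤ D.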

Proof. (a) (i) We first consider the case when the IFS is totally disconnected and X is any complete metric space. Define s_min = min_{1≤i≤N} s_i, the smallest of the scaling factors; clearly 0 < s_min < 1. Let φ and φ_n be defined as in Subsection 3.4, and let x ∈ A. Definition 3.12 tells us that there exists a unique address ω = (ω_1, ω_2, . . . ) ∈ Ω(N) s.t. x = φ(ω). So for all k > 0 we have x ∈ φ_k(ω, A) = S_{ω_1} ∘ . . . ∘ S_{ω_k}(A), and for fixed ω and k the map y ↦ φ_k(ω, y) is a composition of similitudes, hence a similitude with scaling factor c_k = s_{ω_1} · · · s_{ω_k}, 0 < c_k < 1. Pick r > 0 arbitrarily and let B = B(x, r). We can now take k so big that

s_min r < c_k · diam(A) ≤ r.

Thus the function φ_k(ω, ·) : A → A ∩ B is a similitude with scaling factor greater than s_min r (diam(A))^{−1}.

Now we can apply Theorem 27. Let r_0 > 0 and a = s_min (diam(A))^{−1} > 0. If B is a closed ball with centre x ∈ A and radius r < r_0, then by the arguments above there are ω ∈ Ω and k > 0 s.t. φ_k(ω, ·) : A → A ∩ B satisfies

d(φ_k(ω, y), φ_k(ω, x)) ≥ s_min r (diam(A))^{−1} d(x, y) = a · r · d(x, y),

which means that the requirements of Theorem 27 are fulfilled, and we get

(22) D_H(A) = D_P(A) = D_B(A) = D, and H^D(A) < ∞

(it will be shown later that D is the similarity dimension).

We now have to get the lower estimate for the Hausdorff dimension of A, which is still assumed totally disconnected. By Theorem 21, S_i(A) ∩ S_j(A) = ∅ for i ≠ j, and since A is compact, so is A_j = S_j(A) for all j = 1, . . . , N. Thus dist(A_i, A_j) > 0 for i ≠ j (dist as in Def. 4.8). Define δ = min_{i≠j∈{1,...,N}} dist(A_i, A_j) > 0.

Claim 28.1. We have

dist(A_{i_1,...,i_k}, A_{j_1,...,j_k}) ≥ s_{i_1} · · · s_{i_{k−1}} δ

for (i_1, . . . , i_k) ≠ (j_1, . . . , j_k).

Proof of Claim 28.1. Since S_j(A) = A_j ⊂ A for all j = 1, . . . , N, it is clear that for any sequence h = (h_1, . . . , h_k) ∈ Ω_k we have A_{h_1,...,h_k} ⊂ A_{h_1,...,h_{k−1}}, and inductively A_{h_1,...,h_k} ⊂ A_{h_1,...,h_n} for any n < k. If A ⊂ A′ and B ⊂ B′ are any subsets of X, the distance defined in Def. 4.8 clearly satisfies dist(A, B) ≥ dist(A′, B′). Thus

dist(A_{i_1,...,i_k}, A_{j_1,...,j_k}) ≥ dist(A_{i_1,...,i_n}, A_{j_1,...,j_n}),


for all integers 0 < n < k.

For (i_1, . . . , i_k) ≠ (j_1, . . . , j_k), define n_0 as the index of the first elements that differ, i.e.

i_n = j_n for n < n_0, and i_{n_0} ≠ j_{n_0}.

Clearly, if (i_1, . . . , i_k) ≠ (j_1, . . . , j_k) the sequences must disagree somewhere, so 1 ≤ n_0 ≤ k. From this we see that

A_{i_1,...,i_k} ⊆ A_{i_1,...,i_{n_0}} and A_{j_1,...,j_k} ⊆ A_{j_1,...,j_{n_0}},

so we have the following inequality:

dist(A_{i_1,...,i_k}, A_{j_1,...,j_k}) ≥ dist(A_{i_1,...,i_{n_0}}, A_{j_1,...,j_{n_0}}),

and since the sequences agree for all indices lower than n_0,

dist(A_{i_1,...,i_{n_0}}, A_{j_1,...,j_{n_0}}) = dist(A_{i_1,...,i_{n_0−1},i_{n_0}}, A_{i_1,...,i_{n_0−1},j_{n_0}}).

Remember that A_{i_1,...,i_{n_0−1}} = S_{i_1} ∘ · · · ∘ S_{i_{n_0−1}}(A) (= S_{i_1,...,i_{n_0−1}}(A) by definition), and that S_{i_1} ∘ · · · ∘ S_{i_{n_0−1}} is a similitude with scaling factor s_{i_1} · · · s_{i_{n_0−1}}, so we have

dist(A_{i_1,...,i_{n_0}}, A_{j_1,...,j_{n_0}}) = dist( S_{i_1,...,i_{n_0−1}}(S_{i_{n_0}}(A)), S_{i_1,...,i_{n_0−1}}(S_{j_{n_0}}(A)) )
= s_{i_1} · · · s_{i_{n_0−1}} dist(S_{i_{n_0}}(A), S_{j_{n_0}}(A))
≥ s_{i_1} · · · s_{i_{n_0−1}} δ
≥ s_{i_1} · · · s_{i_{k−1}} δ,

independently of n_0 ≤ k.

Let U ⊂ X intersect A, with 0 < diam(U) < δ, and let x ∈ U ∩ A.

Claim 28.2. We can find (i_1, . . . , i_k) ∈ Ω_k s.t. x ∈ A_{i_1,...,i_k} and

(*) δ s_{i_1} · · · s_{i_k} ≤ diam(U) < δ s_{i_1} · · · s_{i_{k−1}}.

Proof of Claim 28.2. Since A is totally disconnected there is a unique sequence ω = (i_1, i_2, . . . ) in Ω s.t. x ∈ A_ω, and thus x ∈ A_{i_1,...,i_l} for all l > 0. We want to prove by contradiction that there must exist k > 0 s.t. (*) is valid; assume that no such k exists.

We know that diam(U) < δ, so necessarily diam(U) < s_{i_1} δ, since otherwise (*) would hold for k = 1. Let l > 0 and assume that diam(U) < s_{i_1} · · · s_{i_{l−1}} δ; then, if (*) does not hold for k = l, we must have diam(U) < s_{i_1} · · · s_{i_l} δ, that is, diam(U) < s_{i_1} · · · s_{i_{(l+1)−1}} δ. Since diam(U) < s_{i_1} δ, the induction argument above shows that for every integer l ≥ 1 we must have

diam(U) < s_{i_1} · · · s_{i_l} δ.

Page 54: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

54 FREDRIK STROMBERG

If we let l → ∞ this implies that si1 . . . sil → 0 (since 0 < sim < 1),and thus diam(U)→ 0. But diam(U) is independent of l, so we musthave diam(U) = 0. This is a contradiction with the assumption ofdiam(U) > 0 when we first chose U .Therefore there exists k > 0 s.t.

δsi1 . . . sik ≤ diam(U) < δsi1 . . . sik−1

These two claims together imply that U must be disjoint from Aj1,...,jkfor all (j1, . . . , jk) 6= (i1, . . . , ik) ∈ Ωk. So A ∩ U ⊂ Ai1,...,ik . Hence wecan define the map

g : A ∩ U → A

by g = (Si1 · · · Sik)−1. Then g is a similtude with scaling factorc = (si1 . . . sik)−1. By Claim 28.2

c ≥ δ(diam(U))−1.

So the requirements of Theorem 26 are fulfilled and we get thatDH(A) =DP (A) = DB(A) = D and 0 < HD(A).Finally, if the sets Si(A), i = 1, . . . , N are disjoint we get D =DH(A) = DP (A) = DB(A) = DB(A), and

HD(A) =N∑i=1

HD(Si(A)) = (prop 18(iii)) =N∑i=1

sDi HD(A)

Since we have also obtained 0 < HD(A) <∞ we can divide the aboveequation by HD(A) to obtain

1 =N∑i=1

sDi ,

and we are done in the case when the IFS is totally disconnected andX is a complete metric space. .

(ii) Now we consider the case of when the IFS satisfies the open set con-dition (OSC) (cf. Def. 3.12).As always it is much easier to obtain an upper estimate for the Haus-dorff measure, so we start with that. Let D be the similarity dimensionof the IFS, that is D satisfies

N∑i=1

sDi = 1

Let Ωk = Ωk(N) = (i1, . . . , ik) | 1 ≤ ij ≤ N, j = 1, . . . (cf. Subsec-tion 2.5.2), and we know that A =

⋃Ni=1Ai, so we clearly have

A =⋃

(i1,...,ik)∈Ωk

Ai1,...,ik

for any k > 0, and we shall check that these covers of A gives us anupper estimate for the Hausdorff measure of A. Since the map Si1,...,ik

Page 55: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

ITERATED FUNCTION SYSTEMS , THE CHAOS GAME AND INVARIANT MEASURES 55

is a similarity with scaling factor si1 . . . sik we get∑(i1,...,ik)∈Ωk

diam(Ai1,...,ik)D =∑

(i1,...,ik)∈Ωk

(si1 . . . sik)Ddiam(A)D

= (N∑i1=1

sDi1) . . . (N∑ik=1

sDik)diam(A)D

= diam(A)D

Now for any δ > 0, we can choose k s.t.

diam(Ai1,...,ik) ≤ ( max1≤j≤N

cj)kdiam(A) ≤ δ

So the collection Ai1,...,ik(i1,...,ik)∈Ωk is a δ-cover of A.

=⇒ hDδ (A) ≤∑

(i1,...,ik)∈Ωk

diam(Ai1,...,ik)D = diam(A)D

for all δ > 0 =⇒ HD(A) ≤ diam(A)D <∞, and thus DH(A) ≤ D.We have not used that the IFS satisfied the OSC, so we have provedthat if X is a complete metric space, then

(*) HD(A) <∞, and DH(A) ≤ D,where D is defined by

N∑i=1

sDi = 1

Now to obtain the lower estimate we shall show that the invariantmeasure of the IFS satisfies the conditions of the Mass DistributionPrinciple, Prop. 21. To prove this I want to use Lemma 4 and in orderto do that I have to assume that X is a finite dimensional Euclideanspace Consider the probabilistic IFS (X,S, ~p), where the probabilitiesare defined by pi = sDi (we assumed

∑Ni=1 s

Di = 1.) Let ρ be the

associated Bernoulli measure in definition 3.11. i.e.

ρ(ω ∈ Ω |ωl = il, n ≤ l < n+ k) = (sin . . . sin+k−1)s

Clearly ρ(Ω) = 1. Denote the cylinder sets in Ω with fixed initialsequence by

Ii1,...,ik = ω ∈ Ω |ωl = il, 1 ≤ l ≤ kclearly ρ(Ii1,...,ik) = (ci1 . . . cik)D

Now the invariant measure µ of the IFS can be obtained by Theorem20, and is defined to be

µ(E) = ρ φ−1(E), E ⊂ X,where φ : Ω → A is the continuous function from Theorem 18. Weclearly have µ(A) = 1, and the support of µ is contained in A.Let V be the open set asserted by the OSC, then

V ⊃W (V ) =N⋃i=1

Si(V )

Page 56: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

56 FREDRIK STROMBERG

and remark 25, implies that W k(V ) A. In particular A ⊂ V andAi1,...,ik ⊂ V i1,...,ik for all (i1, . . . , ik) ∈ Ωk. Let B be any ball of radiusr < 1 that intersects A. We want to estimate µ(B), and this is done inprinciple by counting the sets Vi1,...,ik with diameters comparable to r(in a sense to be specified later) and with closures intersecting A ∩B.

Claim 28.3.For any ω = (i1, i2, . . . ) ∈ Ω there exists at least one integer k > 0 s.t.

(23) ( min1≤i≤N

si)r ≤ si1 . . . sik ≤ r

Proof of Claim 28.3.We can assume by appropriate reordering that

min1≤i≤N

si = s1, and that max1≤i≤N

si = sN

Define the sequence of real numbers

aj = si1 . . . sij

Now, since 0 < si < 1, i = 1, . . . , N it is clear that 0 < aj < 1, j =1, . . . . What we want to show is that there exists an integer k > 0 s.t.ak ∈ [s1r, r] ⊂ (0, 1). We are going to do this by contradiction.Assume that there is no k s.t. ak ∈ [s1r, r].

a1 = si1 ≥ s1 > s1r =⇒ a1 > r, since else k = 1 would work(1)⇓

a2 = a1si2 > si2r ≥ s1r =⇒ a2 > r, since else k = 2 would work(2)⇓...⇓

an = an−1sin > sinr ≥ s1r =⇒ an > r, since else k = n would work(n)

This shows that for any n we have 1 > a1 > a2 > · · · > an > r. But ifwe define δj to be the distance between successive aj ’s we get

δj = |aj − aj−1| = |aj−1(sij − 1)| > r(1− sij ) ≥ r(1− sN ) = δ0 > 0.

So the spacings between successive aj ’s are greater than a fixed numberδ0 > 0 and we can not possibly squeeze in n of aj ’s between 1 and r if nis large enough (n > 1−r

δ0will do). This is a contradiction to the above

conclusion ”1 > a1 > a2 > · · · > an > r”, so our assumption thatthere were no suitable k must be wrong. Consequently there exists anumber k > 0 s.t.

ak−1 > r, and s1r ≤ ak ≤ rThus

s1r ≤ si1 . . . sik ≤ r

Page 57: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

ITERATED FUNCTION SYSTEMS , THE CHAOS GAME AND INVARIANT MEASURES 57

Now we are able to define Q by

Q = (i1, . . . , ik) |(i1, i2, . . . ) ∈ Ω

and k is the smallest integer s.t. eq. (23) holdsClaim 28.3 tells us that for every sequence (i1, i2, . . . ) ∈ Ω there is aunique k > 0 s.t. (i1, . . . , ik) ∈ Q.We know from the formulation of the OSC that Vi ∩ Vj = ∅ for i, j ∈1, . . . , N and i 6= j, and the next claim is a generalisation of this, tothe case of distinct sequences i = (i1, . . . , ik) 6= j = (j1, . . . , jk).

Claim 28.4.If (i1, . . . , ik) 6= (j1, . . . , jk), then Vi1,...,ik ∩ Vj1,...,jk = ∅.

Proof of Claim 28.4.We start by showing that Vi1,...,ik,m∩Vi1,...,ik,n = ∅ whenever m 6= n ∈1, . . . , N, for any (i1, . . . , ik) ∈ Ωk. As usual we note that but

Vi1,...,ik,m = Si1 · · · Sik(Vm), and

Vi1,...,ik,n = Si1 · · · Sik(Vn),

and that F = Si1 · · · Sik is a similtude with scaling factor c =si1 . . . sik . Assume that x ∈ F (Vm) ∩ F (Vn), this implies that thereexists v ∈ Vm, and w ∈ Vn s.t. F (v) = x, and F (w) = x by definition.But now, clearly

0 = d(x, x) = d(F (v), F (w)) = c · d(v, w)

and 0 < c < 1 =⇒ d(v, w) = 0. By the definition of a metric(Def. 2.10) we see that v = w. This is a contradiction to the factthat Vm and Vn were disjoint, and so the assumption we made of theexistence of x ∈ F (Vm) ∩ F (Vn) must be wrong. Thus we concludethat Vi1,...,ik,m ∩ Vi1,...,ik,n = F (Vm) ∩ F (Vn) = ∅, independently of(i1, . . . , ik) ∈ Q.Now let (i1, . . . , ik) 6= (j1, . . . , jk) ∈ Q. Define the integer n0 > 0 bythe condition

in0 6= jn0 , but in = jn, n < n0

i.e. n0 is the first index n at which in and jn disagree. It is clear thatif i, j ∈ 1, . . . , N then Vj ⊂ V and then Vi,j = Si(Vj) ⊂ Si(V ) = Vi.Recursively this implies that

Vi1,...,ik ⊂ Vi1,...,ik−1 ⊂ · · · ⊂ Vi1,...,in0⊂ · · · ⊂ Vi1 , and

Vj1,...,jk ⊂ Vj1,...,jk−1 ⊂ · · · ⊂ Vj1,...,jn0⊂ · · · ⊂ Vj1

So

Vi1,...,ik ⊂ Vi1,...,in0−1,in0

Vj1,...,jk ⊂ Vj1,...,jn0−1,jn0= (by Def. of n0) = Vi1,...,in0−1,jn0

By the conclusion above we get what we want:

Vi1,...,ik ∩ Vj1,...,jk = ∅,whenever (i1, . . . , ik) 6= (j1, . . . , jk).

Page 58: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

58 FREDRIK STROMBERG

Now, if a ∈ A, there exists a sequence (i1, i2, . . . ) ∈ Ω s.t. a ∈ Ai1,i2,...and thus there exists an integer k > 0 s.t. (i1, . . . , ik) ∈ Q, anda ∈ Ai1,...,ik =⇒

A ⊂⋃

(i1,...,ik)∈Q

Ai1,...,ik ⊂⋃

(i1,...,ik)∈Q

V i1,...,ik

Now choose a1 and a2 s.t. V contains a ball of radius a1 (possiblesince V is open and X metric), and is contained in a ball of radius a2

(possible since V ⊂ A, and A is compact).Then since B(x, a1) ⊂ V ⇒ Si1 · · · Sik(B(x, a1)) ⊂ Vi1,...,ik for(i1, . . . , ik) ∈ Q, the set Vi1,...,ik contains a ball of radius si1 . . . sika1,and from the construction ofQ we deduce that ( min

1≤i≤Nsi)a1r ≤ si1 . . . sika1

so V must contain a ball of radius si1 . . . sika1. Similarly si1 . . . sika2 ≤a2r, so V must be contained in a ball of radius a2r.Let Q1 denote those sequences (i1, . . . , ik) in Q s.t. B intersectsV i1,...,ik . By Lemma 4 there are at most q = (1 + 2a2)s(a1 min

1≤i≤Nsi)−s

sequences in Q1 , where s is the Hausdorff dimension of X. We usethe same notation as before for the cylinder sets,

I(i1,...,ik) = (ω1, ω2, . . . ) ∈ Ω | ωl = il, l = 1, . . . , k

Then using the facts that x ∈ A ∩ B ⇒ ∃(i1, . . . , ik) ∈ Q1 s.tx ∈ V i1,...,ik and also the inequality obeyed by all sequences in Q, thusQ1, si1 . . . sik ≤ r we get

µ(B) = µ(A ∩B) ≤ ρ((i1, i2, . . . ) | xi1,i2,... ∈ A ∩B)

≤ ρ(⋃

(i1,...,ik)∈Q1

Ii1,...,ik)

=∑Q1

ρ(Ii1,...,ik)

=∑Q1

(si1 . . . sik)D

≤∑Q1

rD ≤ rDq

Since any set U is contained in a ball of radius diam(U), we have

µ(U) ≤ (diam(U))Dq,

so the requirements of the mass distribution principle (Prop. 21) aremet, and we get the conclusion

Hs(A) ≥ q−1 > 0, and DH(A) = D.

Now we have both the upper and lower estimate for Hs(A) but we stillneed to check that the upper box dimension equals s.If Q is any set of sequences s.t. for every (i1, i2, . . . ) ∈ Ω there ex-ists exactly one integer k s.t. (i1, . . . , ik) ∈ Q it follows inductivelyfrom

∑Ni=1 s

Di = 1, that

∑Q(si1 . . . sik)D = 1 Thus if Q is chosen as

Page 59: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

ITERATED FUNCTION SYSTEMS , THE CHAOS GAME AND INVARIANT MEASURES 59

we did above, si1 . . . sik ≥ mini sir and thus 1 =∑Q(si1 . . . sik)D ≥∑

Q(mini sir)D, and the number of terms in the sum can not be greaterthan (mini sir)−D. Thus Q contains at most q = (mini sir)−D se-quences. For each sequence (i1, . . . , ik) ∈ Q we have

diam(V i1,...,ik) = si1 . . . sikdiam(V ) ≤ rdiam(V ),

so A can be covered by q sets of diameter rdiam(V ), ∀0 < r < 1. Itfollows that from the equivalent definition of box dimension in Remark32 (i) that DB(A) ≤ D, so we have D ≤ DH(A) ≤ DP (A) ≤ DB(A) ≤D and there must be equality all the way.

D = DH(A) = DP (A) = DB(A) = D

So all conclusions of the theorem concerning just touching IFS arearrived at.

(b) Let (X, d) be a complete metric space, and let the IFS (X,S) be over-lapping. We note that the first half of (a).(i) leading to Equation 22 stillapplies, and hence that DH(A) = DP (A) = DB(A), and HDH (A) <∞. Ifwe let D be the similarity dimension of the IFS we can repeat the first partof (a).(ii) leading to Equation * which tell us that DH(A) ≤ D.

Putting all this together we conclude that

DH(A) = DP (A) = DB(A) ≤ D, and HDH(A)(A) <∞,

if X is any complete metric space, and the IFS is just touching or overlap-ping.

The version of this theorem found in [3] is often more useful in practice, eventough it only gives bounds for the Hausdorff dimension of the invariant set A, it isalso more general since there is no need for the functions Si to be similtudes, theyneed only be bi-Lipschitz mappings (c.f Def. 2.22).

Theorem 29.Let (X,S) be a hyperbolic IFS with attractor A. Assume that X ⊆ Rn, and thatthe maps S are bi-Lipschitz, s.t. for all i = 1, . . . , N there exists 0 < si ≤ si < 1s.t.

si|x− y| ≤ |Si(x)− Si(y)| ≤ si|x− y|, ∀x, y ∈ X and i = 1, . . . , N.

Define u and l as the two unique positive solutions toN∑i=1

sli = 1 andN∑i=1

sui = 1

Then we have(a) If the IFS is totally disconnected then we have minn, l ≤ DH(A) ≤ u,

and(b) if Si(A) ∩ Sj(A) 6= ∅ for some i 6= j, then we still have the upper bound,

DH(A) ≤ u.

Proof.See [3].

Page 60: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

60 FREDRIK STROMBERG

Here come some examples of the definitions and techniques we have seen in thissection.

Example 10 (Fractal dimension of Cantor set).To illustrate the notion of fractal dimension we take a look at a simple case on R,and the, so called, classical Cantor middle-thirds set.

First let the space X = R, and define a sequence of sets by starting with the unitinterval, C0 = [0, 1] and then successively form Cn by removing the open middlethirds of the intervals making up Cn−1. That is we have

C0 = [0, 1],

C1 = [0,13

] ∪ [23, 1],

C2 = [0,19

] ∪ [29,

39

] ∪ [69,

79

] ∪ [89, 1],

...

Cn = Cn−1 \ ”the open middle third of each interval that build up Cn−1”.

This gives us that Cn consists of 2n closed disjoint interval of length 1/3n. Wewrite

Cn =2n⋃j=1

I(n)j , diam(I(n)

j ) =13n,

and Cl ⊂ Ck for all integers l, k s.t. l > k. That is, the sequence Ckk∈N iswhat is usually called a decreasing sequence of sets. Now we define the Cantorset, or the Cantor middle-thirds set, C as the intersection of all these Cn’s.

C =∞⋂n=0

Cn.

Of course C is closed and bounded, and thus compact. This set has manyinteresting properties, of which we will only consider some here. In [4] the followingresults are proved:

(a) C contains no open interval.(b) C has the same cardinality as R, but Lebesgue measure = 0.

Now we shall use what we have learnt in the previous section to calculate someof the fractal dimensions of the Cantor set C.

We start with the first approach we should think of, namely to find the box-counting dimension by means of inspection of Figure 10. We assume that bothlower and upper box-counting dimensions exist and are equal, so that the box-counting dimension is defined. This is because we want to just take lim and notto care about lim sup or lim inf. Afterwards, when we find that the limit exists weconclude that this assumption was justified.

(a) First we note that since each Cn consists of 2n closed disjoint intervals oflength 1/3n we will need exactly 2n closed balls of radius 1/3n to coverCn. (Balls in R are of course just intervals). Now let εk = 1/3k, thenN (Ck, εk) = 2k. With the definition of N (Ck, εk) as the second alternativedefinition in Remark 32.

Page 61: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

ITERATED FUNCTION SYSTEMS , THE CHAOS GAME AND INVARIANT MEASURES 61

Since C is contained in Ck for all k, we must then have N (C, εk) ≤ 2k.The claim is that N (C, εk) = 2k.

Assume now that N (C, εk) < 2k for some integer k > 0, and let Ji, i =1, . . . , l, where l < 2k, be a set of intervals of length εk that covers C.

Since clearly C is contained in Ck+1 it is enough to show that Ck+1 cannot be covered with less than 2k intervals of length εk to prove the claim.

One clearly see that if x is an endpoint of an interval constituting Cnthen x ∈ C.

From Figure 10 it is clear that Ck+1 can not be covered with less than2k intervals of length εk if all endpoints of the intervals are to be covered.

Thus N (C, εk) = 2k, and Definition 4.5 gives us, if we assume that wecan let ε tend to zero in the same manner as εk tends to zero.

DB(C) = DB(C) = lim supε→0

lnN (C, ε)

ln(1ε )

: ε ∈ (0, ε)

= lim supk→∞

lnN (C, εl)

ln( 1εl

): k < l <∞

=

ln(N (C, εl)) = ln(2l) = l · ln(2), and ln(

1εl

) = ln(3l) = l · ln(3)

= lim supk→∞

l · ln(2)l · ln(3)

: k < l <∞

=

ln(2)ln(3)

.

So the box-counting dimension of the cantor set is,

DB(C) =ln(2)ln(3)

≈ 0.63 .

(b) Next approach is to use the local dimensions defined in Definition 4.7, to-gether with Proposition 22. We seek a Borel measure to use. Unfortunatelyas we noticed above, the Lebesgue measure of C is 0. So we can not usethat Borel measure. But we stated solely as an explanation of the nomen-clature in subsection 2.5.2 that the code space of two symbols is isomorphicto the Cantor middle-thirds set. Now that we know the Cantor set it isalmost obvious, since for any point x ∈ C we can find a sequence ω ∈ Ω(2)by choosing ωi = 0 if the point x belongs to the left side of the removedmiddle interval in step n in the construction, and ωn = 1 if x belongs tothe right part. See also Figure b.

Conversely for every sequence of 0s, and 1s in Ω(2) we can find exactlyone point in C by choosing the proper right or left interval in each step.That this gives a unique single point for an infinite sequence follows fromthe Cantor Intersection Theorem, see for example [21].

Now that we know that Ω(2) and C are isomorphic we can define a Borelmeasure on C via a Borel measure on Ω(2), and to find a good candidatewe use some IFS theory.

We can notice that C is invariant for the IFS (R, S1, S2), where S1(x) =13x, and S2(x) = 1

3x+ 23 as can be easily verified from the construction of C

above. Then if we look at for example the probabilistic IFS (R, S1, S2, ~p =

Page 62: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

62 FREDRIK STROMBERG

( 12 ,

12 )), we can now define the invariant Borel measure µ as in Theorem 20.

Where the Bernoulli measure now is given by

ρ(ω ∈ Ω(2) |ωl = il, n ≤ l < n+ k) =n+k−1∏l=n

12

= (12

)k,

and since we know one way (from above) to identify C and Ω(2) we knowwhat φ : C → Ω(2) is. Furthermore µ has support in C so µ(C) > 0,therefore it is now sensible to try to find the local dimensions of µ and tryto use the theorem.

Let x ∈ C, rn = 13n , and the closed ball Bn = B(x, rn) = y ∈ R | |x−

y| ≤ rn. Then Bn is an interval, and since x ∈ C, the intersection Bn ∩Cis contained in exactly one of the intervals I(n)

j from the construction of Cfor some j ∈ 1, . . . , 2n. This gives us the following formula for µ(Bn).

µ(B(x, rn)) = ρ(φ−1(B(x, rn))) = ρ(φ−1(I(n)j ))

= ρ(ω ∈ Ω(2) | ωl = il, 1 ≤ l ≤ n) =n∏l=1

12

= (12

)n

Where i = i1i2 . . . in are described by ik = 0 if I(n)j and thus x is contained

in the left part of Ck in the construction of C, and ik = 1 respectively if itis contained in the right one.

Thus we see that µ(B(x, rn)) = ( 12 )n independently of x ∈ C and Def-

inition 4.7 gives us the following upper and lower local dimension of µ atany x ∈ C.

Dlocµ(x) = lim supn→∞

lnµ(B(x, rn))ln rn

= lim supn→∞

ln(12 )n

ln(13 )n

= lim supn→∞

−n ln 2−n ln 3

= lim supn→∞

ln 2ln 3

=ln2ln3

, and

Dlocµ(x) = lim infn→∞

lnµ(B(x, rn))ln rn

...

= lim infn→∞

ln 2ln 3

=ln2ln3

.

With s = ln 2ln 3 all four requirements in Proposition 22 (a) - (d) are fulfilled

with µ as defined. Thus we get s ≤ DH(C) ≤ s, and s ≤ DP (C) ≤ s. Sowe have that the Hausdorff and packing dimension of the Cantor set C isequal and

DH(C) = DP (C) =ln(2)ln(3)

.

Page 63: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

ITERATED FUNCTION SYSTEMS , THE CHAOS GAME AND INVARIANT MEASURES 63

(c) We did use a little IFS theory above, but now we will discover the realstrength of the IFS approach, in this special case of the Cantor set.

As we noted above, the Cantor set is invariant under the IFS (R, S1, S2),where S1(x) = 1

3x, and S2(x) = 13x+ 2

3 (which can be easily verified fromthe construction of C).

Both S1 and S2 are similtudes with scaling factors s1 = s2 = 13 . We also

have

S1(C) = ”the left third of C ” = C ∩ [0,13

], and

S1(C) = ”the right third of C ” = C ∩ [23, 1], thus

S1(C) ∩ S2(C) = ∅.The requirements of Theorem 28 are now fulfilled, and this theorem gives

us that DH(C) = DP (C) = DB(C) = DB(C) = D, where D is the uniquepositive solution to equation a. But in this case the calculation is easy,

N∑i=1

|si|D = 1,⇒ (13

)D + (13

)D = 1 =⇒ (13

)D =12,⇒

−D ln 3 = − ln 2 =⇒ D =ln 2ln 3

.

So by this method we achieved the following dimensions

DH(C) = DP (C) = DB(C) = DB(C) =ln(2)ln(3)

.

(d) We can also use implicit the method, indicated in [10] which is an applica-tion of Theorem 26 together with Theorem 27.

First to use Theorem 26 we let a = 1, and r0 = 13n . Then let U be any

subset of R s.t. U ∩ C 6= ∅, and s.t diam(U) < r0 = 13n . From Figure 10,

and the construction of C above we see that U ∩ C must be contained inexactly one of the 2n remaining intervals of length 1

3n in Cn, denote thisinterval IU = [aU , aU + 1

3n ] ⊂ [0, 1].Now define the map g : C ∩ U → C by

g(x) = 3n(x− aU )

We can see that g is a map into C geometrically from the constructionof C. It is clear from the construction procedure that the part of C insideIU must be similar to the set C itself. What the map g does is just toprovide that similarity transformation. (We move the smaller copy of Cand enlarges it enough to fit the original.)

Now we have, for x, y ∈ U ∩ C

a1

diam(U)|x− y| = 3n|x− y| = |g(x)− g(y)|,

and so the requirement of Theorem 26 are fulfilled, and the implication thatwe are interested in are HDH(C)(C) > 0, and DB exists and is equal to DH .

The next step is to use Theorem 27. Let n ∈ N, r0 = 12

13n , and a = 2

3 .Then let B be any interval (ball) of length 2r, with centre in C, and wherer < r0

Page 64: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

64 FREDRIK STROMBERG

With r so chosen we see that B must be contained in exactly one ofthe intervals I(n)

j from the construction of C, but we want to obtain thesmallest such interval.

Let m be the largest integer k ∈ N s.t. B is contained in some I(k)j , but

B is not totally contained in any I(k+1)j , i.e. 1

3m+1 ≤ 2r < 13m ≤

13n = r0

It is clear that B ∩ C must be contained in a unique interval I(m+1)j ,

denote this interval by IB = [aB , aB + 13m+1 ]. Thus B∩C = IB ∩C, and by

the same argument as above IB ∩ C is similar to C, and if h denotes thatsimilarity transformation from C to IB ∩ C we can define h : C → C ∩ Bby

h(x) =1

3m+1x+ aB

Now, for x, y ∈ C we have

ar|x− y| = 13

2r|x− y| < 13m+1

|x− y| = |h(x)− h(y)|.

So the requirements of Theorem 27 are fulfilled, and thus from this the-orem we get that HDHC(C) <∞.

Now we are about to use what the two theorems have given us, i.e.that , if s = the Hausdorff dimension of the cantor set, the s-dimensionalHausdorff measure of C obeys the following inequality 0 < HDHC(C) <∞.

Clearly if we let S1(x) = 13x, and S2(x) = 1

3x + 23 as previously, we

have that C = S1(C) ∪ S2(C) (a disjoint union). The maps S1 and S2

are Lipschitz maps, with Lipschitz constants equal to 13 , so Proposition 18

implies that Hs(S1(C)) = Hs(S2(C)) = 13sH

s(C). So denoting DH(C) bys and summing what we know about the s-dimensional Hausdorff measurewe get

Hs(C) =Hs(S1(C)) + Hs(S2(C)) =

=13s

Hs(C) +13s

Hs(C)

=213s

Hs(C),

and since Hs(C) is different from ∞ and 0 we can divide it out and so get1 = 2 1

3s ,=⇒

s =ln 2ln 3

Example 11 (Koch curve).This example comes from Hutchinson [13] and first we will use it to show that twodifferent sets of maps can generate the same invariant set.

(a) We refer to Figure 5. Let a1, a2, , a3, a4 and a5 be as shown in the figure,and let S = S1, S2, S3, S4 where Si : R2 → R

2 is the unique similtudethat maps the vector −−→a1a2 to the vector −−−−→aiai+1 . Using Prop 1 it is easy tofind these maps explicitly ( just consider shrinking , moving and rotating

Page 65: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

ITERATED FUNCTION SYSTEMS , THE CHAOS GAME AND INVARIANT MEASURES 65

the vector −−→a1a2 (= the unit vector (1, 0) ∈ R2. Let(xy

)∈ R2.

S1((xy

)) =

(13 00 1

3

)(xy

), S2(

(xy

)) =

(16 − 1

2√

31

2√

316

)(xy

)+(

130

),(24)

S3((xy

)) =

(16

12√

3

− 12√

316

)(xy

)+( 1

21

2√

3

),(25)

S4((xy

)) =

(13 00 1

3

)(xy

)+(

230

)(26)

Now let S ′ = S′1, S′2 where S′1 : R2 → R2 is the unique similtude

mapping −−→a1a5 to −−→a1a3, having a negative determinant, and S′1 : R2 →R

2 is the unique similtude mapping −−→a1a5 to −−→a3a5, also having a negativedeterminant (this means that the orthogonal transformation component inthese similtudes have determinant -1, i.e. contains a reflection component).These maps can also easy be found explicitly.

S′1((xy

)) =

(12

12√

31

2√

3− 1

2

)(xy

),(27)

S′2((xy

)) =

(12 − 1

2√

3

− 12√

3− 1

2

)(xy

)+( 1

21

2√

3

)(28)

Now, the invariant set of S, AS is what is called the Koch Curve K. Thetwo sets of maps S and S ′ are different, but they have the same invariantset K.

There are many different ways to see this, of which I have used two.First you can do as I did first to verify this: use the Chaos Game to plot

approximate pictures of both sets in the same window, (this is how/whyFigure 5 was made) and then conclude that the two so constructed setsmust be equal by inspection.

Alternatively, notice that S′1(K) ∪ S′2(K) = WS′(K) = K. This can beachieved also by first plotting K, and then S′1(K), and S′2(K) in the samewindow just to realise that S′1(K) and S′1(K) are nothing but the left andright halves of K respectively. Thus their union is all of K. (This picturewill also look just like Figure 5 in B/W so I am not including it).

Then it is an easy exercise to see that S′iS′1 = S1, S′1S′2 = S2, S

′2S′1 =

S3, and S′1 S′2 = S4. Either by moving the vectors directly in your head(or on paper), or by using matrix multiplication, or even to plot the actionson [0, 1] of both sides of an equality sign. This means that the Hutchinsonoperators corresponding to the two sets WS and WS′ are related throughW 2S′ = WS . Therefore their invariant sets must be the same.If you think that drawing pictures does not prove anything about the

invariant sets, you have to come up with some mathematical proof.The usual way, (see for example [13]) of providing a ”real proof” is to note

that the invariant set is the closure of fixed points of finite combinationsSi1,...,ip of maps from S, so to ”prove” that AS = AS′ we just have to verifythat for every calculated fixed point si1i2...ip , of the map Si1i2...ip , where

Page 66: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

66 FREDRIK STROMBERG

ik ∈ 1, 2, 3, 4, k = 1, . . . , p, there exists a sequence j1, j2, . . . , jq, wherejk ∈ 1, 2, k = 1, . . . , q s.t. S′j1j2...jq (si1i2...ip) = si1i2...ip ,

Likewise for fixed points of combinations of maps in S ′ to find combina-tions in S with the same point as fixed point.

This procedure will give a proof of the equality of the attractors, but Ithink I’d rather make a nice picture.

(b) Then it might be interesting to calculate the fractal dimension of the KochCurve in the two ways we know of (cf. section 4.2).(i) Using the experimental ’ball counting’ definition of fractal dimension,

Def. 4.5 and the picture of K in Figure 5. Begin by letting r = 1, itis clear that it is only needed a single ball of radius = 1 to cover K,so N (K, 1) = 1. Now look inductively at the sequence rk = 1

3k. If

r = r1 = 13 we see that a single ball of radius r can not cover both a1

and a5, but two can, thus N (K, 13 ) = 2. By the appearance of the set

K as constructed out of pieces of the form

(but with edges replaced by smaller copies of itself etc.) we see thatwe need 2 · 4k balls with radius rk to cover K, and we conclude thatfor large k we have N (K, rk) = 2 · 4k. Thus we can now calculate thefractal dimension D(K).

D(K) = lim supε→0

lnN (K, ε)

ln(1ε )

: ε ∈ (0, ε)

=

= lim supk→∞

lnN (K, rl)

ln( 1rl

): k < l <∞

=

=

ln(N (K, rl)) = ln(2 · 4l) = ln(2) + l · ln(4), and

ln(1rl

) = ln(3l) = l · ln(3)

=

= lim supk→∞

ln(2) + l · ln(4)

l · ln(3): k < l <∞

=

ln(4)ln(3)

,

and as we could expect from the figure the dimension here, D(K) > 1indicating that the Koch curve consists of a continuum of points, thatis it is at least a line, but it is even a very rough line, so it has a greaterdimension than just a line.

(ii) That was the ”hard” way, relying on our geometrical feeling of cov-erings, let us now instead use the fact that all maps in S and S ′ aresimiltudes together with Theorem 28.To make use of this theorem we have to consider an IFS, (R2,S), andwe must show that this IFS is totally disconnected or just-touching.But by the construction of the maps in S it is easy to see, (by shrink-ing, rotating and translating by proper figures) that the intersectionsSi(K) ∩ Sj(K), ∀i 6= j ∈ 1, 2, 3, 4, is probably empty, but certainly

Page 67: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

ITERATED FUNCTION SYSTEMS , THE CHAOS GAME AND INVARIANT MEASURES 67

can not consist of anything more than one of the points a2, a3 or a4.So if we let O = K \ a1, a5 then O ⊂ K is open (in K), since weare removing a two point set, which are closed in K (since closed inR

2), and also the intersections Si(O) ∩ Sj(O), ∀i 6= j ∈ 1, 2, 3, 4is now empty, (since we have taken away the only points that trans-forms into a2, a3 or a4). Thus the IFS (R2,S) satisfies the open setcondition and either is totally disconnected or just-touching. So therequirements of the theorem are fulfilled, the IFS is hyperbolic andtotally disconnected or just-touching.Then note that the scaling factor of a similtude is given by the squareroot of the determinant of the corresponding matrix. Equation (24)gives us the following scaling factors : s1 = s2 = s3 = s4 = 1

3 .Theorem 28 implies that the fractal dimension D = D(K) is given bythe unique solution to the equation

N∑i=1

|si|D = 1,

and now∑Ni=1 |si|D =

∑4i=1( 1

3 )D = 4( 13 )D = 1, ⇒ ( 1

3 )D =( 1

4 ) ⇒ −D ln(3) = − ln(4) ⇒ D = ln(3)ln(4) .

Thus the theoretically calculated fractal dimension is equal to the ex-perimentally discovered D = D(K) = ln(3)

ln(4) .(iii) To calculate the fractal dimension of K we can of course use the maps

in S ′ as well. The same arguments as above apply, except that it isobvious that S′1(K)∩S′2(K) can at most contain the point a3. This isso since S′1 maps K to the part of K to the left of a3 with S′1(a5) = a3

and S′2 maps K to the part of K to the right of a3 with S′2(a1) = a3.Hence the only common point of their images is a3.We have that the IFS (R2,S ′) is a hyperbolic just-touching IFS, thatsatisfies the OSC. The scaling factors obtained from the matrices inequation (27) are s′1 = s′2 = 1√

3, and the fractal dimension are given

by∑2i=1 |s′i|D =

∑2i=1( 1√

3)D = 2( 1√

3)D = 1, ⇒ ( 1√

3)D =

( 12 ) ⇒ −D ln(

√3) = − ln(2) ⇒ D · 1

2 ln(3) = ln(2) ⇒ D ln(3) =2 ln(2) = ln(4), ⇒ D = ln(4)

ln(3) , and we are relieved by the fact that weagain obtained the same fractal dimension as the experimental one.

One thing that this example told us was that two different sets of maps couldgenerate the same invariant set.

References

[1] M.A. Armstrong, Basic topology, Springer-Verlag (UTM), 1983, first ed. by MacGraw-Hill,

1979.[2] Michael F Barnsley, Fractals everywhere, second ed. Academic press, 1993, first edition pub-

lished in 1988.[3] M. F. Barnsley and S. Demko, Iterated function systems and the global construction of

fractals, Proc. R. Soc. Lond. A 399 (1985), 243-275.

[4] Donald L. Cohn, Measure theory, Birkhauser Boston, 1993, first printed 1980.[5] J. L. Doob, Measure Theory, Springer Verlag, New York, 1994.

Page 68: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

68 FREDRIK STROMBERG

[6] Nelson Dunford and Jacob T. Schwartz, Linear operators, Part I: General theory, Intersciencepublishers inc., New York, 1958.

[7] P. Duvall, J. Keesling, The Hausdorff dimension of the boundary of the Levy dragon, preprint,

1991.[8] Lawrence C. Evans, Ronald F. Gariepy, Measure theory and fine properties of functions, CRC

Press, 1992.[9] Kenneth Falconer, Fractal Geometry - Mathematical Foundations and Applications, Wiley,

paperback ed. 1997, first ed. 1990.

[10] Kenneth Falconer, Techniques in fractal geometry, Wiley, 1997.[11] H. Federer, Geometric Measure Theory, Springer Verlag, New York, 1969.

[12] Erland Gadde, Stable Iterated Function Systems, Doctoral Thesis No. 4, Dept. of Math.University of Umea, 1992.

[13] John E. Hutchinson, Fractals And Self Similarity, Indiana Univ. Math. J. Vol. 30, No. 5

(1981), pp. 713-747.[14] Leif Abrahamsson and Sten Kaijser, lecture notes, 1993.

[15] Kennan Shelton, An introduction to iterated function systems, preprint, 1996.[16] Erwin Kreyzig, Introductory Functional Analysis with Applications, Wiley Classics Library

Ed. , Wiley 1989, first edition published 1978.[17] S.T. Rachev, Probability metrics and the stability of stochastic models, Wiley, 1991.[18] C.A. Rogers, Hausdorff Measures, Cambridge University Press, 1970.

[19] Walter Rudin, Functional Analysis, McGraw-Hill, 1973.[20] Walter Rudin, Principles of Mathematical Analysis, Third Ed., McGraw-Hill International

editions, Singapore, 1976.[21] George F. Simmons, Introduction to Topology and Modern Analysis, McGraw-Hill Interna-

tional editions, Singapore, 1963.[22] Wolfgang J. Thron, Topological structures, Holt, Rinehart and Winston, 1966.

[23] C. Akerlund - Bistrom, A generalization of the Hutchinson Distance and Applications, AboAkademi Reports on Computer Science & Mathematics, Ser. A. No 167,1995.

E-mail address: [email protected]

Page 69: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

ITERATED FUNCTION SYSTEMS , THE CHAOS GAME AND INVARIANT MEASURES 69

−1 −0.5 0 0.5 1−1

−0.5

0

0.5

1iteration no.5

−1 −0.5 0 0.5 1−1

−0.5

0

0.5

1iteration no.7

−1 −0.5 0 0.5 1−1

−0.5

0

0.5

1iteration no.11

−1 −0.5 0 0.5 1−1

−0.5

0

0.5

1iteration no.15

Figure 1. Some steps on the path (in H(X)) from a square to a dragon

Page 70: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

70 FREDRIK STROMBERG

Figure 2. Comparison of the deterministic method vs. the chaos game

C0C1C2C3C4

0 1192913

237989

Figure 3. Construction of Cantor middle-thirds set

Page 71: 1. Intro · The main application of the theory is generating and passing information about fractals, especially the class of self-similar fractals. A fractal is de ned to be a set

ITERATED FUNCTION SYSTEMS , THE CHAOS GAME AND INVARIANT MEASURES 71

C0.C1.C2.C3.C4.C5.C6.C7.C8.

Figure 4. Correspondence between code space and the Cantormiddle-thirds set , here the addresses of x and y are ωx = 0010 . . . ,and ωy = 100 . . .

Figure 5. Two copies of the Koch Curve