Progress in Probability and Statistics, Vol. 10
Edited by Peter Huber and Murray Rosenblatt
Birkhäuser: Boston · Basel · Stuttgart

Yuri Kifer, Ergodic Theory of Random Transformations
1986

Author:
Yuri Kifer, Institute of Mathematics and Computer Science, Givat Ram, 91904 Jerusalem, Israel
Library of Congress Cataloging in Publication Data
Kifer, Yuri, 1948–. Ergodic theory of random transformations.
(Progress in probability and statistics; vol. 10) Bibliography: p. 1. Stochastic differential equations. 2. Differentiable dynamical systems. 3. Ergodic theory. 4. Transformations (Mathematics). I. Title. II. Series: Progress in probability and statistics; v. 10. QA274.23.K53 1985 519.2 85-18645
CIP-Kurztitelaufnahme der Deutschen Bibliothek
Kifer, Yuri: Ergodic theory of random transformations / Yuri Kifer. – Boston; Basel; Stuttgart: Birkhäuser, 1986.
(Progress in probability and statistics; Vol. 10)
NE: GT
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior permission of the copyright owner.
© 1986 Birkhäuser Boston, Inc.
ISBN 978-1-4684-9177-7 ISBN 978-1-4684-9175-3 (eBook) DOI 10.1007/978-1-4684-9175-3
To my family
Table of Contents

Introduction. 1
I. General analysis of random maps. 7
1.1. Markov chains as compositions of random maps. 7
1.2. Invariant measures and ergodicity. 13
1.3. Characteristic exponents in metric spaces. 26
II. Entropy characteristics of random transformations. 33
2.1. Measure theoretic entropies. 33
2.2. Topological entropy. 67
2.3. Topological pressure. 82
III. Random bundle maps. 88
3.1. Oseledec's theorem and the "non-random" multiplicative ergodic theorem. 88
3.2. Biggest characteristic exponent. 99
3.3. Filtration of invariant subbundles. 115
IV. Further study of invariant subbundles and characteristic exponents. 130
4.1. Continuity of invariant subbundles. 130
4.2. Stability of the biggest exponent. 135
4.3. Exponential growth rates. 140
V. Smooth random transformations. 156
5.1. Random diffeomorphisms. 156
5.2. Stochastic flows. 175
Appendix. 191
A.1. Ergodic decompositions. 191
A.2. Subadditive ergodic theorem. 200
References. 208
Frequently used notations

B(M) – the Borel σ-algebra of M.
C(M,N) – the space of continuous maps from M to N.
C^k-class – continuous together with k derivatives.
Df – the differential of a map f.
Eζ – the expectation of a random variable ζ.
𝔉 – a space of transformations on M.
f – a random transformation with a distribution m.
F – a random bundle map with a distribution n.
nf = f_n ∘ ⋯ ∘ f_1, nF = F_n ∘ ⋯ ∘ F_1; D(nf) means the differential of nf.
h_ρ(f) – the metric entropy of f with respect to an invariant measure ρ.
h(f) – the topological entropy of f.
I – the unit interval.
L^1(M,η) – the space of functions g with ∫_M |g| dη < ∞.
p = m^∞ or p = n^∞.
P{A} – the probability of A.
χ_A – the indicator of a set A, i.e., χ_A(x) = 1 if x ∈ A and χ_A(x) = 0 otherwise.
𝒫(M) – the space of probability measures on M.
ℙ^{m−1} – the (m−1)-dimensional projective space.
ℝ^m – the m-dimensional Euclidean space.
S¹ – the unit circle.
𝔙 – a space of vector bundle maps.
TM – the tangent bundle of a smooth manifold M.
Ω = 𝔉^∞ or Ω = 𝔙^∞.
∎ – the end of the proof.
Statement i.j – i denotes the section and j the number of the statement within the section. A Roman numeral at the beginning (for instance, III.1.2) indicates the chapter.
Introduction

Ergodic theory of dynamical systems, i.e., the qualitative analysis of iterations of a single transformation, is nowadays a well developed theory. In 1945 S. Ulam and J. von Neumann in their short note [44] suggested studying ergodic theorems for the more general situation when one applies in turn different transformations chosen at random. Their program was fulfilled by S. Kakutani [23] in 1951. Both papers considered the case of transformations with a common invariant measure. Recently Ohno [38] noticed that this condition was excessive. Ergodic theorems are just the beginning of ergodic theory. Among further major developments are the notions of entropy and characteristic exponents.
The purpose of this book is the study of the variety of ergodic theoretical properties of evolution processes generated by independent applications of transformations chosen at random from a certain class according to some probability distribution. The book exhibits the first systematic treatment of ergodic theory of random transformations, i.e., an analysis of composed actions of independent random maps. This setup allows a unified approach to many problems of dynamical systems, products of random matrices and stochastic flows generated by stochastic differential equations.
The precise setup is the following. Let 𝔉 be a space of transformations acting on a certain space M. Suppose that 𝔉 possesses some measurable structure so that one can consider 𝔉-valued random variables f, which we shall call random transformations or random maps. Of course, this means that f is an 𝔉-valued measurable function defined on some probability space which, actually, can be identified with 𝔉. A probability measure m on 𝔉 is called the distribution of a random transformation f if for any measurable subset Γ ⊂ 𝔉 the probability P{f ∈ Γ} equals m(Γ). The deterministic case emerges when m is supported by one point.

We assume that the measurable structure on 𝔉 is compatible with a certain measurable structure on M in the sense that all transformations from 𝔉 are measurable and all subsets {f : fx ∈ G} ⊂ 𝔉 are measurable, as well, provided G ⊂ M is measurable and x ∈ M. Then any sequence f_1, f_2, … of independent identically distributed (i.i.d.) random transformations yields a time homogeneous Markov chain X_n on M by means of the compositions X_n = f_n ∘ ⋯ ∘ f_1 X_0, where X_0 is a random variable on M independent of all the f_i, i = 1,2,…. This motivates the question about the conditions which enable us to represent a given Markov chain by means of compositions of independent random transformations of a certain type, i.e., given a family of transition probabilities P(x,·), x ∈ M, of some Markov chain on M, does there exist a probability measure m on a certain space of transformations such that m{f : fx ∈ G} = P(x,G) for all x ∈ M and any measurable subset G ⊂ M? Not much is known today about this problem. Only the case of continuous transformations was settled by Blumenthal and Corson [6]. We discuss their and related results in Section 1.1. The representation we are talking about is not unique, in general. This gives rise to the question of which properties of Markov chains can be studied by means of their representations as compositions of independent random transformations.
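The mechanism X_n = f_n ∘ ⋯ ∘ f_1 X_0 is easy to simulate. The following is a minimal sketch, not from the book: the maps and the two-point distribution m (rotations of M = [0,1) by hypothetical angles) are chosen only for illustration of how i.i.d. random maps generate a Markov chain.

```python
import random

# A minimal sketch (not from the book): m is a hypothetical two-point
# distribution on rotations f_t(x) = (x + t) mod 1 of M = [0, 1).
# Composing i.i.d. maps drawn from m yields a Markov chain X_n.

def draw_map(rng):
    """Draw f with law m: rotation by 1/4 or by an irrational angle (both illustrative)."""
    t = rng.choice([0.25, 0.6180339887])
    return lambda x: (x + t) % 1.0

def chain(x0, n, seed=0):
    """Trajectory X_0, ..., X_n with X_{k+1} = f_{k+1} X_k, f_1, f_2, ... i.i.d."""
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(n):
        f = draw_map(rng)        # f_k is independent of the past
        xs.append(f(xs[-1]))     # apply the freshly drawn map
    return xs

traj = chain(0.0, 1000)
```

Because each f_k is drawn independently of X_0, …, X_k, the distribution of X_{k+1} given the past depends only on X_k, which is exactly the Markov property.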
In Section 1.2 we describe certain results concerning invariant measures and ergodicity. Our exposition here is close to the paper of Ohno [38]. Theorems of the standard ergodic theory of Markov chains, which the reader can find, for instance, in the monographs of Neveu [37] and Rosenblatt [41], are not represented in this book since we restrict our attention to the facts genuinely connected with representations of Markov chains by means of compositions of independent random transformations.
In Section 1.3 we introduce certain characteristic exponents for compositions of independent random maps which were defined previously for the deterministic case in Kifer [27]. Here and in the next chapter Kingman's subadditive ergodic theorem plays an important role.
Chapter II is devoted entirely to the entropy characteristics of random transformations. We define there the notions of topological entropy and pressure and prove their properties along the lines of the deterministic theory (see Walters [46]). Although similar entropies are known in the study of skew-product transformations of dynamical systems (which was indicated to me by F. Ledrappier), they emerge here very naturally as important characteristics of random transformations.
Chapter III deals with a generalization of Oseledec's multiplicative ergodic theorem to the case of random bundle maps which act linearly on fibres of a measurable vector bundle. To explain our results consider the particular case of smooth random maps f acting on a Riemannian manifold M. Then the differential Df acts on the tangent bundle TM of M. One can conclude from Oseledec's multiplicative ergodic theorem (see, for instance, Ruelle [43]) that if x does not belong to some exceptional set, then for almost every choice of the sequence ω = (f_1, …, f_n, …) the limit

lim_{n→∞} (1/n) log ‖Df_n ∘ ⋯ ∘ Df_1 ξ‖ = β(ω,ξ)

exists for every vector ξ from the tangent space T_xM at x, where ‖·‖ is the norm generated by the Riemannian structure. The random variable β(ω,ξ) can take on only certain values called characteristic exponents. Here ω is a point of a probability space Ω which can be identified with the space of sequences f_1, f_2, …, or, which is the same, with the infinite product of the spaces (𝔉,m).
In general, the number β(ω,ξ) depends non-trivially on both ω and ξ. Modifying the method of Furstenberg and Kifer [17] we shall show in Chapter III that under natural assumptions, for any x outside of some exceptional set and each ξ ∈ T_xM, with probability one the limit β(ω,ξ) is not random, i.e., it does not depend on ω. Moreover, for those x there exists a filtration of non-random subspaces V_x^0 ⊃ V_x^1 ⊃ ⋯ and non-random numbers β_x(m) = β_x^0(m) > β_x^1(m) > ⋯ > β_x^{s(x)}(m) such that β(ω,ξ) = β_x^i(m) provided ξ ∈ V_x^i \ V_x^{i+1}. The dependence of V_x^i on x is measurable, i.e., the V^i = {V_x^i} form measurable subbundles which are Df-invariant for m-almost all f. This result specifies which characteristic exponents can actually occur when starting from deterministic initial vectors. The case of invertible random bundle maps was treated in Kifer [28]. Similar results under more restrictive conditions have appeared independently in Carverhill's preprint [10].
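In the linear setting the limit defining β(ω,ξ) can be estimated numerically by renormalizing the vector at each step, which avoids overflow while accumulating the logarithmic growth. A sketch under stated assumptions: the 2×2 matrices below stand in for the differentials Df_i and are purely illustrative, not taken from the book.

```python
import math
import random

# Sketch: estimate beta(omega, xi) = lim (1/n) log ||A_n ... A_1 xi||
# for i.i.d. random 2x2 matrices A_i (illustrative stand-ins for Df_i).

def matvec(a, v):
    return [a[0][0]*v[0] + a[0][1]*v[1], a[1][0]*v[0] + a[1][1]*v[1]]

def top_exponent(matrices, n, seed=0):
    """Average log growth of a deterministic initial vector xi = (1, 0),
    renormalized after every multiplication."""
    rng = random.Random(seed)
    v = [1.0, 0.0]                  # deterministic initial vector xi
    s = 0.0
    for _ in range(n):
        a = rng.choice(matrices)    # i.i.d. choice with law m
        v = matvec(a, v)
        r = math.hypot(v[0], v[1])
        s += math.log(r)            # accumulate log ||A v|| / ||v||
        v = [v[0] / r, v[1] / r]    # renormalize to unit length
    return s / n

mats = [[[2.0, 1.0], [1.0, 1.0]],
        [[1.0, 1.0], [1.0, 2.0]]]   # hypothetical positive matrices
beta = top_exponent(mats, 20000)
```

For such strictly positive matrices the estimate stabilizes at a positive value, in line with the positivity results discussed in Chapter IV.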
In Chapter IV we study certain properties of the biggest exponent β_x(m) and the subbundles V_x^i. We give conditions for stability of β_x(m) under perturbations of m in the weak sense. We consider also the question of positivity of β_x(m), which turns out to be important in certain applications to the theory of Schrödinger operators with random potentials. Here we actually obtain another type of conditions yielding the positivity of the biggest exponents for products of Markov dependent matrices. This question was studied by Virtser [45], Guivarc'h [18], Royer [42] and in a recent paper of Ledrappier [34]. Our approach is similar to Furstenberg's original treatment [16] of the independent random matrix case. We give also conditions which imply the continuity of all subbundles which are invariant with respect to n-almost all bundle maps. Surprisingly, it suffices to impose these conditions only on the transition probability of the Markov chain X_n = f_n ∘ ⋯ ∘ f_1 X_0 in the base space M.
In Chapter V we apply the theory of the previous sections to smooth situations, namely, to the case of random diffeomorphisms and, in particular, to the case of stochastic flows, whose study is now becoming an important subject in the theory of stochastic processes.

In the Appendix we discuss the theorem on ergodic decomposition and Kingman's subadditive ergodic theorem, which we employ several times in the main body of the book.
The connection between different parts of this book can be described as follows. In Chapter I only Section 1.2 is essential for the rest. Chapters II and III are independent. Chapters III, IV and V should be read in their numerical order.

Many results of this book are new and some of them have not yet been published even in the periodical literature. The theory of random transformations is just being created; it has not yet taken its final form and there is still much to be done.

This book is addressed to mathematicians working in probability and (or) ergodic theory and can also be read by graduate students with some background in these areas.
I owe my interest in both probability and ergodic theory to my teachers E. Dynkin and Ya. Sinai. The ideas of H. Furstenberg were decisive for our joint paper [17], which was the basis for my generalization of the multiplicative ergodic theorem presented in Chapter III. I am grateful to M. Brin for a number of useful comments. While visiting the University of Warwick in Summer 1985 during the stochastic analysis symposium I benefited from discussing the content of this book with colleagues. Conversations with P. Walters were especially fruitful since they led to the improvement in the exposition of Chapter II. M. Rosenblatt deserves special credit for initiating Birkhäuser's invitation to write this book. My thanks also go to Ms. D. Sharon for the proficient typing job and to the Birkhäuser Boston, Inc. staff for the efficient cooperation.
Chapter I.

General analysis of random maps.

In this chapter we study basic connections between compositions of independent random transformations and the corresponding Markov chains, together with some applications.
1.1. Markov chains as compositions of random maps.
This section will be rather expository, since its subject stands apart from the main body of this book, where we consider random transformations as something already granted. Besides, not much is known about representations of Markov chains by means of compositions of random transformations. Still it seems proper to discuss this matter at the beginning of this book.
Let P(x,·) be a family of Borel probability measures on a topological space M such that P(x,G) is a Borel function of x ∈ M for any G from the Borel σ-field B(M) (i.e., the minimal σ-field containing all open sets). We would like to pick a probability measure m on the space 𝔉 of Borel maps of M into itself such that

m{f : fx ∈ G} = P(x,G) (1.1)

for any x ∈ M and G ∈ B(M).

We shall view the measures P(x,·) as transition probabilities of a Markov chain X_n, i.e., X_{k+1} has the distribution P(x,·) provided X_k = x. Then the relation (1.1) says that X_n, n = 1,2,… can be constructed by means of compositions of independent random maps f_1, f_2, …, f_n having the distribution m. Indeed, put X_n = f_n ∘ ⋯ ∘ f_1 X_0, where X_0 is an M-valued random variable independent of f_1, f_2, …. Then, clearly, X_n is a Markov chain with transition probabilities

P{X_{n+1} ∈ G | X_n = x} = m{f : fx ∈ G} = P(x,G).
Concerning the existence of m satisfying (1.1) we assert

Theorem 1.1. If M is a Borel subset of a Polish space (i.e., a complete separable metric space) then for any family of Borel probability measures P(x,·) as above one can define a probability measure m on the space of Borel maps of M into itself satisfying (1.1).
Proof. According to §§36–37 of Kuratowski [31] the space M is measurably isomorphic to a Borel subset of the unit interval I = [0,1]. This means that there exists a one-to-one Borel map φ : M → I such that Γ = φ(M) is a Borel subset of I and φ^{-1} : Γ → M is also Borel.

Next, for any point x ∈ M define a probability measure on I by P̃(x,Δ) = P(x, φ^{-1}(Δ ∩ Γ)), where Δ ∈ B(I). For each x ∈ M and ω ∈ I put

z(x,ω) = inf{γ : P̃(x,[0,γ]) ≥ ω}. (1.2)

If ω is fixed then z(·,ω) is a Borel map from M into I. Indeed, {x : z(x,ω) > a} = {x : P̃(x,[0,a]) < ω} = {x : P(x, φ^{-1}([0,a] ∩ Γ)) < ω}, and the last set is Borel since we assume that P(x,G) is a Borel function of x for any G ∈ B(M).
Suppose that ψ : I → M equals φ^{-1} on Γ and maps I \ Γ into some point x_0 ∈ M. Define f(ω) = ψ ∘ z(·,ω); then f(ω) is a Borel map of M into itself for each ω ∈ I. Therefore we obtain a map J from I into the space 𝔉 of Borel maps from M into itself acting by the formula J(ω) = f(ω). The map J induces a measurable structure on 𝔉 by declaring a subset A ⊂ 𝔉 measurable if J^{-1}A is a Borel subset of I. Notice that if z(·,ω_1) = z(·,ω_2), i.e., z(x,ω_1) = z(x,ω_2) for each x ∈ M, then z(·,ω) = z(·,ω_1) for all ω ∈ [ω_1,ω_2]. Hence J^{-1} maps points of 𝔉 onto subintervals (possibly empty or semi-open) of I, and so the points of 𝔉 are measurable.
Let mes denote the Lebesgue measure on I; then

m(A) = mes(J^{-1}A) (1.3)

is a probability measure on 𝔉 defined for any subset A ⊂ 𝔉 such that J^{-1}A ∈ B(I). It is easy to see that mes{ω : z(x,ω) > a} = mes{ω : P̃(x,[0,a]) < ω} = 1 − P̃(x,[0,a]), and so mes{ω : z(x,ω) ∈ [0,a]} = P̃(x,[0,a]). Hence for any Δ ∈ B(I),

mes{ω : z(x,ω) ∈ Δ} = P̃(x,Δ). (1.4)
Therefore

mes{ω : f(ω)x ∈ G} = mes{ω : z(x,ω) ∈ ψ^{-1}G} = (1.5)
= P̃(x, ψ^{-1}G) = P̃(x, Γ ∩ ψ^{-1}G) =
= P̃(x, φG) = P(x,G)

for every G ∈ B(M). This together with (1.3) gives (1.1), completing the proof of Theorem 1.1. ∎
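The map z(x,ω) = inf{γ : P̃(x,[0,γ]) ≥ ω} in (1.2) is the classical inverse transform construction, and it can be sketched numerically. In the code below the kernel on I, given through its distribution function `cdf`, is purely hypothetical; the generalized inverse is computed by bisection, which is legitimate because each cdf(x,·) is non-decreasing.

```python
import random

# Sketch of (1.2): for M = I = [0,1] and a transition kernel P(x,.),
# z(x, w) = inf{g : P(x,[0,g]) >= w} defines, for fixed w, a map of I
# into itself; uniform w then produces a random map whose law m
# satisfies m{f : f x in G} = P(x, G).  The kernel below is illustrative.

def cdf(x, g):
    """Hypothetical distribution function g -> P(x, [0,g]): a mixture of
    the uniform law and a power law whose exponent depends on x."""
    g = max(0.0, min(1.0, g))
    return 0.5 * g + 0.5 * g ** (1.0 + x)

def z(x, w, tol=1e-9):
    """Generalized inverse inf{g : cdf(x,g) >= w}, found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(x, mid) >= w:
            hi = mid
        else:
            lo = mid
    return hi

rng = random.Random(0)
w = rng.random()     # one sample omega ~ Lebesgue measure on [0,1]

def f(x):            # the realized map f(omega) : I -> I
    return z(x, w)
```

By (1.4), the fraction of uniform samples w with z(x,w) ≤ a approximates cdf(x,a), which is exactly the identity (1.1) for sets G = [0,a].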
The idea of first considering random maps from M into I belongs to Blumenthal and Corson [6]. They were interested in representations by means of continuous random maps. Their results require additional assumptions both on M and on the family of probability measures P(x,·), x ∈ M. Denote by 𝒫(M) the space of Borel probability measures on M. Then we have
Theorem 1.2. Let M be a connected and locally connected compact metric space. Suppose that the map M → 𝒫(M) given by x → P(x,·) is continuous with respect to the weak topology on 𝒫(M). If for each x ∈ M the support of P(x,·) is all of M, then there exists a probability measure m on the space C(M,M) of continuous transformations of M satisfying (1.1).
The proof relies upon the following rather tricky topological result, which we formulate here without proof. For details we refer the reader to the original paper of Blumenthal and Corson [6].

Proposition 1.1. Let 𝒫_0(M) and 𝒫_0(I) be the subspaces of 𝒫(M) and 𝒫(I) consisting of those measures whose support is all of M and all of I, respectively. If M satisfies the conditions of Theorem 1.2 then there exist a continuous function φ̃ from 𝒫_0(M) to 𝒫_0(I) and a continuous function ψ from I to M such that ψ(φ̃μ) = μ for all μ in 𝒫_0(M).
Assuming this result to be true, the proof of Theorem 1.2 proceeds as follows. As in (1.2) we define

z(x,ω) = inf{γ : φ̃(P(x,·))([0,γ]) ≥ ω}. (1.6)

It is easy to see that for each fixed ω ∈ I the relation (1.6) gives a continuous map of M into I. Then f(ω) = ψ ∘ z(·,ω) continuously maps M into itself. Besides, as in Theorem 1.1 one obtains

mes{ω : f(ω)x ∈ G} = φ̃(P(x,·))(ψ^{-1}G) = P(x,G).

This completes the proof by the same argument as in Theorem 1.1.
Remark 1.1. Actually, Blumenthal and Corson [6] studied the more general problem of representation for families P(x,G), where x belongs to one space M_1 and G is a subset of another space M_2, by means of random maps from M_1 into M_2. Other results of this kind concern totally disconnected spaces (see [7]).
We shall not discuss here the topological requirements imposed on M. On the other hand, notice that the condition on the support of the measures P(x,·) cannot simply be dropped unless we are ready to sacrifice the continuity of the constructed random map. Indeed, consider, for instance, the case M = I. Then the relation

f(ω)x = inf{γ : P(x,[0,γ]) ≥ ω} (1.7)

does not necessarily define a continuous map of I into itself if for some point x̄ the function g(γ) = P(x̄,[0,γ]) is not strictly increasing. In this case the graph of g(γ) has a flat piece over some interval [γ_1,γ_2) at a level ω̄; then f(ω̄)x̄ = γ_1, while f(ω̄) can map points close to x̄ to any point between γ_1 and γ_2.
More questions arise when one wants to obtain representations by special classes of transformations. In connection with the theory of entropy in Chapter II it seems important to have conditions which enable us to obtain representations by means of transformations preserving the same measure on M (i.e., each f ∈ 𝔉 should leave invariant a fixed measure on M). Other problems concern representations by means of smooth maps, one-to-one maps, homeomorphisms etc., i.e., when 𝔉 is one of these classes of transformations. Actually, I do not know any general conditions yielding such representations except for some trivialities concerning the interval and the important case of stochastic flows generated by stochastic differential equations, which we shall study in Chapter V. Some results can also be obtained if one seeks a measure m satisfying (1.1) with support on some finite dimensional groups of transformations, say, matrix groups.
The uniqueness of such representations can hardly be expected. It is important to understand which properties of the related Markov chain X_n introduced at the beginning of this section do not depend on the representation. The following example shows that different representations of the same family P(x,·) may yield rather different behavior of compositions of independent random maps.
Example 1.1. Let M be the unit circle S¹. Suppose that P(x,·) for each x coincides with the normalized Lebesgue measure on S¹. For any ω ∈ [0,1] define f(ω)x = e^{2πiω}x, where x is considered as a complex number with |x| = 1. Clearly, mes{ω : f(ω)x ∈ G} = P(x,G). Another representation of the same family is given by f̃(ω)x = e^{2πiω}x². Let f_1, f_2, … be independent random transformations having the same distribution as f, and let f̃_1, f̃_2, … be the corresponding objects for f̃. Then the compositions f_n ∘ ⋯ ∘ f_1 preserve distances between points. On the other hand, the compositions f̃_n ∘ ⋯ ∘ f̃_1 locally increase distances exponentially fast in the obvious sense. This difference becomes crucial when one studies entropies and characteristic exponents of random maps.
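The contrast between the two representations is visible numerically. The sketch below (illustrative code, not from the book) composes eight maps of each family, with the same rotation amounts, applied to two nearby points of the circle: the rotations preserve the gap, while the squaring maps double the angular gap at every step, so after n steps it has grown by the factor 2^n.

```python
import cmath
import random

# Example 1.1 numerically: f(w)x = e^{2 pi i w} x (an isometry) versus
# ftilde(w)x = e^{2 pi i w} x^2 (rotation composed with angle doubling).
# Both send any point to a uniform point for uniform w, but compositions
# behave very differently.

def f(w, x):
    return cmath.exp(2j * cmath.pi * w) * x        # rotation: isometry

def ftilde(w, x):
    return cmath.exp(2j * cmath.pi * w) * x * x    # doubles the angle

rng = random.Random(1)
ws = [rng.random() for _ in range(8)]              # the same w_1, ..., w_8

x, y = cmath.exp(0.0j), cmath.exp(1e-6j)           # two nearby points, |x| = |y| = 1
for w in ws:                                       # f_8 o ... o f_1
    x, y = f(w, x), f(w, y)
rot_gap = abs(x - y)                               # gap is preserved (about 1e-6)

x, y = cmath.exp(0.0j), cmath.exp(1e-6j)
for w in ws:                                       # ftilde_8 o ... o ftilde_1
    x, y = ftilde(w, x), ftilde(w, y)
sq_gap = abs(x - y)                                # gap grows like 2^8 * 1e-6
```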
1.2. Invariant measures and ergodicity.
This section exhibits basic connections between compositions of independent random maps and deterministic ergodic theory which lay the foundation for the subsequent exposition.
Let, again, 𝔉 be a set of transformations acting on a space M. We always assume that both 𝔉 and M possess some measurable structures (i.e., some fixed σ-fields of subsets called measurable sets) such that the map 𝔉 × M → M defined by (f,x) → fx is measurable with respect to the product measurable structure of 𝔉 × M. Denote by m some probability measure on 𝔉 which makes (𝔉,m) a probability space. Introduce a new probability space (Ω,p) = (𝔉^∞, m^∞) as the infinite product of copies of (𝔉,m). The points of Ω are the sequences ω = (f_1, f_2, …), f_i ∈ 𝔉, and the measure p is generated by the finite dimensional probabilities

p{ω : ω(1) ∈ Γ_1, …, ω(n) ∈ Γ_n} = m(Γ_1) ⋯ m(Γ_n),

where ω(i) = f_i denotes the i-th term of the sequence ω.

Define the shift operator ϑ on Ω by

(ϑω)(i) = ω(i+1), i = 1,2,…. (2.1)

Introduce a sequence of 𝔉-valued random variables f_1, f_2, … on Ω in the following way:

f_i(ω) = ω(i), i = 1,2,…. (2.2)

Then, clearly, f_1, f_2, … are independent and have the same distribution m. We can also write

f_i(ϑω) = f_{i+1}(ω). (2.3)
Let us remark at once that none of the results of this book depends on the specific representation (2.2) but only on the distribution m. We may consider an alternative approach when a sequence f_1, f_2, … of independent 𝔉-valued random variables defined on some probability space (Λ, p̂) is already given. All of them have the same distribution m, i.e., p̂{f_i ∈ Φ} = m(Φ) for any measurable Φ ⊂ 𝔉 and i = 1,2,…. Now we can define a map φ of Λ into the space Ω of sequences ω = (f_1, f_2, …) acting by the formula φ(λ) = ω = (ω(1), ω(2), …) with ω(i) = f_i(λ), i = 1,2,…. We shall say that Γ ⊂ Ω is measurable if φ^{-1}Γ ⊂ Λ is measurable. The probability p can then be introduced on Ω by p(Γ) = p̂(φ^{-1}Γ). Moreover, we can define f_i(ω) = f_i(φ^{-1}ω). This definition is correct and we obtain again (2.2) and (2.3).
Next we define a skew product transformation T acting on M × Ω by

T(x,ω) = (f_1(ω)x, ϑω). (2.4)

Denote nf(ω) = f_n(ω) ∘ ⋯ ∘ f_1(ω); then in view of (2.1)–(2.4),

T^n(x,ω) = (nf(ω)x, ϑ^n ω). (2.5)

If g is a function on M × Ω and f : M → M, we also write g∘ϑ, g∘T and g∘f to denote the functions g∘ϑ(x,ω) = g(x,ϑω), g∘T(x,ω) = g(T(x,ω)) and g∘f(x,ω) = g(fx,ω).
As we already mentioned in the previous section, X_n = nf X_0, n = 1,2,… forms a Markov chain provided X_0 is an M-valued random variable independent of all the random maps f_1, f_2, …. The transition probability P(x,G) of X_n can be expressed by the formula

P(x,G) = m{f : fx ∈ G} = ∫ χ_G(fx) dm(f), (2.6)

where χ_G is the indicator of the set G. Before proceeding any further we must prove the following (cf. Neveu [37], Proposition III.2.1).

Lemma 2.1. For any x ∈ M the relation (2.6) defines a probability measure, and for each measurable G ⊂ M the function P(x,G) is measurable in x.
Proof. Denote by F the map 𝔉 × M → M given by F(f,x) = fx. Recall that F is assumed to be measurable. If G is a measurable subset of M then the set

{f : fx ∈ G} = (F^{-1}G)_x

is also measurable as a section of the measurable set F^{-1}G in the product 𝔉 × M. Hence (2.6) defines P(x,G) for any measurable G. The σ-additivity of P(x,·) follows from the σ-additivity of m.

It remains to show the measurability of P(x,G) in x. For any measurable Γ ⊂ 𝔉 × M denote by Γ_x its x-section, i.e., Γ_x = {f : (f,x) ∈ Γ}. Then

P(x,G) = m((F^{-1}G)_x). (2.7)

Consider the class Ψ of all measurable sets Γ ⊂ 𝔉 × M such that m(Γ_x) depends measurably on x. This class contains all sets of the form Φ × Q since m{(Φ × Q)_x} = m(Φ)χ_Q(x). Moreover, Ψ is closed under complements and finite disjoint unions, since m((𝔉 × M \ Γ)_x) = 1 − m(Γ_x) and the measure of a finite disjoint union of x-sections is the sum of their measures. Thus the class Ψ contains the whole algebra generated by all product sets Φ × Q. Using monotone limits one concludes from here that Ψ contains the minimal σ-field generated by all product sets, and so Ψ coincides with the σ-field of measurable sets in the product 𝔉 × M. By (2.7) and the measurability of F this completes the proof. ∎
Let P be the transition operator of the Markov chain X_n acting on bounded measurable functions by the formula

Pg(x) = ∫_M g(y) P(x,dy) = ∫ g(fx) dm(f). (2.8)

The second part of Lemma 2.1 actually claims that the operator P sends measurable functions into measurable functions. Indeed, since P(x,G) = Pχ_G(x), Lemma 2.1 establishes this fact for indicators of measurable sets. Taking limits of linear combinations of indicators we can extend the result to all measurable functions. This also follows from Fubini's theorem, since if g is a measurable function on M then g̃(f,x) = g(fx) = g∘F(f,x) is a measurable function on 𝔉 × M and so

Pg(x) = ∫ g∘F(f,x) dm(f).
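When m is supported on finitely many maps, say m = (1/k) Σ_i δ_{f_i}, the integral in (2.8) is simply a finite average, which makes the operator P easy to compute. A sketch with three hypothetical maps of the interval:

```python
# Sketch of (2.8) for a distribution m uniform on three hypothetical
# maps of M = [0,1]: Pg(x) = (1/3) * sum of g(f_i(x)).

maps = [lambda x: (x + 0.25) % 1.0,   # rotation (illustrative)
        lambda x: x * x,              # squaring (illustrative)
        lambda x: 1.0 - x]            # reflection (illustrative)

def P(g):
    """Return the function Pg: average of g over one random-map step."""
    return lambda x: sum(g(f(x)) for f in maps) / len(maps)

g = lambda x: x
Pg = P(g)      # Pg(x) = ((x + 0.25 mod 1) + x^2 + (1 - x)) / 3
PPg = P(Pg)    # iterating P gives expectations over two steps
```

Note that P is linear, positive and fixes constants, the properties used repeatedly below.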
The adjoint operator P* acts on measures in the following way:

P*ρ(G) = ∫ P(x,G) dρ(x). (2.9)

A measure ρ ∈ 𝒫(M) is called P*-invariant if P*ρ = ρ. P*-invariant measures will be important in our study. Usually, in ergodic theory one takes an invariant measure for granted. Still, if we want to be sure that at least one such measure exists, we must require more than just measurability.
Lemma 2.2. Suppose that M is a metric space and m is concentrated on the set of continuous maps of M into itself. Then the operator P takes bounded continuous functions into bounded continuous functions. If, in addition, M is compact then there exists at least one P*-invariant probability measure on M.

Proof. Take a bounded continuous function g on M, fix ε > 0, and consider Φ_n^ε(x) = {f ∈ 𝔉 : f continuous and |g(fx) − g(fy)| < ε/2 for all y satisfying dist(x,y) < 1/n}. Since F : (f,x) → fx is measurable, it is an easy exercise to check that Φ_n^ε(x) is a measurable set. If dist(x,y) < 1/n we can write

|Pg(y) − Pg(x)| ≤ ∫_{Φ_n^ε(x)} |g(fy) − g(fx)| dm(f) + 2 m(𝔉 \ Φ_n^ε(x)) sup|g|
≤ ε/2 + 2 m(𝔉 \ Φ_n^ε(x)) sup|g|,

and the last term tends to zero as n → ∞ since Φ_n^ε(x) ↑ 𝔉 (mod m) in view of the continuity of g and of m-almost all f. Hence for some n(x) and any n ≥ n(x) one has |Pg(y) − Pg(x)| < ε, proving the continuity of Pg.
To get a P*-invariant measure consider an arbitrary measure η ∈ 𝒫(M) and take η_n = (1/n) Σ_{k=0}^{n−1} (P*)^k η. If M is compact then the space 𝒫(M) is also compact (see, for instance, Rosenblatt [41]) and so the sequence η_n has weakly converging subsequences. But if η_{n_i} → ρ weakly, then

P*η_{n_i} = η_{n_i} + (1/n_i)((P*)^{n_i} η − η) → ρ weakly.

On the other hand, for any continuous g,

∫ g dP*η_{n_i} = ∫ Pg dη_{n_i} → ∫ Pg dρ = ∫ g dP*ρ

since Pg is also continuous. This means that P*η_{n_i} → P*ρ weakly, and so P*ρ = ρ, i.e., ρ is P*-invariant. ∎
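The averaging step above can be watched numerically: with η = δ_{x_0} one has ∫ g d(P*)^k η = P^k g(x_0) by duality, so the Cesàro means of P^k g(x_0) are the integrals of g against η_n. The sketch below uses two illustrative circle rotations (for which Lebesgue measure is the P*-invariant limit), not an example from the book.

```python
import math

# Sketch of the Krylov-Bogolyubov averaging in Lemma 2.2: m is uniform
# on two hypothetical rotations of M = [0,1); the Cesaro means
# (1/n) sum_{k<n} P^k g(x0) equal the integrals of g against
# eta_n = (1/n) sum_{k<n} (P*)^k delta_{x0}.

maps = [lambda x: (x + 0.25) % 1.0,
        lambda x: (x + 0.5) % 1.0]

def P(g):
    return lambda x: sum(g(f(x)) for f in maps) / len(maps)

def cesaro(g, x0, n):
    """(1/n) sum_{k=0}^{n-1} P^k g(x0).  (Naive: evaluating P^k g costs
    2^k map applications, so keep n small in this sketch.)"""
    total, gk = 0.0, g
    for _ in range(n):
        total += gk(x0)
        gk = P(gk)          # gk becomes P^{k+1} g
    return total / n

g = lambda x: math.cos(2 * math.pi * x)
avg = cesaro(g, 0.1, 12)    # approaches the Lebesgue integral of g, i.e. 0
```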
Remark 2.1. The assumption that P takes continuous functions into continuous functions is, of course, the same as the condition of Theorem 1.2 that the map x → P(x,·) of M into 𝒫(M) is continuous when 𝒫(M) is considered with the topology of weak convergence.

We shall say that μ ∈ 𝒫(M × Ω) is T-invariant if

∫ g(x,ω) dμ(x,ω) = ∫ g(T(x,ω)) dμ(x,ω) (2.10)

for any bounded measurable function g on M × Ω.
Lemma 2.3. (Ohno [38]) A probability measure ρ is P*-invariant if and only if μ = ρ × p is T-invariant.

Proof. For any bounded measurable function g on M × Ω put

ḡ(x) = ∫ g(x,ω) dp(ω);

then by (2.4), (2.8) and (2.9),

∫ ḡ(x) dP*ρ(x) = ∫ Pḡ(x) dρ(x) =
= ∫∫∫ g(fx,ω) dp(ω) dm(f) dρ(x) =
= ∫∫ g(f_1(ω′)x, ϑω′) dp(ω′) dρ(x) =
= ∫ g(T(x,ω′)) d(ρ × p)(x,ω′),

where, for given f and ω, ω′ is the sequence (f, ω(1), ω(2), …). Hence P*ρ = ρ if and only if μ = ρ × p is T-invariant. ∎
Next we shall discuss the ergodicity of ρ and ρ × p. We shall say that a bounded measurable function g is (P,ρ)-invariant if Pg = g ρ-almost surely (ρ-a.s.). A P*-invariant probability measure ρ ∈ 𝒫(M) is called ergodic if any (P,ρ)-invariant function is ρ-a.s. a constant. Similarly, we shall call a T-invariant measure μ ∈ 𝒫(M × Ω) ergodic if for any bounded measurable function h on M × Ω the relation h∘T = h μ-a.s. implies h = const μ-a.s. Usually one defines ergodicity through invariant subsets, which are required to have full or zero measure. First, we shall check that these definitions are the same. This is obvious concerning the deterministic transformation T, and we leave the proof to the reader. As to (P,ρ)-invariance we claim

Lemma 2.4. Call a set A ⊂ M (P,ρ)-invariant if χ_A is a (P,ρ)-invariant function. Then the following two conditions are equivalent:

(i) a P*-invariant measure ρ ∈ 𝒫(M) is ergodic;

(ii) any (P,ρ)-invariant set A ⊂ M has ρ-measure equal to zero or one.
Proof. The (i) ⟹ (ii) part is evident. To prove the (ii) ⟹ (i) part we shall borrow an idea from Rosenblatt [41], pp. 92–93. We shall show that if g is a (P,ρ)-invariant function then {x : g(x) > a} is a (P,ρ)-invariant set for each real number a, and so if each of these sets has ρ-measure equal to zero or one then g is a constant ρ-a.s. To do this take a (P,ρ)-invariant function g. Then |g| is also (P,ρ)-invariant. Indeed, |g| = |Pg| ≤ P|g|. But ∫(P|g| − |g|) dρ = 0 since ρ is P*-invariant. Hence P|g| = |g| ρ-a.s. But then max(0,g) = ½(g + |g|) is (P,ρ)-invariant. Furthermore, if g_1, g_2 are (P,ρ)-invariant then

max(g_1,g_2) = g_1 + max(0, g_2 − g_1) and min(g_1,g_2) = −max(−g_1,−g_2)

are (P,ρ)-invariant. Of course, 1 is invariant. Then

min(n max(0, g − a), 1), n = 1,2,…

is a sequence of (P,ρ)-invariant functions. The limit as n → ∞ of this sequence is the indicator function of the set {x : g(x) > a}, which is hence invariant. ∎
The following result was initially proved by Kakutani [23] for the case of transformations with a common invariant measure, and then by Ohno in the general case. Our proof is different from theirs.

Theorem 2.1. A measure ρ ∈ 𝒫(M) is ergodic if and only if ρ × p is ergodic.

Proof. The "if" part is simple. Indeed, suppose that Pg = g ρ-a.s. and g ≠ const ρ-a.s. Then there exists a number C such that the set G = {x : g(x) ≥ C} has ρ-measure different from 0 and 1. As we have seen in the proof of Lemma 2.4, χ_G(x) = Pχ_G(x) = ∫ χ_G(fx) dm(f) for ρ-almost all x. This relation says that if χ_G(x) equals 0 or 1 then χ_G(fx) equals 0 or 1, respectively, for m-almost all f. Therefore χ_G(T(x,ω)) = χ_G(ω(1)x) = χ_G(x) ρ × p-a.s. Now if ρ × p is ergodic then χ_G = const ρ-a.s., which contradicts our assumption that 0 < ρ(G) < 1. Hence g = const ρ-a.s., and so ρ is ergodic.
To prove the "only if" part suppose that ρ is ergodic and

h∘T = h ρ × p-a.s. (2.11)

where h is a bounded measurable function on M × Ω. By (2.2) the function h(x,ω), considered as a random variable in ω ∈ Ω, can be written as

h(x,ω) = h(x,(f_1, f_2, …)). (2.12)

Thus (2.4) and (2.11) imply

h(x,(f_1, f_2, …)) = h(f_m ∘ ⋯ ∘ f_1 x, (f_{m+1}, f_{m+2}, …)) (2.13)

ρ × p-a.s. for any m = 1,2,…. Put h_0(x) = E h(x,(f_1,…)), where E is the expectation on the probability space (Ω,p). Since f_1, f_2, … are independent, then by (2.13),

h_0(x) = E h(f_1 x, (f_2, f_3, …)) = E(E(h(f_1 x, (f_2, f_3, …)) | f_1)) = ∫ h_0(fx) dm(f) = Ph_0(x) ρ-a.s.,

where E(·|·) is the conditional expectation. Hence h_0(x) = C = const ρ-a.s. Similarly, conditioning on the first m random maps one obtains

E(h(x,(f_1, f_2, …)) | f_1, …, f_m) = C (2.14)

ρ × p-a.s. Here, as usual, the conditional expectation E(· | f_1, …, f_m) means the conditional expectation with respect to the σ-field 𝔊_m generated by the sets of the form {ω : f_1(ω) ∈ Γ_1, …, f_m(ω) ∈ Γ_m}. Let 𝔊 be the minimal σ-field containing all the 𝔊_m; then (2.14) means that for ρ-almost all x the function h(x,(f_1, f_2, …)) depends only on the tail σ-field 𝔊^∞ = ∩_m σ(f_{m+1}, f_{m+2}, …). Since f_1, f_2, … are independent, then by the zero-one law (see, for instance, Neveu [37]) the σ-field 𝔊^∞ is trivial. Therefore h(x,(f_1, f_2, …)) = C ρ × p-a.s. This is true for any bounded measurable function h satisfying (2.11), and so ρ × p is ergodic. ∎
Remark 2.2. The final arguments of the above proof imply
also that if h is a function on Ω alone, i.e., independent of x, and
h ∘ ϑ = h, then h = const p-a.s., since in this case h must depend
only on the tail σ-field 𝓕^∞.
Corollary 2.1. Let η ∈ P(M) be a P*-invariant but not necessarily ergodic measure and h be a measurable function satisfying
(2.11). If η can be represented as an integral
η = ∫ ρ dα(ρ)   (2.15)
over the space of P*-invariant ergodic measures then
h(x,ω) = ∫ h(x,ω′)dp(ω′)   η × p-a.s.   (2.16)
Proof. The set A_h = {(x,ω) : h(x,ω) ≠ ∫ h(x,ω′)dp(ω′)} is measurable,
and so by the "only if" part of Theorem 2.1 one concludes in view
of (2.11) that ρ × p(A_h) = 0 for any ergodic ρ. Hence
η × p(A_h) = ∫ ρ × p(A_h) dα(ρ) = 0. ∎
A representation of the form (2.15) is called an ergodic decomposition of η. This question was studied in a number of papers. Still,
I do not know any readily available reference where the result is
proved in a form convenient for our purposes. For this reason we shall discuss this problem in the Appendix. We shall prove
Proposition 2.1. The representation (2.15) is always possible
provided η is a Borel P*-invariant measure on a Borel subset M of
a Polish space considered with its Borel measurable structure.
Remark 2.3. The above result does not actually use the
topology. Thus it suffices to assume the existence of a one-to-one map, measurable together with its inverse (i.e., a measurable isomorphism), between M and a Borel subset of some Polish space. If
we are interested in a representation of just one measure η then
this isomorphism need only hold up to some set of η-measure zero. In
this case (M,η) is called a Lebesgue space (see Rohlin [40]).
Theorem 2.1 and Corollary 2.1 are interesting mainly for their
consequences for ergodic theorems.
Theorem 2.2. (Random subadditive ergodic theorem). Let
η ∈ P(M) be a P*-invariant measure, and h_n, n = 1,2,... be a
sequence of measurable functions on M × Ω satisfying the following conditions:
a) integrability: h₁⁺ ≡ max(h₁,0) ∈ L¹(M × Ω, η × p) (i.e.,
∫ h₁⁺ dη × p < ∞);
b) subadditivity: h_{n+m} ≤ h_m + h_n ∘ Tᵐ, η × p-a.s.
Then there exists a measurable function h on M × Ω such that
h⁺ ∈ L¹(M × Ω, η × p),
h ∘ T = h   η × p-a.s.,   (2.17)
lim_{n→∞} (1/n) h_n = h   η × p-a.s.   (2.18)
and
lim_{n→∞} (1/n) ∫ h_n dη × p = inf_n (1/n) ∫ h_n dη × p = ∫ h dη × p.   (2.19)
If all h_n, n = 1,2,... are independent of x then
h ≡ const p-a.s. If the conditions of Corollary 2.1 are satisfied
then h depends only on x and so h(fx) = h(x), η × m-a.s. In particular, if η is ergodic then h ≡ const η-a.s.
The first part of this theorem, i.e., the existence of h satisfying
(2.17)–(2.19), is a version of Kingman's subadditive ergodic
theorem. For the reader's convenience we shall prove it in the Appendix. Employing, in addition, Theorem 2.1, Remark 2.1 and Corollary
2.1 we obtain the remaining assertions.
Corollary 2.2. (Random ergodic theorem) Let η ∈ P(M) be
P*-invariant, and h be a measurable function such that
h⁺ ∈ L¹(M × Ω, η × p). Then there exists a measurable function ħ on M × Ω such that ħ ∘ T = ħ, η × p-a.s.,
∫ ħ dη × p = ∫ h dη × p   (2.20)
and
lim_{n→∞} (1/n) Σ_{k=0}^{n−1} h ∘ Tᵏ = ħ   η × p-a.s.   (2.21)
If η has an ergodic decomposition then ħ is a function of x only.
In particular, if η is ergodic then ħ = ∫ h dη × p.
This result follows immediately from Theorem 2.2 if we put
h_n = Σ_{k=0}^{n−1} h ∘ Tᵏ. Then the inequality in condition b) becomes an equality.
Remark 2.4. Corollary 2.2 was proved by Kakutani [23] in the
case when all f ∈ supp m preserve the same measure on M. Ohno
[38] noticed that this condition was too strong. Neither Kakutani nor
Ohno pointed out that under mild assumptions ħ is
independent of ω, i.e., it is a function of x only.
The following is a version of the ergodic theorem for stationary
Markov chains.
Corollary 2.3. Let η ∈ P(M) be P*-invariant and h⁺ ∈ L¹(M,η).
Suppose that the conditions of Corollary 2.1 are satisfied and X₀ is
an M-valued random variable having the distribution η and
independent of all f₁,f₂,.... If X_n = ⁿf X₀, n = 1,2,..., then with
probability one
lim_{n→∞} (1/n) Σ_{k=0}^{n−1} h(X_k) = ħ(X₀)   (2.22)
where ħ is the same as in Corollary 2.2.
Proof. Taking X₀(x) = x we can realize X₀ on the probability
space (M,η). Then the sequence X₀,f₁,f₂,... can be considered on
(M × Ω, η × p). In this interpretation the assertion (2.22), applied to
a function h depending only on x, is equivalent to (2.21). ∎
Remark 2.5. Ohno [38] also studied some mixing properties
of random transformations.
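The convergence (2.22) can be checked numerically in a simple special case. The sketch below is an illustration of ours, not part of the text: M is the unit circle [0,1), each fᵢ is a random rotation x → x + u mod 1 with u uniform on [0,1), so that Lebesgue measure is P*-invariant and ergodic, and the time average of h(x) = cos 2πx along the chain X_n should approach the space average ∫ h dη = 0.

```python
import math, random

random.seed(0)

def h(x):
    # the observable whose time average we follow
    return math.cos(2 * math.pi * x)

n = 50_000
x = random.random()                   # X0 distributed according to Lebesgue measure
s = 0.0
for _ in range(n):
    s += h(x)
    x = (x + random.random()) % 1.0   # apply a random rotation f_k

time_avg = s / n
print(time_avg)   # should be close to the space average 0
```

With 50,000 steps the fluctuation of the time average is of order n^{−1/2} ≈ 0.004, so the printed value is small.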
1.3 Characteristic exponents in metric spaces.
In this section we assume that m is a probability measure on
the space 𝒦 of continuous maps of a metric space (M,d) into itself.
Let f₁,f₂,... be a sequence of independent m-distributed 𝒦-valued
random variables on the space (Ω,p) introduced in the previous
section. Define the following family of metrics
d_n^ω(x,y) = max_{0≤k≤n−1} d(ᵏf(ω)x, ᵏf(ω)y)   (3.1)
where, again, ᵏf = f_k ∘ ... ∘ f₁ and ⁰f ≡ id is the identity map. Suppose that M has no isolated points; then all sets
B_δⁿ(x,ω) = {y ∈ M \ x : d_n^ω(x,y) ≤ δ}
are non-empty for any δ > 0. Denote
A_δⁿ(x,ω) = sup_{y ∈ B_δⁿ(x,ω)} d(ⁿf(ω)x, ⁿf(ω)y)/d(x,y)   (3.2)
and
A_δ(x,ω) = A_δ¹(x,ω).   (3.3)
The following is a "random" version of Theorem 1 from Kifer
[27].
Theorem 3.1. Suppose that η ∈ P(M) is a P*-invariant measure in the sense of the previous section satisfying
∫ log⁺ A_δ(x,ω) dη(x)dp(ω) < ∞.   (3.4)
Then for η × p-almost all (x,ω) there exists
λ_δ(x) = lim_{n→∞} (1/n) log A_δⁿ(x,ω).   (3.5)
Under the conditions of Corollary 2.1 the function λ_δ(x) is non-random, i.e., it is independent of ω. Furthermore
∫ λ_δ(x)dη(x) = lim_{n→∞} (1/n) ∫ log A_δⁿ(x,ω) dη × p = inf_n (1/n) ∫ log A_δⁿ(x,ω) dη × p   (3.6)
and so if η is ergodic then λ_δ(x) is equal to a constant η-a.s.
If (3.4) holds for all δ ∈ (0,δ₀) with some δ₀ > 0 then there
exists
λ(x) = lim_{δ→0} λ_δ(x)   η-a.s.   (3.7)
Proof. Notice that B_δ^{n+m}(x,ω) ⊂ B_δⁿ(x,ω) and
B_δ^{n+m}(x,ω) ⊂ (ⁿf(ω))⁻¹ B_δᵐ(Tⁿ(x,ω)) where T is defined by (2.4).
Since
d(ⁿ⁺ᵐf(ω)x, ⁿ⁺ᵐf(ω)y)/d(x,y) = [d(ⁿf(ω)x, ⁿf(ω)y)/d(x,y)] · [d(ⁿ⁺ᵐf(ω)x, ᵐf(ϑⁿω)z)/d(ⁿf(ω)x, z)]
with z = ⁿf(ω)y, then
A_δ^{n+m}(x,ω) ≤ A_δⁿ(x,ω) · A_δᵐ(Tⁿ(x,ω))   (3.8)
where ϑ is given by (2.3). Therefore log A_δⁿ(x,ω) satisfies the subadditivity condition of Theorem 2.2. The integrability condition also
holds in view of (3.4). Hence the application of Theorem 2.2 yields
(3.5) and (3.6). Clearly, A_δⁿ(x,ω) decreases when δ ↓ 0 and so does
λ_δ(x). Hence the limit (3.7) exists as well. ∎
Remark 3.1. We call λ(x) the maximal characteristic
exponent at x. The reason for this name will be explained by Theorem
3.3 together with the results of Chapter III.
Remark 3.2. The assumption (3.4) holds if, for instance, f₁(ω)
p-a.s. satisfies a Lipschitz condition with a constant K(ω) such
that ∫ log⁺ K(ω)dp(ω) < ∞.
Similar quantities can be introduced for an f-invariant set G,
which means that fG ⊂ G m-a.s. Define
B_δⁿ(G,ω) = {y ∈ M \ G : max_{0≤k≤n−1} d(ᵏf(ω)y, G) ≤ δ}   (3.9)
where d(x,G) = inf_{y∈G} d(x,y). Set
A_δⁿ(G,ω) = sup_{y ∈ B_δⁿ(G,ω)} d(ⁿf(ω)y, G)/d(y,G).   (3.10)
Theorem 3.2. Let G be an f-invariant set and
∫ log⁺ A_δ¹(G,ω)dp(ω) < ∞.   (3.11)
Then there exists a non-random limit
Λ_δ(G) = lim_{n→∞} (1/n) log A_δⁿ(G,ω)   p-a.s.   (3.12)
If (3.11) is true for all δ small enough then there exists
Λ(G) = lim_{δ→0} Λ_δ(G).   (3.13)
Proof. In the same way as in (3.8),
A_δ^{n+m}(G,ω) ≤ A_δⁿ(G,ω) · A_δᵐ(G,ϑⁿω).
This inequality together with (3.11) says that the sequence
log A_δⁿ(G,ω), n = 1,2,... satisfies the conditions of Theorem 2.2,
which implies (3.12). Since this sequence depends only on ω, the
limit equals a constant p-a.s. Clearly, A_δⁿ(G,ω) decreases when
δ ↓ 0 and so does Λ_δ(G). This implies (3.13). ∎
Remark 3.3. The Lipschitz condition of Remark 3.2 yields
(3.11) as well.
The number Λ(G) is connected with the stability properties of
an f-invariant set G. We shall say that G is stable if for each ε > 0
there is δ > 0 such that y ∈ B_εⁿ(G,ω) p-a.s. for all n = 1,2,... provided d(y,G) ≤ δ. If, in addition, d(ⁿf(ω)y,G) → 0 p-a.s. as n → ∞,
then we shall call G asymptotically stable.
One obtains immediately from the definitions
Corollary 3.1. If Λ(G) < 0 and the f-invariant set G is stable
then it is asymptotically stable.
Proof. Since Λ(G) < 0 there exists ε > 0 such that
Λ_ε(G) ≤ ½Λ(G) < 0. For this ε choose δ as in the definition of stability. Then B_εⁿ(G,ω) ⊃ B_δ(G) ≡ {y : d(y,G) ≤ δ}, and for any
y ∈ B_δ(G) and n ≥ n₀(ω),
d(ⁿf(ω)y, G) ≤ A_εⁿ(G,ω) d(y,G) ≤ exp(½Λ(G)n) d(y,G) → 0 as n → ∞. ∎
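Corollary 3.1 can be illustrated numerically with the simplest possible f-invariant set, G = {0}, for random linear maps of the line (a toy choice of ours, not from the text): f(x) = ax with a = 1/2 or a = 6/5 equally likely. Individual maps may expand, but Λ(G) = E log a = ½(log ½ + log 6/5) ≈ −0.255 < 0, so (1/n) log d(ⁿf(ω)y, G) approaches Λ(G) almost surely and orbits converge to G.

```python
import math, random

random.seed(1)

n = 10_000
y = 0.3                    # starting distance from the invariant set G = {0}
log_d = math.log(abs(y))   # track log d(nf(ω)y, G) to avoid floating underflow
for _ in range(n):
    a = 0.5 if random.random() < 0.5 else 1.2
    log_d += math.log(a)   # f(x) = a*x multiplies the distance to 0 by a

rate = (log_d - math.log(abs(y))) / n
expected = 0.5 * (math.log(0.5) + math.log(1.2))   # E log a ≈ -0.255
print(rate, expected)
```

Tracking the logarithm of the distance is essential here: after 10,000 contracting steps the distance itself underflows to zero in floating point.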
Next we are going to compute λ(x) from (3.7) in the smooth
case. Namely, let M be a compact Riemannian manifold, and m be
a probability measure on the space of smooth maps of M into itself.
Consider a sequence f₁,f₂,... of independent smooth maps having
the distribution m. Denote by Df the differential of a map f.
Introduce the norms
‖Df‖_x = sup_{0≠ξ∈T_xM} ‖Dfξ‖ / ‖ξ‖   (3.15)
where T_xM is the tangent space at x and we suppose that some
Riemannian norm of vectors is already chosen. Now we can state
Theorem 3.3. Suppose that supp m is compact in the C¹ topology
and ρ ∈ P(M) is a P*-invariant measure. Then
λ(x) = lim_{n→∞} (1/n) log ‖Dⁿf(ω)‖_x   ρ × p-a.s.   (3.16)
where λ(x) is defined by (3.7).
Warning: Dⁿf is the differential of ⁿf and not the n-th
differential of f.
Proof. It is easy to see that
A_δⁿ(x,ω) ≥ ‖Dⁿf(ω)‖_x.   (3.17)
Indeed, let ξ ∈ T_xM, ‖ξ‖ = 1 and ‖Dⁿf(ω)ξ‖ = ‖Dⁿf(ω)‖_x. If Exp_x : T_xM → M is the exponential map then, clearly,
lim_{u→0} d(ⁿf(ω)x, ⁿf(ω)Exp_x(uξ)) / d(x, Exp_x(uξ)) = ‖Dⁿf(ω)ξ‖,
which implies (3.17).
Since both M and supp m are compact, there exists a non-random function α_n(δ) > 0 such that α_n(δ) → 0 as δ → 0 and for
any x ∈ M, n > 0 and y ∈ B_δⁿ(x,ω),
‖Dⁿf(ω)‖_y ≤ e^{α_n(δ)} ‖Dⁿf(ω)‖_x.   (3.18)
Fixing n and δ > 0 one can find δ̃ > 0 such that if y ∈ B_δ̃ⁿ(x,ω) then
y = Exp_x(ρξ) for some ξ ∈ T_xM, ‖ξ‖ = 1, 0 < ρ ≤ δ̃, and Exp_x(uξ) ∈ B_δⁿ(x,ω)
for all u ∈ [0,ρ]. Hence by (3.18),
d(ⁿf(ω)x, ⁿf(ω)y) ≤ ∫₀^ρ ‖Dⁿf(ω)‖_{Exp_x(uξ)} du ≤ ρ e^{α_n(δ)} ‖Dⁿf(ω)‖_x.
Thus
A_δ̃ⁿ(x,ω) ≤ e^{α_n(δ)} ‖Dⁿf(ω)‖_x.   (3.19)
Recall that δ̃ may depend on n and so (3.19) does not imply
the desired result directly. But since the sequence log A_δⁿ(x,ω) is
subadditive, by (2.19), (3.5) and (3.7),
∫ λ(x)dρ(x) ≤ (1/n) ∫ log A_δ̃ⁿ(x,ω) dρ(x)dp(ω) ≤ (1/n) α_n(δ) + (1/n) ∫ log ‖Dⁿf(ω)‖_x dρ(x)dp(ω).
Since α_n(δ) → 0 as δ → 0, one has
∫ λ(x)dρ(x) ≤ (1/n) ∫ log ‖Dⁿf(ω)‖_x dρ(x)dp(ω).   (3.20)
On the other hand, by (3.17),
λ(x) ≥ lim sup_{n→∞} (1/n) log ‖Dⁿf(ω)‖_x   ρ × p-a.s.   (3.21)
It is easy to see that (3.20) and (3.21) yield (3.16). ∎
Remark 3.4. The assumption that supp m is compact can
be relaxed to some integrability condition on log⁺‖Df₁(ω)‖_x and
the modulus of continuity of Df₁(ω).
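In dimension one, Dⁿf(ω) at x is, by the chain rule, the product of the derivatives of the individual maps along the orbit, so (3.16) reads λ(x) = lim (1/n) Σ_{k<n} log|f′_{k+1}(xₖ)|. A numerical sketch under assumptions of our own choosing: the circle maps x → 2x mod 1 and x → 3x mod 1 both preserve Lebesgue measure, so for an equal mixture the law of large numbers gives the exponent (log 2 + log 3)/2.

```python
import math, random

random.seed(2)

n = 10_000
x = random.random()          # initial point, Lebesgue-distributed
log_norm = 0.0               # accumulates log ||D nf(ω)||_x via the chain rule
for _ in range(n):
    d = 2 if random.random() < 0.5 else 3   # pick the doubling or the tripling map
    log_norm += math.log(d)                 # |f'(x)| = d everywhere for these maps
    x = (d * x) % 1.0

lam = log_norm / n
expected = 0.5 * (math.log(2) + math.log(3))
print(lam, expected)
```

Because the derivatives here are constant, the estimate reduces to a law-of-large-numbers average; for maps with non-constant derivative the same loop works with `math.log(abs(fprime(x)))` evaluated along the orbit.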
Chapter II.
Entropy characteristics of random transformations.
The concept of entropy has played a major part in ergodic
theory so far. In this chapter we introduce the notions of
both measure theoretic (metric) and topological entropies for
compositions of random maps. These entropies turn out to be
the "mixed" or "relative" entropies of Abramov–Rohlin [1] and
Ledrappier–Walters [32] corresponding to the skew product
transformation T, but our motivation and setup are different
from theirs. We shall review facts from the deterministic theory of
entropy. More comprehensive expositions can be found in Martin
and England [36], Peterson [39] and Walters [46].
2.1 Measure theoretic entropies.
We shall start with an introduction to the standard theory of
entropy, which is well known but still not common in the probabilistic literature. Let M be a space with a given σ-field 𝓑 of
measurable sets and μ be a probability measure on M. We shall
need the following notions.
Definition 1.1.
a) A partition of M is a disjoint collection of elements of 𝓑 whose union is M;
b) If ξ and η are two finite partitions of M then we write ξ ≺ η to mean that each element of ξ is a union of elements of η (i.e., η
is a refinement of ξ);
c) Let ξ = {A₁,...,A_n}, η = {C₁,...,C_ℓ} be two finite partitions of M. Their join is the partition
ξ ∨ η = {A_i ∩ C_j : 1 ≤ i ≤ n, 1 ≤ j ≤ ℓ}.
We shall also write ∨_{i=1}^k ξ_i = ξ₁ ∨ ξ₂ ∨ ... ∨ ξ_k;
d) If φ : M → M is a measurable map and ξ = {A₁,...,A_k} is
a partition then φ⁻¹ξ denotes the partition {φ⁻¹A₁,...,φ⁻¹A_k};
e) If 𝓐₁ and 𝓐₂ are sub-σ-fields of 𝓑 then 𝓐₁ ∨ 𝓐₂ will
denote the minimal σ-field containing both 𝓐₁ and 𝓐₂. If
φ : M → M is a measurable map and 𝓐 ⊂ 𝓑 is a σ-field then φ⁻¹𝓐 denotes the σ-field whose elements are {φ⁻¹A : A ∈ 𝓐}.
Let 𝓐 be a sub-σ-field of 𝓑 and μ ∈ P(M). Recall (see Neveu
[37]) that for g ∈ L¹(M,μ), the conditional expectation E_μ(g|𝓐) of
g given 𝓐 is an 𝓐-measurable function on M which satisfies
∫_A E_μ(g|𝓐)dμ = ∫_A g dμ
for any A ∈ 𝓐. The conditional probability of a set B ∈ 𝓑 is μ(B|𝓐) = E_μ(χ_B|𝓐).
In what follows log a will always mean the natural logarithm of
a and the expression 0 log 0 will be considered to be 0.
Definition 1.2. Let μ be a probability measure on M and
ξ = {A₁,...,A_k} be a finite partition of M. The conditional
entropy of ξ given a σ-field 𝓐 ⊂ 𝓑 is the number
H_μ(ξ|𝓐) = −∫ Σ_{i=1}^k μ(A_i|𝓐) log μ(A_i|𝓐) dμ.   (1.1)
If 𝓐 = {M,∅} is the trivial σ-field then we get the entropy
H_μ(ξ) of ξ.
Remark 1.1. H_μ(ξ|𝓐) ≥ 0 since μ(A_i|𝓐) ≤ 1 μ-a.s.
For any finite partition η = {C₁,...,C_ℓ} of M we shall denote
by 𝓙(η) the σ-field generated by η, i.e., the collection of unions of
elements of η. Clearly,
H_μ(ξ|𝓙(η)) = −Σ_{i,j} μ(A_i ∩ C_j) log (μ(A_i ∩ C_j)/μ(C_j)),
omitting the j-terms for which μ(C_j) = 0.
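On a finite probability space the quantities of Definition 1.2 reduce to finite sums, and the identity behind Lemma 1.2(ii) below, H_μ(ξ∨η) = H_μ(η) + H_μ(ξ|𝓙(η)), can be verified exactly. A small sketch (the particular space, measure and partitions are our own toy data):

```python
import math

# a six-point probability space
mu = {0: 0.1, 1: 0.2, 2: 0.05, 3: 0.15, 4: 0.3, 5: 0.2}
xi  = [{0, 1, 2}, {3, 4, 5}]          # partition ξ
eta = [{0, 3}, {1, 4}, {2, 5}]        # partition η

def H(parts):
    # entropy of a partition: -Σ μ(A) log μ(A)
    s = 0.0
    for A in parts:
        m = sum(mu[x] for x in A)
        if m > 0:
            s -= m * math.log(m)
    return s

def H_cond(xi, eta):
    # H_μ(ξ | J(η)) = -Σ_{i,j} μ(A_i ∩ C_j) log(μ(A_i ∩ C_j)/μ(C_j))
    s = 0.0
    for C in eta:
        mc = sum(mu[x] for x in C)
        if mc == 0:
            continue
        for A in xi:
            m = sum(mu[x] for x in A & C)
            if m > 0:
                s -= m * math.log(m / mc)
    return s

join = [A & C for A in xi for C in eta if A & C]   # the join ξ ∨ η
print(H(join), H(eta) + H_cond(xi, eta))           # the two numbers agree
```

The convention 0 log 0 = 0 from the text is implemented by skipping empty and null cells.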
The following elementary fact implies several useful properties
of entropy.
Lemma 1.1. The function
t(x) = 0 if x = 0,  t(x) = x log x if x ≠ 0,
is strictly convex, i.e.,
t(Σ_{i=1}^k α_i x_i) ≤ Σ_{i=1}^k α_i t(x_i)   (1.2)
if x_i ∈ [0,∞), α_i ≥ 0, Σ_{i=1}^k α_i = 1; and equality holds only when all
the x_i corresponding to non-zero α_i are equal. Moreover, for any
σ-field 𝓐 ⊂ 𝓑,
t(E_μ(g|𝓐)) ≤ E_μ(t(g)|𝓐)   (1.3)
provided μ is a probability measure on M and g ≥ 0 is a function
on M. The equality in (1.3) holds only when g ≡ const μ-a.s.
Proof. It suffices to prove (1.2) for k = 2 since then (1.2) will
follow for any k by induction. Fix α, β with α > 0, β > 0 and
α + β = 1. Suppose y > x. By the mean value theorem
t(y) − t(αx + βy) = t′(z)(y − x)α
for some z with αx + βy ≤ z ≤ y, and
t(αx + βy) − t(x) = t′(w)(y − x)β
for some w with x ≤ w ≤ αx + βy. Since t″(x) = 1/x > 0 on (0,∞),
then t′(z) > t′(w) and so
(t(y) − t(αx + βy))β = t′(z)(y − x)αβ >
> t′(w)(y − x)αβ = (t(αx + βy) − t(x))α.
Therefore t(αx + βy) < αt(x) + βt(y) whenever y > x ≥ 0; by symmetry
the strict inequality holds whenever x,y ≥ 0 and x ≠ y. Now (1.3) follows from (1.2) provided g
is a simple function, i.e., g = Σ_{i=1}^ℓ g_i χ_{C_i} where g_i ≥ 0 are constants
and C_i are disjoint measurable sets with ∪_i C_i = M. Indeed, in
this case
t(E_μ(g|𝓐)) = t(Σ_{i=1}^ℓ g_i μ(C_i|𝓐)) ≤ Σ_{i=1}^ℓ t(g_i) μ(C_i|𝓐) = E_μ(t(g)|𝓐)
since Σ_{i=1}^ℓ μ(C_i|𝓐) = 1 μ-a.s. If g ≥ 0 is an arbitrary measurable
function, then g is the pointwise limit of an increasing sequence of
simple functions, and (1.3) will follow from the monotone convergence theorem. ∎
Corollary 1.1. If ξ = {A₁,...,A_k} is a partition of M and
𝓐 ⊂ 𝓑 is a σ-field then H_μ(ξ|𝓐) ≤ log k, and H_μ(ξ|𝓐) = log k only
when μ(A_i|𝓐) = 1/k μ-a.s. for all i.
Proof. Put α_i = 1/k and x_i = μ(A_i|𝓐), 1 ≤ i ≤ k; then (1.2)
implies the assertion. ∎
The entropy has the following properties.
Lemma 1.2. If ξ = {B₁,...,B_k} and η = {C₁,...,C_ℓ} are
finite partitions of M and 𝓐 ⊂ 𝓑 is a σ-field then
(i) H_μ(ξ ∨ η|𝓐) = H_μ(ξ|𝓐) + H_μ(η|𝓙(ξ) ∨ 𝓐);
(ii) H_μ(ξ ∨ η) = H_μ(ξ) + H_μ(η|𝓙(ξ));
(iii) ξ ≺ η implies H_μ(ξ|𝓐) ≤ H_μ(η|𝓐);
(iv) ξ ≺ η implies H_μ(ξ) ≤ H_μ(η);
(v) If 𝓐 ⊂ 𝓐̄ is another σ-field then H_μ(ξ|𝓐̄) ≤ H_μ(ξ|𝓐);
(vi) H_μ(ξ) ≥ H_μ(ξ|𝓐);
(vii) H_μ(ξ ∨ η|𝓐) ≤ H_μ(ξ|𝓐) + H_μ(η|𝓐);
(viii) H_μ(ξ ∨ η) ≤ H_μ(ξ) + H_μ(η);
(ix) If φ : M → M preserves the measure μ, i.e.,
μ(φ⁻¹A) = μ(A) for any A ∈ 𝓑,   (1.4)
then H_μ(φ⁻¹ξ|φ⁻¹𝓐) = H_μ(ξ|𝓐);
(x) Under (1.4), H_μ(φ⁻¹ξ) = H_μ(ξ).
Proof. First, remark that (ii), (iv), (vi), (viii) and (x) will follow from (i), (iii), (v), (vii) and (ix) by taking the trivial σ-field
{∅,M} in place of 𝓐 or 𝓐̄. We may assume without loss of generality that all sets in the partitions ξ and η have positive μ-measure.
(i) Notice that for each C_j ∈ η,
μ(C_j|𝓙(ξ) ∨ 𝓐) = Σ_i χ_{B_i} μ(B_i ∩ C_j|𝓐)/μ(B_i|𝓐)   (1.5)
where we take 0/0 = 0. Indeed, both sides of the equality in question are 𝓙(ξ) ∨ 𝓐-measurable and for any B ∈ ξ and A ∈ 𝓐 one has
∫_{A∩B} (Σ_i χ_{B_i} μ(B_i ∩ C_j|𝓐)/μ(B_i|𝓐)) dμ = ∫_A χ_B (μ(B ∩ C_j|𝓐)/μ(B|𝓐)) dμ = μ(A ∩ B ∩ C_j) = ∫_{A∩B} χ_{C_j} dμ
and (1.5) follows. Now by (1.5),
H_μ(ξ|𝓐) + H_μ(η|𝓙(ξ) ∨ 𝓐) = −∫ Σ_i μ(B_i|𝓐) log μ(B_i|𝓐) dμ − ∫ Σ_j μ(C_j|𝓙(ξ) ∨ 𝓐) log μ(C_j|𝓙(ξ) ∨ 𝓐) dμ
= −∫ Σ_{i,j} μ(B_i ∩ C_j|𝓐) log μ(B_i ∩ C_j|𝓐) dμ = H_μ(ξ ∨ η|𝓐),
proving (i).
(iii) If ξ ≺ η then ξ ∨ η = η, and by (i) and Remark 1.1,
H_μ(η|𝓐) = H_μ(ξ ∨ η|𝓐) = H_μ(ξ|𝓐) + H_μ(η|𝓙(ξ) ∨ 𝓐) ≥ H_μ(ξ|𝓐).
(v) By (1.3) we have
H_μ(ξ|𝓐̄) = −∫ Σ_i t(μ(B_i|𝓐̄)) dμ
= −∫ Σ_i E_μ(t(μ(B_i|𝓐̄))|𝓐) dμ
≤ −∫ Σ_i t(E_μ(μ(B_i|𝓐̄)|𝓐)) dμ
= −∫ Σ_i t(μ(B_i|𝓐)) dμ = H_μ(ξ|𝓐).
(vii) Follows from (i) and (v).
(ix) It is easy to see that
μ(φ⁻¹Q|φ⁻¹𝓐) = μ(Q|𝓐) ∘ φ
for any Q ∈ ξ. Indeed, both sides of the equality are φ⁻¹𝓐-measurable functions and for any A ∈ 𝓐,
∫_{φ⁻¹A} μ(φ⁻¹Q|φ⁻¹𝓐) dμ = μ(φ⁻¹(Q ∩ A)) = μ(Q ∩ A) = ∫_{φ⁻¹A} (μ(Q|𝓐) ∘ φ) dμ
since φ preserves the measure μ. The proof is complete. ∎
Now we can prove
Theorem 1.1. Let φ : M → M preserve a measure μ ∈ P(M).
If 𝓐 ⊂ 𝓑 is a σ-field satisfying
φ⁻¹𝓐 ⊂ 𝓐,   (1.6)
then for any finite partition ξ of M there exists
h_μ^𝓐(φ,ξ) = lim_{n→∞} (1/n) H_μ(∨_{i=0}^{n−1} φ⁻ⁱξ|𝓐).   (1.7)
This limit is called the entropy of φ with respect to ξ given 𝓐.
Proof. If a_n = H_μ(∨_{i=0}^{n−1} φ⁻ⁱξ|𝓐) then by the assertions (v),
(vii) and (ix) of Lemma 1.2 one has
a_{n+m} = H_μ(∨_{i=0}^{n+m−1} φ⁻ⁱξ|𝓐) ≤ H_μ(∨_{i=0}^{n−1} φ⁻ⁱξ|𝓐)
+ H_μ(φ⁻ⁿ ∨_{i=0}^{m−1} φ⁻ⁱξ|𝓐) ≤ a_n + a_m.
Hence a_n, n = 1,2,... is a subadditive sequence and so, by the well-known argument which we shall recall here, lim_{n→∞} (1/n)a_n exists.
Indeed, put a = inf_{n≥1} a_n/n. Given ε > 0 choose k(ε) such that
a_{k(ε)}/k(ε) ≤ a + ε. Each integer n ≥ 1 can be represented as
n = q k(ε) + r for non-negative integers q and r ≤ k(ε) − 1. Then
by the subadditivity,
a_n/n ≤ (q a_{k(ε)} + a_r)/n ≤ a_{k(ε)}/k(ε) + a_r/n ≤ a + ε + a_r/n.
Letting n → ∞ and then taking into account that ε > 0 is arbitrary
we conclude that
lim sup_{n→∞} a_n/n ≤ a,
which together with the definition of a gives lim_{n→∞} a_n/n = a. ∎
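The subadditivity argument in the proof above is purely about real sequences: a_{n+m} ≤ a_n + a_m forces a_n/n → inf_k a_k/k (Fekete's lemma). A quick numerical sketch with a concrete subadditive sequence of our own choosing, a_n = 2n + √n, for which inf_n a_n/n = 2:

```python
import math, random

def a(n):
    # 2n is additive and √n is subadditive, so a is subadditive
    return 2 * n + math.sqrt(n)

random.seed(0)
for _ in range(1000):            # spot-check a(n+m) <= a(n) + a(m)
    n, m = random.randint(1, 1000), random.randint(1, 1000)
    assert a(n + m) <= a(n) + a(m) + 1e-9

ratios = [a(n) / n for n in (10, 1000, 100_000, 10_000_000)]
print(ratios)   # decreases toward inf_n a(n)/n = 2
```

Here a_n/n = 2 + n^{−1/2}, so the ratios decrease strictly and the infimum 2 is approached but never attained, matching the proof's inf-versus-lim structure.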
The final stage of the introduction of the entropy is the following
Definition 1.3. If φ : M → M preserves μ ∈ P(M) then the
number h_μ^𝓐(φ) = sup_ξ h_μ^𝓐(φ,ξ), where the supremum is taken over
all finite partitions of M, is called the entropy of φ given a σ-field
𝓐 satisfying (1.6). If 𝓐 is the trivial σ-field we omit 𝓐 and write
h_μ(φ,ξ) and h_μ(φ). These numbers are called the entropy of φ with
respect to ξ and the entropy of φ, respectively.
Next we shall discuss the main properties of the entropy.
Lemma 1.3. Suppose that ξ and η are finite partitions of M,
φ : M → M preserves a measure μ ∈ P(M) and 𝓐 ⊂ 𝓑 is a σ-field
satisfying (1.6). Then
(i) h_μ^𝓐(φ,ξ) ≤ H_μ(ξ|𝓐);
(ii) h_μ^𝓐(φ,ξ ∨ η) ≤ h_μ^𝓐(φ,ξ) + h_μ^𝓐(φ,η);
(iii) ξ ≺ η implies h_μ^𝓐(φ,ξ) ≤ h_μ^𝓐(φ,η);
(iv) h_μ^𝓐(φ,ξ) ≤ h_μ^𝓐(φ,η) + H_μ(ξ|𝓙(η));
(v) h_μ^𝓐(φ,φ⁻¹ξ) ≤ h_μ^𝓐(φ,ξ);
(vi) If k ≥ 1 then h_μ^𝓐(φ,ξ) = h_μ^𝓐(φ, ∨_{i=0}^{k−1} φ⁻ⁱξ).
Proof. (i) By (1.6) and Lemma 1.2 (v), (vii) and (ix),
H_μ(∨_{i=0}^{n−1} φ⁻ⁱξ|𝓐) ≤ Σ_{i=0}^{n−1} H_μ(φ⁻ⁱξ|𝓐) ≤ Σ_{i=0}^{n−1} H_μ(φ⁻ⁱξ|φ⁻ⁱ𝓐) = n H_μ(ξ|𝓐).
(ii) By Lemma 1.2 (vii),
H_μ(∨_{i=0}^{n−1} φ⁻ⁱ(ξ ∨ η)|𝓐) = H_μ((∨_{i=0}^{n−1} φ⁻ⁱξ) ∨ (∨_{i=0}^{n−1} φ⁻ⁱη)|𝓐)
≤ H_μ(∨_{i=0}^{n−1} φ⁻ⁱξ|𝓐) + H_μ(∨_{i=0}^{n−1} φ⁻ⁱη|𝓐).
(iii) If ξ ≺ η then
∨_{i=0}^{n−1} φ⁻ⁱξ ≺ ∨_{i=0}^{n−1} φ⁻ⁱη
and so by Lemma 1.2 (iii) the assertion follows.
(iv) By Lemma 1.2 (i) and (v),
H_μ(∨_{i=0}^{n−1} φ⁻ⁱξ|𝓐) ≤ H_μ((∨_{i=0}^{n−1} φ⁻ⁱξ) ∨ (∨_{i=0}^{n−1} φ⁻ⁱη)|𝓐)   (1.8)
= H_μ(∨_{i=0}^{n−1} φ⁻ⁱη|𝓐) + H_μ(∨_{i=0}^{n−1} φ⁻ⁱξ|𝓙(∨_{i=0}^{n−1} φ⁻ⁱη) ∨ 𝓐).
Next, by (1.6) and Lemma 1.2 (v), (vii) and (ix),
H_μ(∨_{i=0}^{n−1} φ⁻ⁱξ|𝓙(∨_{i=0}^{n−1} φ⁻ⁱη) ∨ 𝓐)
≤ Σ_{i=0}^{n−1} H_μ(φ⁻ⁱξ|𝓙(φ⁻ⁱη) ∨ 𝓐) ≤ n H_μ(ξ|𝓙(η)),
which together with (1.8) yields the assertion (iv).
(v) By (1.6) and Lemma 1.2 (v) and (ix),
H_μ(∨_{i=0}^{n−1} φ⁻ⁱ(φ⁻¹ξ)|𝓐) = H_μ(φ⁻¹ ∨_{i=0}^{n−1} φ⁻ⁱξ|𝓐)
≤ H_μ(φ⁻¹ ∨_{i=0}^{n−1} φ⁻ⁱξ|φ⁻¹𝓐) = H_μ(∨_{i=0}^{n−1} φ⁻ⁱξ|𝓐).
(vi) h_μ^𝓐(φ, ∨_{i=0}^{k−1} φ⁻ⁱξ) = lim_{n→∞} (1/n) H_μ(∨_{j=0}^{n−1} φ⁻ʲ(∨_{i=0}^{k−1} φ⁻ⁱξ)|𝓐)
= lim_{n→∞} (1/n) H_μ(∨_{i=0}^{n+k−2} φ⁻ⁱξ|𝓐) = h_μ^𝓐(φ,ξ).
We can deduce from Lemma 1.3 the following important property of the entropy.
Lemma 1.4. Suppose that φ : M → M preserves μ ∈ P(M) and
𝓐 ⊂ 𝓑 is a σ-field satisfying (1.6). Then
h_μ^𝓐(φᵏ) = k h_μ^𝓐(φ) for any integer k > 0.   (1.9)
Proof. It is easy to see that
h_μ^𝓐(φᵏ, ∨_{i=0}^{k−1} φ⁻ⁱξ) = k h_μ^𝓐(φ,ξ).   (1.10)
Indeed,
lim_{n→∞} (1/n) H_μ(∨_{j=0}^{n−1} φ⁻ᵏʲ(∨_{i=0}^{k−1} φ⁻ⁱξ)|𝓐) = lim_{n→∞} (k/nk) H_μ(∨_{i=0}^{nk−1} φ⁻ⁱξ|𝓐) = k h_μ^𝓐(φ,ξ).
Hence,
h_μ^𝓐(φᵏ) = sup_η h_μ^𝓐(φᵏ,η) ≥ sup_ξ h_μ^𝓐(φᵏ, ∨_{i=0}^{k−1} φ⁻ⁱξ) = k sup_ξ h_μ^𝓐(φ,ξ) = k h_μ^𝓐(φ),   (1.11)
where each time the supremum is taken over finite partitions ξ
and η. On the other hand, by (1.10) and Lemma 1.3 (iii),
h_μ^𝓐(φᵏ,ξ) ≤ h_μ^𝓐(φᵏ, ∨_{i=0}^{k−1} φ⁻ⁱξ) = k h_μ^𝓐(φ,ξ)
and so h_μ^𝓐(φᵏ) ≤ k h_μ^𝓐(φ), which together with (1.11) proves (1.9). ∎
The calculation of entropy can be simplified if one uses the following Kolmogorov–Sinai theorem. Given a σ-field 𝓐 we shall call a
finite partition ξ of M an 𝓐-generator if the minimal σ-field containing both 𝓐 and ξ^∞ = ∨_{i=0}^∞ φ⁻ⁱξ coincides with 𝓑 up to sets of μ-measure zero.
Lemma 1.5. Suppose φ : M → M preserves μ ∈ P(M) and
𝓐 ⊂ 𝓑 is a σ-field satisfying (1.6). If ξ is an 𝓐-generator then
h_μ^𝓐(φ) = h_μ^𝓐(φ,ξ).   (1.12)
Proof. It suffices to show that for each finite partition η we
have
h_μ^𝓐(φ,η) ≤ h_μ^𝓐(φ,ξ).   (1.13)
By Lemma 1.3 (iv) and (vi),
h_μ^𝓐(φ,η) ≤ h_μ^𝓐(φ, ∨_{i=0}^{n−1} φ⁻ⁱξ) + H_μ(η|𝓙(∨_{i=0}^{n−1} φ⁻ⁱξ) ∨ 𝓐)   (1.14)
= h_μ^𝓐(φ,ξ) + H_μ(η|𝓙(∨_{i=0}^{n−1} φ⁻ⁱξ) ∨ 𝓐).
By the theorem about the convergence of conditional expectations, which follows from the martingale convergence theorem
(see Neveu [37] or Martin and England [36]), for each C ∈ η,
μ(C|𝓙(∨_{i=0}^{n−1} φ⁻ⁱξ) ∨ 𝓐) → μ(C|𝓙(ξ^∞) ∨ 𝓐)   μ-a.s. as n → ∞,
where 𝓙(ξ^∞) denotes the minimal σ-field containing all the partitions ∨_{i=0}^{n−1} φ⁻ⁱξ. Thus,
lim_{n→∞} H_μ(η|𝓙(∨_{i=0}^{n−1} φ⁻ⁱξ) ∨ 𝓐) = H_μ(η|𝓙(ξ^∞) ∨ 𝓐).   (1.15)
But 𝓙(ξ^∞) ∨ 𝓐 = 𝓑 up to sets of μ-measure zero and so
μ(C|𝓙(ξ^∞) ∨ 𝓐) = χ_C   μ-a.s.
for any C ∈ η. Hence the right hand side of (1.15) is equal to zero.
This together with (1.14) gives (1.13), which yields (1.12). ∎
The following result is also often useful for calculations of
entropy.
Lemma 1.6. Suppose, again, that φ : M → M is a measurable
map preserving μ ∈ P(M) and 𝓐 is a σ-field satisfying (1.6). Let
ξ₁ ≺ ξ₂ ≺ ... be an increasing sequence of finite partitions such
that the minimal σ-field containing both 𝓐 and all ξᵢ, i = 1,2,...
coincides with 𝓑 up to sets of μ-measure zero. Then
h_μ^𝓐(φ) = lim_{n→∞} h_μ^𝓐(φ,ξ_n).   (1.16)
Proof. Let η be any finite measurable partition of M. By
Lemma 1.3 (iv),
h_μ^𝓐(φ,η) ≤ h_μ^𝓐(φ,ξ_n) + H_μ(η|𝓙(ξ_n)).   (1.17)
In the same way as in Lemma 1.5 the martingale convergence
theorem yields
lim_{n→∞} H_μ(η|𝓙(ξ_n)) = 0.   (1.18)
By Lemma 1.3 (iii) the sequence h_μ^𝓐(φ,ξ_n), n = 1,2,... is non-decreasing and so (1.17) and (1.18) imply
h_μ^𝓐(φ,η) ≤ lim_{n→∞} h_μ^𝓐(φ,ξ_n).
Since η is arbitrary, (1.16) follows. ∎
After these introductory notes we can proceed to the discussion of possible definitions of the entropy for random transformations. We shall use here the notations of Section 1.2. The
straightforward approach yields three choices. The first one is the
setup where we put M ≡ M × Ω and φ ≡ T, where T is given by (I.2.4).
This entropy we shall denote by h_{ρ×p}(T), where ρ ∈ P(M) is some
P*-invariant measure in the sense of Section 1.2. Another possibility is to take M ≡ Ω and φ ≡ ϑ with ϑ given by (I.2.3). Then we
shall get the entropy h_p(ϑ). In both cases we take 𝓐 in Definition
1.3 to be the trivial σ-field.
To explain the third option take M to be the space of sequences
M^∞ = {γ : γ = (x₀,x₁,...), xᵢ ∈ M}. The transformation φ becomes
the shift σ acting by X_n(σγ) = X_{n+1}(γ), where X_n(γ) is the n-th term
in the sequence γ. Finally, we introduce on M^∞ a Markov measure
related to the Markov chain X_n considered in Section 1.2. If ρ is a
P*-invariant measure then the corresponding Markov measure ρ_p
is defined first on the sets of the form
π(G₀,...,G_ℓ) = {γ : X₀(γ) ∈ G₀,...,X_ℓ(γ) ∈ G_ℓ}   (1.19)
by
ρ_p(π(G₀,...,G_ℓ)) = ∫_{G₀} dρ(x₀) ∫_{G₁} P(x₀,dx₁) ... ∫_{G_ℓ} P(x_{ℓ−1},dx_ℓ)   (1.20)
for any measurable subsets Gᵢ ⊂ M. Then employing Ionescu
Tulcea's or Kolmogorov's extension theorems (see Neveu [37]) one
obtains ρ_p defined on all measurable subsets of M^∞ taken
with its product measurable structure. This gives us another
entropy h_{ρ_p}(σ), which is viewed as the entropy of the Markov chain
X_n.
These three entropies satisfy
Lemma 1.7.
max(h_p(ϑ), h_{ρ_p}(σ)) ≤ h_{ρ×p}(T).   (1.21)
Proof. It is easy to see that h_p(ϑ) = sup h_{ρ×p}(T,ζ) where the
supremum is taken only over the finite partitions ζ of M × Ω having
the form ζ = {M × Γ₁,...,M × Γ_k} with {Γ₁,...,Γ_k} being a partition of Ω. Hence the first inequality of (1.21) follows.
Next, consider the map ψ : M × Ω → M^∞ acting by ψ(x,ω) = γ
with x_n(γ) = ⁿf(ω)x. It is easy to see that ψ is measurable, and so if
ζ is a finite partition of M^∞ then ψ⁻¹ζ is a finite partition of M × Ω.
Moreover, it follows from the definitions that
h_{ρ_p}(σ,ζ) = h_{ρ×p}(T,ψ⁻¹ζ) ≤ h_{ρ×p}(T).
Taking the supremum over ζ we obtain the second inequality in
(1.21). ∎
Remark 1.2. Actually, in order to get h_{ρ_p}(σ) it suffices to take
sup_ζ h_{ρ_p}(σ,ζ) over all finite partitions of M^∞ into the sets
π(G_{i₀},...,G_{i_ℓ}) defined by (1.19), where {G₀,...,G_k} forms a partition of M.
The straightforward entropies which we have considered above
are not very convenient for the analysis of random transformations
since they are too big; namely, in many interesting cases they are
equal to infinity.
Theorem 1.2. Suppose that all transition probabilities P(x,·)
given by (I.2.6) have bounded densities p(x,y) ≤ K < ∞ with
respect to some measure m ∈ P(M), i.e., P(x,G) = ∫_G p(x,y)dm(y)
for any measurable G ⊂ M. Assume that for any n ≥ 1 there exists
a partition ξ_n = {A₁⁽ⁿ⁾,...,A_{k_n}⁽ⁿ⁾} such that m(A_i⁽ⁿ⁾) ≤ 1/n for all
i = 1,...,k_n. Then h_{ρ×p}(T) = h_p(ϑ) = h_{ρ_p}(σ) = ∞.
Proof. Consider a family of partitions η_n = {Q₁⁽ⁿ⁾,...,Q_{k_n}⁽ⁿ⁾}
of Ω such that Q_i⁽ⁿ⁾ = {ω : f₁(ω)x ∈ A_i⁽ⁿ⁾}, where x ∈ M is a fixed
point. Then ϑ⁻ʲQ_i⁽ⁿ⁾ = {ω : f_{j+1}(ω)x ∈ A_i⁽ⁿ⁾}. Since f₁,f₂,... are
independent and have the same distribution m, then
(1/ℓ) H_p(∨_{j=0}^{ℓ−1} ϑ⁻ʲη_n) ≥ −log max_i p(Q_i⁽ⁿ⁾)
= −log max_i P(x,A_i⁽ⁿ⁾) ≥ −log(K max_i m(A_i⁽ⁿ⁾)) ≥ log(n/K).
This together with (1.21) yields h_{ρ×p}(T) ≥ h_p(ϑ) ≥ log(n/K). Letting
n → ∞ we obtain that two out of the three entropies are infinite.
Next, take the partitions ζ_n = {Γ₁⁽ⁿ⁾,...,Γ_{k_n}⁽ⁿ⁾} of M^∞ such that
Γ_i⁽ⁿ⁾ = {γ : X₀(γ) ∈ A_i⁽ⁿ⁾}. Then
(1/ℓ) H_{ρ_p}(∨_{j=0}^{ℓ−1} σ⁻ʲζ_n) ≥ −log(K max_i m(A_i⁽ⁿ⁾)) ≥ log(n/K),
where ρ_p is defined by (1.20) and we have used that
ρ(G) = ∫ dρ(x)P(x,G) ≤ K m(G). By the definition of h_{ρ_p}(σ) this
implies that h_{ρ_p}(σ) ≥ log(n/K). Letting n → ∞ we obtain h_{ρ_p}(σ) = ∞,
completing the proof of Theorem 1.2. ∎
Remark 1.3. The assumptions of Theorem 1.2 will be
satisfied, for instance, in the case of the stochastic flows considered in
Chapter V. Then m will be the Riemannian volume on a manifold
M.
Theorem 1.2 justifies the need for another definition of entropy
of random transformations. Let, again, f₁,f₂,... be independent
random maps of M with the same distribution m.
Definition 1.5. We shall say that ρ ∈ P(M) is f-invariant if
ρ(f⁻¹G) = ρ(G) for m-almost all f
and every measurable G ⊂ M.
Let φ : 𝒦 × ... × 𝒦 → ℝ be a function. We shall write
φ(f₁,...,f_n) to denote the function on Ω such that
φ(f₁,...,f_n)(ω) = φ(f₁(ω),...,f_n(ω)). For instance, if ξ is a finite
partition of M then H_ρ(∨_{i=0}^{n−1} ⁱf⁻¹ξ) means the function on Ω taking
the value H_ρ(∨_{i=0}^{n−1} (ⁱf(ω))⁻¹ξ) for each ω ∈ Ω.
Theorem 1.3. Suppose that ρ ∈ P(M) is P*-invariant and ξ is
a finite partition of M. Then there exists
h_ρ(f,ξ) = lim_{n→∞} (1/n) ∫ H_ρ(∨_{i=0}^{n−1} ⁱf⁻¹ξ) dp   (1.22)
where, again, ⁱf = fᵢ ∘ ... ∘ f₁ and ⁰f = id. If ρ is f-invariant in the
sense that
ρ(f⁻¹G) = ρ(G) for m-almost all f   (1.23)
and every measurable G ⊂ M, then
h_ρ(f,ξ) = lim_{n→∞} (1/n) H_ρ(∨_{i=0}^{n−1} ⁱf⁻¹ξ)   p-a.s.   (1.24)
Proof. First we shall prove (1.22) directly, but later on we
shall see that it is a partial case of Theorem 1.1. Put
a_n = H_ρ(∨_{i=0}^{n−1} ⁱf⁻¹ξ); then by Lemma 1.2 (viii),
a_{n+m} ≤ a_n + H_ρ(ⁿf⁻¹(ξ ∨ ∨_{i=n+1}^{n+m−1} (fᵢ ∘ ... ∘ f_{n+1})⁻¹ξ)).   (1.25)
Denote the last term on the right hand side of (1.25) by C_{n,m}. Set
ξ_{n,m} = ξ ∨ ∨_{i=n+1}^{n+m−1} (fᵢ ∘ ... ∘ f_{n+1})⁻¹ξ; then
∫ C_{n,m} dp
= −∫...∫ ( ∫...∫ Σ_{A∈ξ_{n,m}} t(ρ((f_n ∘ ... ∘ f₁)⁻¹A)) dm(f₁)...dm(f_n) )
dm(f_{n+1})...dm(f_{n+m−1})
≤ −∫...∫ Σ_{A∈ξ_{n,m}} t( ∫...∫ ρ((f_n ∘ ... ∘ f₁)⁻¹A) dm(f₁)...dm(f_n) )
dm(f_{n+1})...dm(f_{n+m−1}),
where the last inequality follows from Lemma 1.1. Since ρ is P*-invariant, then
∫...∫ ρ((f_n ∘ ... ∘ f₁)⁻¹A) dm(f₁)...dm(f_n) = ρ(A).
Therefore
∫ C_{n,m} dp ≤ ∫ H_ρ(ξ_{n,m}) dp = ∫ H_ρ(∨_{i=0}^{m−1} ⁱf⁻¹ξ) dp = ∫ a_m dp.
Thus by (1.25) the numbers b_n = ∫ a_n dp satisfy b_{n+m} ≤ b_n + b_m, i.e.,
the sequence {b_n} is subadditive. In the same way as in Theorem
1.1 we conclude from here that the limit (1.22) exists. If ρ is f-invariant then by Lemma 1.2 (x),
C_{n,m} = H_ρ(ξ ∨ ∨_{i=n+1}^{n+m−1} (fᵢ ∘ ... ∘ f_{n+1})⁻¹ξ) = a_m ∘ ϑⁿ
and so a_{n+m} ≤ a_n + a_m ∘ ϑⁿ. Therefore {a_n} is a subadditive process, whence the application of Theorem I.2.2 implies (1.24). ∎
Definition 1.4. If ρ ∈ P(M) is P*-invariant then the number
h_ρ(f) = sup_ξ h_ρ(f,ξ), where the supremum is taken over all finite
partitions of M, is called the entropy of the random transformation f,
having the distribution m, with respect to the measure ρ.
Remark 1.4. It is clear from (1.22) and Definition 1.4 that
both h_ρ(f,ξ) and h_ρ(f) depend only on the distribution m and not on
a specific random transformation f. The reason for our notation
h_ρ(f) instead of, say, h_ρ(m) is just to comply with the notation
of entropy for deterministic transformations.
Remark 1.5. The number h_ρ(f) coincides with the "mixed" or
"relative" entropy of the skew-product transformation T which was
introduced by Abramov and Rohlin [1] (see also Ledrappier and
Walters [32]). This entropy was considered in [1] as an auxiliary
quantity which one has to add to the entropy of the base transformation to obtain the entropy of the skew-product transformation.
If ρ ∈ P(M) is f-invariant, this yields in our case that
h_{ρ×p}(T) = h_ρ(f) + h_p(ϑ). In simple cases one can compute the
entropy directly from Definition 1.4.
Example 1.1. Suppose that φ : M → M is a non-random map
preserving some measure ρ ∈ P(M). Let m be a two-point distribution with the weight p > 0 at φ and the weight q = 1 − p at the
identity map id. Now let f₁,f₂,... be independent random maps
with the distribution m, i.e., we take at random φ or id with probabilities p and q, respectively. If ξ is a partition of M then
∨_{i=0}^{n−1} (ⁱf(ω))⁻¹ξ = ∨_{i=0}^{k(ω,n)} φ⁻ⁱξ
where k(ω,n) is the number of j's such that f_j(ω) = φ. By the law
of large numbers, almost surely
(1/n) k(ω,n) → p as n → ∞,
and so almost surely
h_ρ(f,ξ) = lim_{n→∞} (1/n) H_ρ(∨_{i=0}^{n−1} (ⁱf(ω))⁻¹ξ)
= lim_{n→∞} (k(ω,n)/n) · (1/k(ω,n)) H_ρ(∨_{i=0}^{k(ω,n)} φ⁻ⁱξ) = p h_ρ(φ,ξ).
Hence h_ρ(f) = p h_ρ(φ), where h_ρ(φ,ξ) and h_ρ(φ) denote the usual entropies of φ.
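Example 1.1 can be instantiated concretely (the particular maps and numbers are our own choice, not from the text): take φ the doubling map x → 2x mod 1, ρ Lebesgue measure, and ξ the partition into [0,½) and [½,1). Then ∨_{i=0}^{k} φ⁻ⁱξ is the dyadic partition with H_ρ = (k+1) log 2, so (1/n) H_ρ(∨_{i<n}(ⁱf(ω))⁻¹ξ) = ((k(ω,n)+1)/n) log 2, which by the law of large numbers tends to p log 2 = p h_ρ(φ).

```python
import math, random

random.seed(3)

p = 0.3          # probability of applying φ (the doubling map) rather than id
n = 20_000
k = sum(1 for _ in range(n) if random.random() < p)   # k(ω, n)

# (1/n) H_ρ of the random join: the join is the dyadic partition of order k(ω,n)+1
estimate = (k + 1) * math.log(2) / n
print(estimate, p * math.log(2))   # ≈ p·h_ρ(φ) = p·log 2
```

The random factor k(ω,n)/n is the only source of fluctuation here; its standard deviation is of order n^{−1/2}, so the two printed numbers are close.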
Remark 1.6. It is easy to see that
h_ρ(f) ≤ h_{ρ_p}(σ).   (1.26)
Indeed, if ξ = {A₁,...,A_k} is a partition of M and ζ = {Γ₁,...,Γ_k} with Γᵢ = {γ : X₀(γ) ∈ Aᵢ} is a partition of M^∞, then by Lemma 1.1,
∫ H_ρ(∨_{i=0}^{n−1} (ⁱf(ω))⁻¹ξ) dp(ω) ≤ H_{ρ_p}(∨_{i=0}^{n−1} σ⁻ⁱζ),
which implies (1.26). Using certain relations between h_ρ(f) and the
topological entropy we shall see later that in some cases the left
hand side of (1.26) may be finite while the right hand side equals
infinity.
Remark 1.7. Usually one expects the entropy to be
invariant under an isomorphism. If ψ : M → M is a non-random
one-to-one map such that both ψ and ψ⁻¹ preserve ρ ∈ P(M), then it
follows from Definition 1.4 that h_ρ(f) = h_ρ(ψ f ψ⁻¹).
The definitions of the entropies h_ρ(f,ξ) and h_ρ(f) are quite
natural, but it is not easy to derive the main properties of entropies
directly from these definitions. It turns out that both h_ρ(f,ξ) and
h_ρ(f) can be considered as partial cases of the entropies h_μ^𝓐(φ,η) and
h_μ^𝓐(φ) studied at the beginning of this section, provided 𝓐, μ and η
are suitably chosen. This will enable us to get certain properties
of h_ρ(f) as consequences of corresponding properties of h_μ^𝓐(φ).
Let 𝓑_M be the σ-field of measurable subsets of M and 𝓑_Ω be the
minimal σ-field of subsets of Ω containing all sets of the form
{ω : f₁(ω) ∈ Φ₁,...,f_ℓ(ω) ∈ Φ_ℓ} for any sequence of measurable
subsets Φᵢ ⊂ 𝒦, i = 1,...,ℓ. Now we are going to apply the theory
from the beginning of this section to the case when
M = M × Ω, 𝓑 = 𝓑_M × 𝓑_Ω, μ = ρ × p, φ = T and 𝓐 = M × 𝓑_Ω,
where ρ ∈ P(M) is a P*-invariant measure, 𝓑_M × 𝓑_Ω is the minimal
σ-field containing all product sets G × Γ ≡ {(x,ω) : x ∈ G, ω ∈ Γ} with G ∈ 𝓑_M, Γ ∈ 𝓑_Ω, and M × 𝓑_Ω denotes the σ-field of all product
sets of the form M × Γ with Γ ∈ 𝓑_Ω.
Theorem 1.4. Let p ∈ P(M) be a P*-invariant measure. Then

(i) If ξ = {G₁, …, G_k} and ζ = {Γ₁, …, Γ_ℓ} are finite measurable partitions of M and Ω, respectively, then
( 1.27)
where ξ×ζ = {G_i×Γ_j , i = 1, …, k; j = 1, …, ℓ}.
(ii) ( 1.28)
Proof. Since p is P*-invariant, according to Lemma 1.2.3
the skew product transformation T preserves p×p. Besides, clearly,
T⁻¹(M×ℬ_Ω) ⊂ M×ℬ_Ω and so the right hand sides of (1.27) and (1.28)
are well defined.

(i) We have

    p×p(⋂_{r=0}^{n-1} T⁻ʳ(G_{i_r}×Γ_{j_r}) | M×ℬ_Ω) =
    = p×p(⋂_{r=0}^{n-1} {(x,ω) : ʳf(ω)x ∈ G_{i_r} and ϑʳω ∈ Γ_{j_r}} | M×ℬ_Ω)

since all sets {ω : ϑʳω ∈ Γ} belong to ℬ_Ω. Now computing the
entropy of the partition ⋁_{i=0}^{n-1} T⁻ⁱ(ξ×ζ) by means of the above
conditional probabilities according to Definition 1.2 we obtain

    H_{p×p}(⋁_{i=0}^{n-1} T⁻ⁱ(ξ×ζ) | M×ℬ_Ω) = ∫ H_p(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ξ) dp

which implies (1.27).
(ii) From (1.27) it follows
(1.29)
where the supremum is taken over all finite partitions ξ and ζ of M
and Ω, respectively. By the definition of the entropy one can
choose a sequence of finite partitions η_n of M×Ω such that

    h_{p×p}^{M×ℬ_Ω}(T) = lim_{n→∞} h_{p×p}^{M×ℬ_Ω}(T, η_n).    (1.30)
Remark that finite unions of disjoint product sets G×Γ, G ∈ ℬ_M,
Γ ∈ ℬ_Ω form an algebra 𝒜_{M×Ω} which generates the whole σ-field
ℬ_M×ℬ_Ω. If η_n = {Q₁⁽ⁿ⁾, …, Q_{k_n}⁽ⁿ⁾} then given ε > 0 one can choose
some sets R_i⁽ⁿ⁾ ∈ 𝒜_{M×Ω} such that

    (1.31)

where Δ denotes the symmetric difference of two sets. Since Q_i⁽ⁿ⁾,
i = 1, …, k_n are disjoint then p×p(R_i⁽ⁿ⁾ ∩ R_j⁽ⁿ⁾) < 2εk_n⁻⁶ if i ≠ j.
Take N = ⋃_{i≠j} (R_i⁽ⁿ⁾ ∩ R_j⁽ⁿ⁾); then p×p(N) < 2εk_n⁻⁴. Set
R̃_i⁽ⁿ⁾ = R_i⁽ⁿ⁾ \ N, i = 1, …, k_n−1 and R̃_{k_n}⁽ⁿ⁾ = (M×Ω) \ ⋃_{i=1}^{k_n−1} R̃_i⁽ⁿ⁾. Then
η̃_n = {R̃₁⁽ⁿ⁾, …, R̃_{k_n}⁽ⁿ⁾} is a finite partition of M×Ω, R̃_i⁽ⁿ⁾ ∈ 𝒜_{M×Ω} for
all i = 1, …, k_n and

    p×p(Q_i⁽ⁿ⁾ Δ R̃_i⁽ⁿ⁾) < 2εk_n⁻³,  i = 1, …, k_n.    (1.32)

Let γ_n be the partition of M×Ω into the sets Q_i⁽ⁿ⁾ ∩ R̃_j⁽ⁿ⁾, i ≠ j,
and ⋃_{i=1}^{k_n} (Q_i⁽ⁿ⁾ ∩ R̃_i⁽ⁿ⁾). Then p×p(Q_i⁽ⁿ⁾ ∩ R̃_j⁽ⁿ⁾) < 2εk_n⁻³ if i ≠ j and
p×p(⋃_{i=1}^{k_n} (Q_i⁽ⁿ⁾ ∩ R̃_i⁽ⁿ⁾)) > 1 − 2εk_n⁻². Hence
( 1.33)
provided ε is small enough. Denote the right hand side of (1.33) by
δ_n(ε) and put δ(ε) = sup_{n≥1} δ_n(ε). Then clearly δ(ε) → 0 as ε → 0. Now
by Lemma 1.2 (ii),
and so
(1.34)
Thus by Lemma 1.3 (iv),
(1.35)
But each element of η̃_n is a finite union of product sets, whence
there exist finite partitions ξ_n and ζ_n of M and Ω, respectively,
such that η̃_n ≺ ξ_n×ζ_n. By Lemma 1.3 this gives
( 1.36)
Since by (1.27),
we conclude from (1.30), (1.35) and (1.36) that
( 1.37)
Since ε is arbitrary and δ(ε) → 0 as ε → 0, (1.37) gives

which together with (1.29) implies (1.28). ∎
Remark 1.8. If M is a metrisable separable space, or measurably
isomorphic to such a space up to sets of p-measure zero, then
one can choose an increasing sequence of finite measurable partitions
ξ_n of M such that ⋁_{n=1}^∞ ξ_n generates the whole σ-field ℬ_M of
measurable subsets of M. Then

    ((⋁_{n=1}^∞ ξ_n)×Ω) ∨ (M×ℬ_Ω) = ℬ_M×ℬ_Ω

and so by Lemma 1.6,

    (1.38)

where for a collection of subsets 𝒥 ⊂ ℬ_M we put 𝒥×Ω = {G×Ω, G ∈ 𝒥}.
Now (1.27), (1.29) and (1.38) imply (1.28).
Corollary 1.2. Let f₁, f₂, … be independent random
transformations of M with the same distribution m and let p ∈ P(M) be
a P*-invariant measure. Then
(i) For any integer k > 0,
(1.39)
(ii) If ξ is a finite partition of M and, for p-almost all ω,
σ(⋁_{i=0}^∞ (ⁱf(ω))⁻¹ξ) coincides up to sets of p-measure zero with the σ-field
ℬ_M of all measurable subsets of M, then h_p(f) = h_p(f,ξ);

(iii) If ξ₁ ≺ ξ₂ ≺ ⋯ is an increasing sequence of finite partitions
such that σ(⋁_{i=1}^∞ ξ_i) coincides with ℬ_M then h_p(f) = lim_{i→∞} h_p(f,ξ_i).
By means of Theorem 1.4 the assertions (i) - (iii) follow
immediately from Lemmas 1.4 - 1.6, respectively, and we leave the
details to the reader.
Remark 1.9. It was P. Walters' idea to prove (i) by means of a
more general result concerning conditional entropies. This was the
main reason for the long introduction on conditional entropies at
the beginning of this section. It would be interesting to find a
direct proof of Corollary 1.2 without referring to the general
theory of conditional entropies.
B. Weiss suggested that for any good notion of entropy one
should be able to prove some kind of Shannon-McMillan-Breiman
theorem. To do this we must first introduce the information function.
From now on we shall consider the conditional information
and entropy with respect to finite partitions only.
Definition 1.5. Let ξ = {A₁, …, A_k} and ζ = {C₁, …, C_n} be
finite partitions of M and p ∈ P(M). Then the function
    (1.40)
is called the conditional information of ξ given ζ. If ζ = {∅, M} is the
trivial partition then we get the information function I_p(ξ) of ξ.
In what follows we denote H_p(ξ|ζ) = H_p(ξ|σ(ζ)). Clearly,

    (1.41)
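On a finite space, Definition 1.5 and the identity (1.41) can be computed directly. The sketch below (a toy four-point space of our own, not from the text) verifies that conditioning on the trivial partition gives H_p(ξ) and that conditioning on ξ itself gives zero:

```python
import math

def cond_information(p, xi, zeta):
    """I_p(xi | zeta)(x) = -log p(A & C)/p(C) for x in A & C, on a
    finite space given as a dict point -> probability; partitions
    are lists of sets of points."""
    out = {}
    for A in xi:
        for C in zeta:
            pc = sum(p[x] for x in C)
            pac = sum(p[x] for x in A & C)
            if pac > 0:
                val = -math.log(pac / pc)
                for x in A & C:
                    out[x] = val
    return out

def H(p, xi, zeta):
    """H_p(xi | zeta): the integral of the conditional information,
    as in (1.41)."""
    info = cond_information(p, xi, zeta)
    return sum(p[x] * v for x, v in info.items())

p = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
xi = [{0, 1}, {2, 3}]
triv = [{0, 1, 2, 3}]
assert abs(H(p, xi, triv) - math.log(2)) < 1e-12
assert abs(H(p, xi, xi)) < 1e-12  # conditioning on itself gives 0
```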
First we shall need the following result (see Martin and England
[36], Lemma 2.17 and Theorem 2.18).
Lemma 1.8. If ξ is a finite partition of M and {ζ_n} is an
increasing sequence of finite partitions then

    (1.42)

and lim_{n→∞} I_p(ξ|ζ_n) exists both p-a.s. and in the L¹(M,p) sense. By (1.41),
also lim_{n→∞} H_p(ξ|ζ_n) exists.
Proof. The second part of this assertion follows immediately
from (1.42) and the Martingale Convergence theorem (see Neveu
[37], Ch. IV.5 or Martin and England [36], Section 1.8).

To prove (1.42) put g = sup_n I_p(ξ|ζ_n) and define
G(a) = p{x : g(x) > a}. Then it is easy to see that

    ∫_M g dp = ∫_0^∞ G(a) da.    (1.43)
Furthermore,

    G(a) = Σ_i Σ_n p(A_i ∩ C⁽ⁿ⁾),

where the A_i are the elements of ξ, the C_j⁽ⁿ⁾ are the elements of ζ_n and
C⁽ⁿ⁾ = {x : if x ∈ A_i ∩ C_{j_k}⁽ᵏ⁾ for k = 1, …, n then
p(A_i ∩ C_{j_n}⁽ⁿ⁾)/p(C_{j_n}⁽ⁿ⁾) < e⁻ᵃ but p(A_i ∩ C_{j_k}⁽ᵏ⁾)/p(C_{j_k}⁽ᵏ⁾) ≥ e⁻ᵃ for all
k < n}. Since the sets C⁽ⁿ⁾ are disjoint it follows from the above that

    G(a) ≤ Σ_i min(p(A_i), e⁻ᵃ).

This together with (1.43) yields

    ∫_M g dp = ∫_0^∞ G(a) da ≤ Σ_i ∫_0^∞ min(p(A_i), e⁻ᵃ) da
    = Σ_i ( ∫_0^{−log p(A_i)} p(A_i) da + ∫_{−log p(A_i)}^∞ e⁻ᵃ da )
    = Σ_i ( −p(A_i) log p(A_i) + p(A_i) ) = H_p(ξ) + 1,

proving Lemma 1.8. ∎
Lemma 1.9. If ξ is a finite partition and p is f-invariant then

    h_p(f,ξ) = lim_{n→∞} H_p(ξ | ⋁_{j=1}^n (ʲf)⁻¹ξ)   p-a.s.    (1.44)

Proof. By the assertions (i) and (ix) of Lemma 1.2,

    H_p(⋁_{j=0}^{n-1} (ʲf)⁻¹ξ) = H_p(⋁_{j=1}^{n-1} (ʲf)⁻¹ξ) + H_p(ξ | ⋁_{j=1}^{n-1} (ʲf)⁻¹ξ) = ⋯

Dividing this by n and taking into account (1.24), Lemma 1.2(vi)
and the assertion about convergence of conditional entropies from
Lemma 1.8 one obtains (1.44). ∎
Now we are able to prove

Theorem 1.5 (Random Shannon-McMillan-Breiman theorem).
Let p ∈ P(M) be f-invariant and ergodic, i.e., Pg = g p-a.s.
implies g = const p-a.s. If ξ is a finite partition then

    lim_{n→∞} (1/n) I_p(⋁_{i=0}^{n-1} (ⁱf)⁻¹ξ) = h_p(f,ξ)   p×p-a.s.    (1.45)
Remark 1.10. This theorem remains true with the same
proof for any measurable partition ξ. If one does not assume
ergodicity of p then the same arguments lead to the assertion where
the limit in (1.45) depends on x but its integral with respect to p
equals h_p(f,ξ).
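In the simplest i.i.d. (Bernoulli) situation the statement of Theorem 1.5 can be watched numerically: along a typical trajectory, −(1/n) log p(n-cylinder) approaches the entropy. A minimal sketch (our own toy distribution, not an example from the text):

```python
import math
import random

def smb_average(probs, n, seed=1):
    """-(1/n) log p(n-cylinder) along one trajectory of an i.i.d.
    (Bernoulli) sequence of symbols with distribution `probs`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        u, acc = rng.random(), 0.0
        for q in probs:
            acc += q
            if u < acc:
                total += -math.log(q)  # contribution of this symbol
                break
    return total / n

probs = [0.5, 0.25, 0.25]
entropy = -sum(q * math.log(q) for q in probs)
# Shannon-McMillan-Breiman: the trajectory average converges a.s.
assert abs(smb_average(probs, 100_000) - entropy) < 0.02
```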
Proof of Theorem 1.5. If ξ₀, ξ₁, …, ξ_{n−1} are finite partitions
of M then it follows from Definition 1.5 that

    I_p(⋁_{i=0}^{n-1} ξ_i) = I_p(ξ₀ | ⋁_{i=1}^{n-1} ξ_i) + I_p(⋁_{i=1}^{n-1} ξ_i) =
    = I_p(ξ₀ | ⋁_{i=1}^{n-1} ξ_i) + I_p(ξ₁ | ⋁_{i=2}^{n-1} ξ_i) + ⋯ + I_p(ξ_{n−1}).    (1.46)

In particular, for ξ_i = (ⁱf(ω))⁻¹ξ, using the fact that p is f-invariant,
we have p×p-a.s.,

    I_p(⋁_{i=0}^{n-1} (ⁱf)⁻¹ξ) = I_p(ξ | ⋁_{i=1}^{n-1} (ⁱf)⁻¹ξ) + ⋯    (1.47)
    = g_{n−1} + g_{n−2} ∘ T + ⋯ + g₀ ∘ Tⁿ⁻¹

where the second identity is a consequence of

    (1.48)

for any two finite partitions of M and

    g_k(x,ω) = I_p(ξ | ⋁_{i=1}^k (ⁱf(ω))⁻¹ξ)(x),  g₀(x) = I_p(ξ)(x).    (1.49)
By Lemma 1.8, for each ω there exists p-a.s. and in the L¹(M,p)
sense

    g(x,ω) := lim_{k→∞} g_k(x,ω).    (1.50)

But g_k(x,ω) ≥ 0 and by (1.41),

    ∫ g_k(x,ω) dp(x) = H_p(ξ | ⋁_{i=1}^k (ⁱf(ω))⁻¹ξ) ≤ H_p(ξ)    (1.51)

where the last inequality follows from Lemma 1.2(vi). Hence

    ∫ g(x,ω) dp(x) = lim_{k→∞} H_p(ξ | ⋁_{i=1}^k (ⁱf(ω))⁻¹ξ) ≤ H_p(ξ)    (1.52)

and so g ∈ L¹(M×Ω, p×p). Thus we can apply the ergodic theorem
(Corollary 1.2.2) to conclude that p×p-almost surely there exists

    ḡ = lim_{n→∞} (1/n) Σ_{i=0}^{n-1} g ∘ Tⁱ    (1.53)

and

    ḡ = lim_{n→∞} ∫ H_p(ξ | ⋁_{i=1}^n (ⁱf)⁻¹ξ) dp.

Then by Lemma 1.9,

    ḡ = h_p(f,ξ).    (1.54)
From (1.47), clearly,

    |(1/n) I_p(⋁_{i=0}^{n-1} (ⁱf)⁻¹ξ) − ḡ|
    ≤ (1/n) Σ_{i=0}^{n-1} g̃_{n−1−i} ∘ Tⁱ + |(1/n) Σ_{i=0}^{n-1} g ∘ Tⁱ − ḡ|,    (1.55)

where g̃_k := |g_k − g|. Since by (1.53) the latter term converges to
zero p×p-a.s. it remains only to show that

    limsup_{n→∞} (1/n) Σ_{i=0}^n g̃_{n−i} ∘ Tⁱ = 0   p×p-a.s.    (1.56)

From (1.50)–(1.52) we know that g̃_k → 0 as k → ∞ p×p-a.s. and
in the L¹(M×Ω, p×p) sense. Consequently, if G_N = sup_{n>N} g̃_n then

    G_N ↓ 0 as N ↑ ∞   p×p-a.s.    (1.57)

and

    (1.58)

where sup_k g̃_k ∈ L¹(M×Ω, p×p) by (1.42).
For N < n,

    (1/n) Σ_{i=0}^n g̃_{n−i} ∘ Tⁱ ≤ (1/n) Σ_{i=0}^{n−N−1} G_N ∘ Tⁱ + (1/n) Σ_{i=n−N}^n G₀ ∘ Tⁱ.

Using (1.58) and Corollary 1.2.2 we conclude from here that

    limsup_{n→∞} (1/n) Σ_{i=0}^n g̃_{n−i} ∘ Tⁱ ≤ Ḡ_N = ∫ G_N d(p×p)   p×p-a.s.

since p is ergodic. This together with (1.57) and (1.58) gives (1.56)
and completes the proof of Theorem 1.5. ∎
2.2 Topological entropy.

In this section we introduce the notion of the topological
entropy for random transformations. Our exposition follows the
lines of the deterministic theory from Walters [46].

Throughout this section M will be a compact topological
space and m will be a probability measure on the space C(M,M) of
continuous maps of M into itself. We shall start with the following
definitions.
Definition 2.1. Let α and β be open covers of M.
The join α ∨ β is the open cover by all sets of the form A ∩ B
where A ∈ α, B ∈ β. By induction we define the join ⋁_{i=1}^n α_i of any
finite collection of open covers of M.

Definition 2.2. An open cover β is a refinement of an open
cover α, written α ≺ β, if every member of β is a subset of a
member of α.
Definition 2.3. If α is an open cover of M and φ ∈ C(M,M) then
φ⁻¹α is the open cover consisting of all sets φ⁻¹A where A ∈ α.
We have φ⁻¹(α∨β) = φ⁻¹(α) ∨ φ⁻¹(β), and α ≺ β implies
φ⁻¹α ≺ φ⁻¹β.

Definition 2.4. If α is an open cover of M let N(α) denote the
number of sets in a finite subcover of α with smallest cardinality.
We define the entropy of α by H(α) = log N(α).
Lemma 2.1. Let α and β be open covers of M. Then

(i) H(α) ≥ 0 and H(α) = 0 iff N(α) = 1 iff M ∈ α;
(ii) If α ≺ β then H(α) ≤ H(β);
(iii) H(α∨β) ≤ H(α) + H(β);
(iv) If φ ∈ C(M,M) then H(φ⁻¹α) ≤ H(α). If φ is also surjective
then H(φ⁻¹α) = H(α).

Proof. The assertions (i) and (ii) are obvious. To show (iii)
assume that {A₁, …, A_{N(α)}} is a subcover of α of minimal cardinality
and {B₁, …, B_{N(β)}} is a subcover of β of minimal cardinality;
then {A_i ∩ B_j , 1 ≤ i ≤ N(α), 1 ≤ j ≤ N(β)} is a subcover of α ∨ β.
Hence N(α∨β) ≤ N(α)N(β). This proves (iii). Next, if {A₁, …, A_{N(α)}}
is a subcover of α of minimal cardinality and φ ∈ C(M,M) then
{φ⁻¹A₁, …, φ⁻¹A_{N(α)}} is a subcover of φ⁻¹α and so N(φ⁻¹α) ≤ N(α).
If φ is surjective and {φ⁻¹A₁, …, φ⁻¹A_{N(φ⁻¹α)}} is a subcover of φ⁻¹α
of minimal cardinality then {A₁, …, A_{N(φ⁻¹α)}} also covers M. Hence
N(α) ≤ N(φ⁻¹α). ∎
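Definitions 2.1–2.4 can be made concrete on a finite example. The sketch below (a toy cover of our own, not from the text) computes N(α) by brute force and checks the subadditivity N(α∨β) ≤ N(α)N(β) behind Lemma 2.1(iii):

```python
from itertools import combinations

def N(cover, space):
    """Smallest cardinality of a subcover (sets given as frozensets
    whose union contains `space`); brute force, small inputs only."""
    for k in range(1, len(cover) + 1):
        for sub in combinations(cover, k):
            if set().union(*sub) >= space:
                return k
    raise ValueError("not a cover")

space = set(range(6))
alpha = [frozenset({0, 1, 2}), frozenset({2, 3}),
         frozenset({3, 4, 5}), frozenset({1, 4})]
beta = [frozenset({0, 1}), frozenset({1, 2, 3}), frozenset({4, 5})]
# the join consists of all non-empty intersections A & B
join = [a & b for a in alpha for b in beta if a & b]

assert N(alpha, space) == 2
assert N(beta, space) == 3
assert N(join, space) <= N(alpha, space) * N(beta, space)
```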
Now we can prove

Theorem 2.1. Let f₁, f₂, … be independent random maps of
M with the same distribution m on C(M,M). If α is an open cover of
M then there exists a non-random limit

    h(f,α) = lim_{n→∞} (1/n) H(⋁_{i=0}^{n-1} (ⁱf)⁻¹α)   p-a.s.    (2.1)

This limit is independent of the choice of random transformations
f₁, f₂, … and depends only on their distribution m.

Proof. Put

    b_n(ω) = H(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹α);

then by Lemma 2.1 (iii) and (iv),

    b_{n+m}(ω) = H(⋁_{i=0}^{n+m-1} (ⁱf(ω))⁻¹α) ≤ b_n(ω) + b_m(ϑⁿω).    (2.2)

But b₁(ω) = log N(α) and so (2.2) enables us to employ
Theorem I.2.2, which yields (2.1). Since

    (1/n) ∫ H(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹α) dp    (2.3)
    = (1/n) ∫ ⋯ ∫ H(α ∨ ⋁_{i=1}^{n-1} (f_i ∘ ⋯ ∘ f₁)⁻¹α) dm(f₁) ⋯ dm(f_{n-1})

converges to the same limit, it follows that h(f,α) depends
only on m but not on the choice of f₁, f₂, …. ∎
Remark 2.1. (i) h(f,α) ≥ 0;
(ii) h(f,α) ≤ h(f,β) provided α ≺ β;
(iii) h(f,α) ≤ H(α) since

    H(⋁_{i=0}^{n-1} (ⁱf)⁻¹α) ≤ Σ_{i=0}^{n-1} H((ⁱf)⁻¹α) ≤ nH(α).

Definition 2.5. The number h(f) = sup_α h(f,α), where α ranges
over all open covers of M, is called the topological entropy of any
random transformation f having the distribution m.

Remark 2.2. (i) In the definition of h(f) one can take the
supremum over finite open covers of M;
(ii) h(id) = 0.
Lemma 2.2. The topological entropy has the following
properties:

(i) if φ is a non-random homeomorphism of M then
h(f) = h(φfφ⁻¹);
(ii) if p-almost surely f(ω) is a homeomorphism then
h(f) = h(f⁻¹).

Proof. By Lemma 2.1 (iv),

    h(φfφ⁻¹,α) = lim_{n→∞} (1/n) H(⋁_{i=0}^{n-1} (φⁱfφ⁻¹)⁻¹α)    (2.4)
    = lim_{n→∞} (1/n) H(⋁_{i=0}^{n-1} (ⁱf)⁻¹φ⁻¹α) = h(f,φ⁻¹α).

If α ranges over all open covers then φ⁻¹α also ranges over all open
covers since φ is a homeomorphism. Thus (i) follows.

By Lemma 2.1 (iv) and Theorem 2.1,

    h(f,α) = lim_{n→∞} (1/n) ∫ H(⋁_{i=0}^{n-1} (ⁱf)⁻¹α) dp    (2.5)
    = lim_{n→∞} (1/n) ∫ H(ⁿ⁻¹f(⋁_{i=0}^{n-1} (ⁱf)⁻¹α)) dp
    = lim_{n→∞} (1/n) ∫ ⋯ ∫ H(α ∨ ⋁_{i=1}^{n-1} f_i ∘ ⋯ ∘ f₁α) dm(f₁) ⋯ dm(f_{n-1})
    = lim_{n→∞} (1/n) ∫ H(α ∨ ⋁_{i=1}^{n-1} ((f₁)⁻¹ ∘ ⋯ ∘ (f_i)⁻¹)⁻¹α) dp
    = h(f⁻¹,α)

which implies (ii). ∎
Next we shall give another definition of the topological entropy
which is often more convenient, especially for calculations.

In the remaining part of this section we shall assume that M is
a compact metric space with a metric d. We shall use the metrics
d_n^ω(x,y) = max_{0≤k≤n-1} d(ᵏf(ω)x, ᵏf(ω)y) introduced in Section 1.3.

Definition 2.6. A subset F ⊂ M is said to (ω,n,ε)-span M if
for any x ∈ M there is y ∈ F with d_n^ω(x,y) ≤ ε. By r^ω(n,ε) we
denote the smallest cardinality of any (ω,n,ε)-spanning set.

Definition 2.7. A subset E ⊂ M is said to be (ω,n,ε)-separated
if x,y ∈ E, x ≠ y implies d_n^ω(x,y) > ε. By s^ω(n,ε) we denote the
largest cardinality of any (ω,n,ε)-separated subset of M.
We shall need

Lemma 2.3. (Lebesgue Covering Lemma) If (M,d) is a compact
metric space and α is an open cover of M then there exists
δ > 0 such that each subset of M of diameter less than or equal to δ lies
in some member of α. (Such a δ is called a Lebesgue number for α.)

Proof. Let α = {A₁, …, A_k}. Assume that the statement is
false. Then there exists a sequence x_n such that none of the balls
B(x_n, 1/n) = {y : d(y,x_n) ≤ 1/n} is contained in an element of α.
Taking a converging subsequence x_{n_i} → x we conclude that no
neighborhood of x can be contained in an element of α. This is a
contradiction since α is an open cover. ∎
Lemma 2.4. (i) If α is an open cover of M with Lebesgue
number δ then

    H(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹α) ≤ log r^ω(n, δ/2) ≤ log s^ω(n, δ/2);    (2.6)

(ii) If ε > 0 and β is an open cover with
diam β := sup_{B∈β} diam B ≤ ε then

    log r^ω(n,ε) ≤ log s^ω(n,ε) ≤ H(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹β).    (2.7)

Proof. Since any (ω,n,ε)-separated set of maximal cardinality
is an (ω,n,ε)-spanning set, r^ω(n,ε) ≤ s^ω(n,ε) for all ε > 0.
To prove (i) assume that F is an (ω,n,δ/2)-spanning set of cardinality
r^ω(n,δ/2). Then M = ⋃_{x∈F} ⋂_{i=0}^{n-1} (ⁱf(ω))⁻¹B(ⁱf(ω)x, δ/2) where, again,
B(y,ε) = {z : d(z,y) ≤ ε} is the ε-ball centered at y. Since each
B(ⁱf(ω)x, δ/2) is a subset of a member of α then
r^ω(n,δ/2) ≥ N(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹α) where N(·) was introduced in
Definition 2.4. This implies (i).

To get (ii) let E be an (ω,n,ε)-separated set of cardinality
s^ω(n,ε). No member of the cover ⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹β can contain two
elements of E, so

    s^ω(n,ε) ≤ N(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹β)

proving (ii). ∎
Lemma 2.5. Let {α_n} be a sequence of open covers of M with
diam(α_n) → 0 as n → ∞. Then

    lim_{n→∞} h(f,α_n) = h(f)  provided h(f) < ∞    (2.8)

and

    lim_{n→∞} h(f,α_n) = ∞  provided h(f) = ∞.    (2.9)

Proof. Suppose h(f) < ∞. Given ε > 0 choose an open cover β
with h(f,β) > h(f) − ε. Let δ be a Lebesgue number for β. Take n₀ so
that n ≥ n₀ implies diam(α_n) < δ. Then β ≺ α_n and so
h(f,β) ≤ h(f,α_n) by Remark 2.1(ii). Hence n ≥ n₀ implies
h(f) ≥ h(f,α_n) > h(f) − ε, yielding (2.8). If h(f) = ∞ then for any a > 0
there exists an open cover γ with h(f,γ) > a. Now the same argument
as above shows that lim_{n→∞} h(f,α_n) = ∞. ∎
Finally, we can obtain more direct formulas for the topological
entropy.

Theorem 2.2. Let f be a continuous random map of a compact metric
space (M,d). Then p-almost surely

    h(f) = lim_{ε→0} limsup_{n→∞} (1/n) log r^ω(n,ε)    (2.10)
         = lim_{ε→0} liminf_{n→∞} (1/n) log r^ω(n,ε)
         = lim_{ε→0} limsup_{n→∞} (1/n) log s^ω(n,ε)
         = lim_{ε→0} liminf_{n→∞} (1/n) log s^ω(n,ε).
Proof. Let α_ε be the cover of M by all open balls of radius 2ε
and let β_ε be any cover of M by open balls of radius ε/2; then by
Lemma 2.4,

    H(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹α_ε) ≤ log r^ω(n,ε) ≤ log s^ω(n,ε) ≤ H(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹β_ε).

Hence

    h(f,α_ε) ≤ liminf_{n→∞} (1/n) log r^ω(n,ε) ≤ liminf_{n→∞} (1/n) log s^ω(n,ε) ≤ h(f,β_ε)

and

    h(f,α_ε) ≤ limsup_{n→∞} (1/n) log r^ω(n,ε) ≤ limsup_{n→∞} (1/n) log s^ω(n,ε) ≤ h(f,β_ε).

Taking ε_m → 0 along any subsequence and applying Lemma 2.5 we
obtain (2.10). ∎

Remark 2.3. If m is concentrated on isometries of the metric
space (M,d) then the numbers r^ω(n,ε) and s^ω(n,ε) do not increase
in n and so in this case h(f) = 0.
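Theorem 2.2 suggests a direct numerical experiment. The sketch below (an illustrative assumption of our own: the deterministic doubling map, i.e. m concentrated at a single map) builds greedy (n,ε)-separated sets and recovers the growth rate log 2; for isometries, as in Remark 2.3, the counts would not grow at all:

```python
import math

def circ(a):
    """Distance on the circle R/Z."""
    a %= 1.0
    return min(a, 1.0 - a)

def dn(x, y, n):
    """Metric d_n for the doubling map Tx = 2x (mod 1): maximal
    distance between the first n points of the two orbits."""
    return max(circ((2 ** i) * (x - y)) for i in range(n))

def separated_count(n, eps, step=2e-4):
    """Cardinality of a greedily built (n, eps)-separated set,
    scanning the circle left to right on a fine grid."""
    pts = []
    x = 0.0
    while x < 1.0:
        # nearest previously accepted point is checked first
        if all(dn(x, y, n) > eps for y in reversed(pts)):
            pts.append(x)
        x += step
    return len(pts)

s6, s7 = separated_count(6, 0.2), separated_count(7, 0.2)
rate = math.log(s7 / s6)  # growth rate of s(n, eps) in n
assert abs(rate - math.log(2)) < 0.1  # h_top(doubling map) = log 2
```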
Employing Theorem 2.2 we can get another property of the
topological entropy.
Lemma 2.6. For any integer n > 0,

    h(ⁿf) = n h(f)

where ⁿf = f_n ∘ ⋯ ∘ f₁ and f₁, …, f_n are i.i.d. random
transformations.

Proof. Denote by r^ω(f,n,ε) the number introduced in
Definition 2.6, but now we do not assume that f is fixed, i.e. we shall
consider r^ω(ⁿf,k,ε) as well. Then it is easy to see that
r^ω(ⁿf,k,ε) ≤ r^ω(f,kn,ε). Dividing this by kn and letting k → ∞ we
obtain by Theorem 2.2 that h(ⁿf) ≤ n h(f).

On the other hand, for fixed n and ω, given ε > 0 there is δ > 0 such
that d(x,y) < δ implies max_{0≤j≤n-1} d(ʲfx, ʲfy) < ε since all the
transformations f_i(ω) are uniformly continuous on the compact space M.
Hence an (ω,k,δ)-spanning set for ⁿf is also an (ω,kn,ε)-spanning
set for f and so r^ω(ⁿf,k,δ) ≥ r^ω(f,kn,ε). Dividing this by kn and
letting k → ∞ we get by Theorem 2.2 that h(ⁿf) ≥ n h(f),
completing the proof. ∎
Theorem 2.2 enables us to estimate the topological entropy in
specific situations.

Theorem 2.3. Suppose that m is concentrated on homeomorphisms
of the unit circle S¹ such that there exists ε₀ > 0 with

    d(x,y) < ε₀ implies d(f⁻¹x, f⁻¹y) < 1/2    (2.11)

for m-almost all f. Then h(f) = 0.

Proof. We shall consider (ω,k,ε)-spanning sets for ε < ε₀.
Clearly, r^ω(1,ε) ≤ [1/ε] + 1, where [1/ε] denotes the integer part of
1/ε. We shall see that r^ω(n,ε) ≤ n([1/ε] + 1).

Indeed, suppose that F is an (ω,n−1,ε)-spanning set and E is a
minimal collection of points such that the distance between any
two neighbors is less than ε. The cardinality of E is at most
[1/ε] + 1. Then we claim that F' = F ∪ (ⁿ⁻¹f(ω))⁻¹E is an (ω,n,ε)-
spanning set. To prove this take an arbitrary x ∈ S¹. Then there is
y ∈ F with d^ω_{n−1}(x,y) ≤ ε. We must find z ∈ F' such that
d^ω_n(x,z) ≤ ε.

If d(ⁿ⁻¹f(ω)x, ⁿ⁻¹f(ω)y) ≤ ε then we can take z = y. If this is
not true choose an interval I₁ with end points ⁿ⁻¹f(ω)x and
ⁿ⁻¹f(ω)y which is mapped by f_{n−1}⁻¹(ω) to the interval I₂ with end
points ⁿ⁻²f(ω)x and ⁿ⁻²f(ω)y whose length is less than or equal to ε.
Pick a point z ∈ F' with ⁿ⁻¹f(ω)z ∈ I₁ and
d(ⁿ⁻¹f(ω)z, ⁿ⁻¹f(ω)x) ≤ ε. Then it follows from (2.11) that I₂ is
mapped by f_{n−2}⁻¹(ω) to an interval I₃ with end points ⁿ⁻³f(ω)x and
ⁿ⁻³f(ω)y whose length is less than 1/2. Since d^ω_{n−1}(x,y) ≤ ε we see
that this length must be less than or equal to ε. By induction we
conclude that ⁿ⁻ⁱf(ω)z ∈ I_i for all i = 1, …, n where I_i is the interval
with end points ⁿ⁻ⁱf(ω)x and ⁿ⁻ⁱf(ω)y whose length does not
exceed ε. Hence d^ω_n(x,z) ≤ ε and so F' is an (ω,n,ε)-spanning set.

Thus r^ω(n,ε) − r^ω(n−1,ε) ≤ [1/ε] + 1 and so, by induction,
r^ω(n,ε) ≤ n([1/ε] + 1). This together with (2.10) yields h(f) = 0. ∎
As another application of Theorem 2.2 we shall obtain an upper
bound for the topological entropy of smooth random transformations.
Suppose that M is a smooth ν-dimensional compact Riemannian
manifold. Consider a probability measure m on the space of
smooth maps of M into itself and the corresponding sequence of
independent m-distributed random transformations f₁, f₂, ….

Theorem 2.4. Let f be an m-distributed random smooth map
of a compact Riemannian manifold M. Then

    h(f) ≤ ν ∫ log a(ω) dp(ω)    (2.12)

where a(ω) = max(1, sup_{x∈M} ‖Df₁(ω)‖_x) and the norm of the
differential Df was defined by (I.3.15).

Proof. If ∫ log a(ω) dp(ω) = ∞ there is nothing to prove, so
assume ∫ log a(ω) dp(ω) < ∞. By the mean value theorem

    d(ⁿf(ω)x, ⁿf(ω)y) ≤ a(ω)a(ϑω) ⋯ a(ϑⁿ⁻¹ω) d(x,y)    (2.13)

where ϑ is the shift operator satisfying (1.2.3). It is easy to see that
there exists a constant K > 0 such that for any δ > 0 one can
choose a set E(δ) of at most Kδ^{−ν} points such that any point of M
lies in a ball of radius δ centered at some point from E(δ). Then,
clearly, E(δ) is an (ω,n,a(ω)a(ϑω) ⋯ a(ϑⁿ⁻²ω)δ)-spanning set.
Given ε > 0 put δ = ε(a(ω)a(ϑω) ⋯ a(ϑⁿ⁻²ω))⁻¹; then

    r^ω(n,ε) ≤ Kε^{−ν}(a(ω)a(ϑω) ⋯ a(ϑⁿ⁻²ω))^ν.    (2.14)

Now by (2.10) and the Ergodic theorem (Corollary 1.2.2),

    h(f) = lim_{ε→0} limsup_{n→∞} (1/n) log r^ω(n,ε)
    ≤ ν limsup_{n→∞} (1/n) Σ_{i=0}^{n-2} log a(ϑⁱω) = ν ∫ log a(ω) dp(ω),

proving (2.12). ∎

Remark 2.4. In Chapter V we shall see that in the case of stochastic
flows generated by stochastic differential equations with
sufficiently smooth coefficients the right hand side of (2.12) is always
finite, and so the topological entropy in this case is finite as well.
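The right hand side of (2.12) is easy to evaluate by Monte Carlo. A hedged numerical sketch (the family of maps is our own illustrative assumption, not from the text): random expanding circle maps f_k(x) = kx (mod 1) with k drawn uniformly from {2, 3}, so that ν = 1, sup_x |Df_k| = k, and the bound reads h(f) ≤ E[log k]:

```python
import math
import random

# hypothetical example: f_k(x) = k x (mod 1), k uniform on {2, 3};
# then a(w) = k_1(w) and (2.12) becomes h(f) <= E[log k]
rng = random.Random(0)
n = 100_000
bound = sum(math.log(rng.choice([2, 3])) for _ in range(n)) / n
assert abs(bound - (math.log(2) + math.log(3)) / 2) < 0.01
```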
Usually the calculation of the topological entropy h(f) is easier
than that of the metric entropy h_p(f), so any relation between these two
entropies is helpful.

Theorem 2.5. Suppose that m is a probability measure
concentrated on continuous maps of a compact metric space M
considered with its Borel measurable structure. If f is an m-distributed
random transformation and p ∈ P(M) is P*-invariant then

    h_p(f) ≤ h(f).    (2.15)
Proof. Let ξ = {A₁, …, A_k} be a finite partition of M. Choose
ε > 0 so that ε < (k log k)⁻¹. Since p is a probability measure on a
metric space it is regular (see, for instance, Walters [46],
Theorem 6.1) and so there exist compact sets B_j ⊂ A_j, 1 ≤ j ≤ k,
with p(A_j \ B_j) < ε. Let ζ be the partition ζ = {B₀, B₁, …, B_k}
where B₀ = M \ ⋃_{j=1}^k B_j. We have p(B₀) < kε and, by Corollary 1.1,

    H_p(ξ|ζ) = −Σ_{j=1}^k p(B₀∩A_j) log (p(B₀∩A_j)/p(B₀))
    ≤ p(B₀) log k < kε log k < 1.    (2.16)

Notice that B₀ ∪ B_i = M \ ⋃_{j≠i} B_j is an open set provided i ≠ 0,
and so β = {B₀∪B₁, …, B₀∪B_k} is an open cover of M. By
Corollary 1.1 for any n ≥ 1,

    H_p(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ζ) ≤ log N(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ζ)    (2.17)

where N(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ζ) denotes the number of non-empty sets in
the partition ⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ζ. It is not hard to see that

    N(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ζ) ≤ 2ⁿ N(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹β)    (2.18)

where the number N(α) was introduced in Definition 2.4. Now
(2.17) and (2.18) yield

    (1/n) H_p(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ζ) ≤ log 2 + (1/n) H(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹β).

Hence

    h_p(f,ζ) ≤ log 2 + h(f,β).    (2.19)
But

    h_p(f,ξ) ≤ h_p(f,ζ) + H_p(ξ|ζ).    (2.20)

Indeed, by Lemma 1.2 (iv) and (ii),

    H_p(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ξ)    (2.21)
    ≤ H_p((⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ξ) ∨ (⋁_{j=0}^{n-1} (ʲf(ω))⁻¹ζ))
    = H_p(⋁_{j=0}^{n-1} (ʲf(ω))⁻¹ζ) + H_p(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ξ | ⋁_{j=0}^{n-1} (ʲf(ω))⁻¹ζ).

But by Lemma 1.2 (vii), (v) and (ix),

    H_p(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ξ | ⋁_{j=0}^{n-1} (ʲf(ω))⁻¹ζ)    (2.22)
    ≤ Σ_{i=0}^{n-1} H_p((ⁱf(ω))⁻¹ξ | (ⁱf(ω))⁻¹ζ) = nH_p(ξ|ζ).

Now (2.21) and (2.22) give

    H_p(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ξ) ≤ H_p(⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹ζ) + nH_p(ξ|ζ).    (2.23)

Dividing both parts of (2.23) by n and letting n → ∞ we
obtain (2.20). Now (2.16), (2.19) and (2.20) give

    h_p(f,ξ) ≤ h(f,β) + log 2 + 1.

Since ξ is an arbitrary finite partition,

    h_p(f) ≤ h(f) + log 2 + 1.    (2.24)

By Corollary 1.2 (i) and Lemma 2.6, h_p(ⁿf) = nh_p(f) and
h(ⁿf) = nh(f). Hence, applying (2.24) to ⁿf in place of f one has
n h_p(f) ≤ n h(f) + log 2 + 1. Dividing this by n and letting n → ∞ we
obtain (2.15). ∎
Remark 2.5. In general, the supremum of the left hand side
of (2.15) over all P*-invariant measures p may be less than the
right hand side. To get the equality (the variational principle) one
has to consider not only entropies h_μ^{M×ℬ_Ω}(T) with μ a product
measure p×p, but μ should be allowed to vary among all T-invariant
measures on M×Ω whose projection on Ω coincides with p (see
Ledrappier and Walters [32]).

Remark 2.6. The inequality (2.15) is useful for the evaluation of
metric entropies. In particular, under the conditions of Theorems
2.3 and 2.4 we obtain finite upper bounds for metric entropies. This
enables us to construct the examples mentioned in
Remark 1.6, i.e., when h_p(f) is finite while the entropy of the
corresponding Markov chain X_n is infinite. To do this take, for
instance, m concentrated on the rotations f_φ x = e^{iφ}x, φ ∈ I = [0,1], of the
unit circle S¹ and satisfying m(Φ) = mes Φ for any measurable subset
Φ ⊂ I where, recall, mes denotes the Lebesgue measure. If
x = e^{iφ₀} ∈ S¹ then

    P(x,Γ) = mes(e^{−iφ₀}Γ) = mes Γ.

Thus we can employ Theorem 1.1 to conclude that the entropy of
the corresponding Markov chain is infinite, where p = mes. On the
other hand, by Theorem 2.3, in our circumstances h(f) = 0 and so
by Theorem 2.5, h_p(f) = 0.
2.3 Topological pressure.

In this section we shall introduce the notion of topological
pressure for random transformations. A comprehensive exposition
in the deterministic case can be found in Walters [46], Chapter 9.

Suppose that (M,d) is a compact metric space, C(M) is the
space of real-valued continuous functions on M and m is a probability
measure on the space C(M,M) of continuous maps of M into itself.
Again, we consider independent m-distributed random transformations
f₁, f₂, …. For g ∈ C(M) and n ≥ 1 we denote
S_n^ω g(x) = Σ_{i=0}^{n-1} g(ⁱf(ω)x) where, recall, ⁱf = f_i ∘ ⋯ ∘ f₁.

Definition 3.1. If g ∈ C(M), n ≥ 1 and α is an open cover of M
put

    π_n^ω(f,g,α) = inf { Σ_{B∈γ} sup_{x∈B} e^{S_n^ω g(x)} : γ is a finite subcover of ⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹α }.
Theorem 3.1. If g ∈ C(M) and α is an open cover of M then
p-almost surely there exists a non-random limit

    π(f,g,α) = lim_{n→∞} (1/n) log π_n^ω(f,g,α).    (3.1)

Proof. If γ is a finite subcover of ⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹α and η is a
finite subcover of ⋁_{i=n}^{n+k-1} (ⁱf(ω))⁻¹α = (ⁿf(ω))⁻¹ ⋁_{i=0}^{k-1} (ⁱf(ϑⁿω))⁻¹α then γ ∨ η is a
finite subcover of ⋁_{i=0}^{n+k-1} (ⁱf(ω))⁻¹α. Therefore

    π_{n+k}^ω(f,g,α) ≤ π_n^ω(f,g,α) π_k^{ϑⁿω}(f,g,α)

and so

    c_{n+k}(ω) ≤ c_n(ω) + c_k(ϑⁿω).    (3.2)

Thus c_n(ω) = log π_n^ω(f,g,α) satisfies the subadditivity condition of
Theorem 1.2.2. Since |c₁(ω)| ≤ sup_{x∈M} |g(x)| + log N(α), where,
recall, N(α) is the smallest possible cardinality of a finite subcover
of α, the integrability condition of Theorem 1.2.2 is also satisfied.
This yields the assertion of Theorem 3.1. ∎
Definition 3.2. The topological pressure π(f,g) of an m-distributed
random transformation f for a function g ∈ C(M) is
given by the formula π(f,g) = lim_{δ→0} sup{π(f,g,α) : α is an open cover
of M with diam(α) ≤ δ}.

Remark 3.1. From Definition 2.4, Theorem 2.1 and Lemma
2.5 it is easy to see that π(f,0,α) = h(f,α) and π(f,0) = h(f), i.e., the
topological entropy is the special case of the topological pressure
for g ≡ 0.

Next we shall give definitions of the topological pressure by
means of spanning and separated sets.

Definition 3.3. For g ∈ C(M) put

    Q_n^ω(f,g,ε) = inf { Σ_{x∈F} e^{S_n^ω g(x)} : F is an (ω,n,ε)-spanning set for M };    (3.3)
    R_n^ω(f,g,ε) = sup { Σ_{x∈E} e^{S_n^ω g(x)} : E is an (ω,n,ε)-separated set };    (3.4)

    Q^ω(f,g,ε) = limsup_{n→∞} (1/n) log Q_n^ω(f,g,ε);    (3.5)

    R^ω(f,g,ε) = limsup_{n→∞} (1/n) log R_n^ω(f,g,ε).    (3.6)
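Definition 3.3 can be checked by hand in the simplest deterministic situation (our own illustrative assumption, not from the text): m concentrated at the one-sided full shift on two symbols, with a potential depending only on the first symbol. There, one point per n-cylinder gives a separated set, the sums factor exactly, and the classical value π(f,g) = log(e^{g₀} + e^{g₁}) is recovered:

```python
import math
from itertools import product

def pressure_estimate(g0, g1, n):
    """(1/n) log of the sum over n-cylinders of exp(S_n g) for the
    full 2-shift, with a potential depending only on the first
    symbol of the point."""
    total = sum(math.exp(sum((g0, g1)[s] for s in word))
                for word in product((0, 1), repeat=n))
    return math.log(total) / n

# with g = 0 this is the topological entropy log 2 (Remark 3.1)
assert abs(pressure_estimate(0.0, 0.0, 10) - math.log(2)) < 1e-9
assert abs(pressure_estimate(0.5, -0.3, 10)
           - math.log(math.exp(0.5) + math.exp(-0.3))) < 1e-9
```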
Theorem 3.2. If g ∈ C(M) then

    π(f,g) = lim_{ε→0} R^ω(f,g,ε) = lim_{ε→0} Q^ω(f,g,ε)   p-a.s.    (3.7)

Proof. First, notice that Q^ω(f,g,ε) and R^ω(f,g,ε) do not
decrease as ε decreases, so the limits in (3.7) exist. Next, it is easy to
see that

    Q_n^ω(f,g,ε) ≤ R_n^ω(f,g,ε).    (3.8)

Since M is compact and g is continuous, for any ε > 0 there
exists δ(ε) > 0, with δ(ε) → 0 as ε → 0, such that d(x,y) ≤ ε/2 implies
|g(x) − g(y)| < δ(ε). Then

    R_n^ω(f,g,ε) ≤ e^{nδ(ε)} Q_n^ω(f,g,ε/2).    (3.9)

Indeed, let E be an (ω,n,ε)-separated set and F an (ω,n,ε/2)-
spanning set. Define φ: E → F by choosing, for each x ∈ E, some
point φ(x) ∈ F with d_n^ω(x,φ(x)) ≤ ε/2 (using the metrics d_n^ω
introduced in Section 1.3). Then φ is injective and

    Σ_{y∈F} e^{S_n^ω g(y)} ≥ Σ_{y∈φE} e^{S_n^ω g(y)}    (3.10)
    ≥ (min_{x∈E} e^{S_n^ω g(φx) − S_n^ω g(x)}) Σ_{x∈E} e^{S_n^ω g(x)}
    ≥ e^{−nδ(ε)} Σ_{x∈E} e^{S_n^ω g(x)}

proving (3.9). From the definitions (3.5), (3.6) and the relations
(3.8), (3.9) we obtain

    δ(ε) + Q^ω(f,g,ε/2) ≥ R^ω(f,g,ε) ≥ Q^ω(f,g,ε).    (3.11)

When ε → 0 the number δ(ε) becomes arbitrarily small and so
the two limits in (3.7) are equal.
It remains to show that

    π(f,g) = lim_{ε→0} R^ω(f,g,ε).    (3.12)

If a > 0 and γ is an open cover with diam(γ) ≤ a then

    R_n^ω(f,g,a) ≤ π_n^ω(f,g,γ).    (3.13)

Indeed, if E is an (ω,n,a)-separated set then no member of
⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹γ contains two elements of E. Hence

    Σ_{x∈E} e^{S_n^ω g(x)} ≤ π_n^ω(f,g,γ)

proving (3.13). Now (3.13) together with (3.1), (3.6) and Definition
3.2 yield

    lim_{ε→0} R^ω(f,g,ε) ≤ π(f,g).    (3.14)
On the other hand, we shall show that

    π_n^ω(f,g,α) ≤ e^{nτ_α} Q_n^ω(f,g,δ/2)    (3.15)

where δ is a Lebesgue number for the cover α and
τ_α = sup{|g(x) − g(y)| : d(x,y) ≤ diam(α)}. To prove this define

    q_n^ω(f,g,α) = inf { Σ_{B∈γ} inf_{x∈B} e^{S_n^ω g(x)} : γ is a finite subcover of ⋁_{i=0}^{n-1} (ⁱf(ω))⁻¹α }.

It is easy to see that

    π_n^ω(f,g,α) ≤ e^{nτ_α} q_n^ω(f,g,α).    (3.16)

Next let δ be a Lebesgue number for α. If F is an (ω,n,δ/2)-
spanning set then

    M = ⋃_{x∈F} ⋂_{i=0}^{n-1} (ⁱf(ω))⁻¹B(ⁱf(ω)x, δ/2).

Since each closed ball B(ⁱf(ω)x, δ/2) is a subset of a member of α
we have

    q_n^ω(f,g,α) ≤ Σ_{x∈F} e^{S_n^ω g(x)}

and so q_n^ω(f,g,α) ≤ Q_n^ω(f,g,δ/2). Taking into account (3.16) this
gives (3.15), and by (3.8),

    π_n^ω(f,g,α) ≤ e^{nτ_α} R_n^ω(f,g,δ/2).    (3.17)

But (3.17) implies

    π(f,g,α) ≤ τ_α + R^ω(f,g,δ/2).    (3.18)

When diam(α) → 0 both τ_α and δ, the Lebesgue number of α, tend
to zero. Thus we conclude from (3.18) and Definition 3.2 that

    π(f,g) ≤ lim_{δ→0} R^ω(f,g,δ).    (3.19)

This together with (3.14) yields (3.12) and completes the proof of
Theorem 3.2. ∎
Chapter III

Random bundle maps.

In this chapter we shall prove a kind of multiplicative
ergodic theorem which describes the growth rates of the norms of
vectors under the action of compositions of independent random
bundle maps.

3.1 Oseledec's theorem and the "non-random" multiplicative
ergodic theorem

In this section we shall formulate Oseledec's multiplicative
ergodic theorem using the language of random bundle maps. Next
we shall compare it with the "non-random" multiplicative ergodic
theorem (Theorem 1.2) which will be proved in the remaining part
of this chapter.
First, we shall need some definitions. Let E be the direct
product M × ℝᵐ of a space M possessing some measurable structure
and the m-dimensional Euclidean space ℝᵐ. If f is a measurable
map of M into itself then a pair F = (f, 𝒥_F) is called a vector bundle
map over f if F maps E into itself by the formula

    F(x,ξ) = (fx, 𝒥_F(x)ξ);  x ∈ M, ξ ∈ ℝᵐ

where 𝒥_F(x) is a real matrix-valued measurable function of x.

The space of all vector bundle maps will be denoted by 𝒱. We
shall assume that 𝒱 is endowed with a measurable structure such
that the map 𝒱 × E → E acting by the formula (F,u) → Fu, u ∈ E, is
measurable with respect to the product measurable structure in
𝒱 × E. If n is a probability measure on 𝒱 then a 𝒱-valued random
variable F with the distribution n will be called a random bundle
map. By definition, any random bundle map F considered as a
pair (f, 𝒥_F) generates a random transformation f on the base M. We
shall keep the notation m for the distribution of f on the space 𝔐 of
measurable maps of M into itself.
In what follows, we shall consider a sequence F₁, F₂, … of
independent random bundle maps with the same distribution n.
Clearly, the corresponding random transformations f₁, f₂, …
acting on the base M will be independent as well. We shall keep the
notations for the compositions

    ⁿF = F_n ∘ ⋯ ∘ F₁,  ⁿf = f_n ∘ ⋯ ∘ f₁.

Throughout this chapter the probability space (Ω,p) will be
identified with the infinite product (𝒱^∞, n^∞) of the copies of (𝒱,n),
i.e., the points of Ω are the sequences ω = (F₁, F₂, …), F_i ∈ 𝒱, and
the measure p is generated by the finite dimensional probabilities.
We shall also employ the shift operator ϑ on Ω satisfying

    ϑ(F₁, F₂, F₃, …) = (F₂, F₃, …).

Next, we define the skew product transformations

    τ(u,ω) = (F₁(ω)u, ϑω),  T(x,ω) = (f₁(ω)x, ϑω)

acting on E × Ω and M × Ω, respectively.
Let π be the natural projection of E = M × ℝ^m on M. Then the equality F = (f, 𝒥_F) means that f = πFπ^(-1) and so

ⁿf = π ⁿF π^(-1).   (1.3)

As in Chapter I consider the Markov chain x_n = f_n ∘ ⋯ ∘ f_1 x_0 = π F_n ∘ ⋯ ∘ F_1 π^(-1) x_0 whose transition probability P(x, ·) is given by (1.2.6). Recall that a measure η ∈ 𝒫(M) is called P*-invariant if P*η = η, where the transition operator P and its adjoint P* are defined in Section 1.2. We shall also use the following notations:

𝒥(x, ω) = 𝒥_{F_1(ω)}(x)

and

ⁿ𝒥(x, ω) = 𝒥 ∘ T^(n-1)(x, ω) ⋯ 𝒥 ∘ T(x, ω) · 𝒥(x, ω).   (1.4)
Now we are able to formulate a "random" version of Oseledec's multiplicative ergodic theorem.

Theorem 1.1. Let F_1, F_2, … be a sequence of 𝒰-valued independent random variables with the common distribution n. Suppose that n and a P*-invariant measure η ∈ 𝒫(M) satisfy the following condition:

∫∫ log⁺ ‖𝒥_F(x)‖ dη(x) dn(F) < ∞,

with ‖·‖ denoting a matrix or a vector norm in ℝ^m. Then for η × P-almost all (x, ω) there exist a sequence of linear subspaces

0 ⊂ V^s_(x,ω) ⊂ ⋯ ⊂ V^1_(x,ω) ⊂ V^0_(x,ω) = ℝ^m   (1.5)

and a sequence of values

−∞ ≤ a_s(x,ω) < ⋯ < a_1(x,ω) < a_0(x,ω) < ∞   (1.6)

such that

lim_{n→∞} (1/n) log ‖ⁿ𝒥(x,ω)‖ = a_0(x,ω)   (1.7)

and if ξ ∈ V^i_(x,ω) \ V^(i+1)_(x,ω), where V^i_(x,ω) ≡ 0 for all i > s, then

lim_{n→∞} (1/n) log ‖ⁿ𝒥(x,ω)ξ‖ = a_i(x,ω).   (1.8)

The functions s = s(x,ω), m_i(x,ω) ≡ dim V^i_(x,ω) − dim V^(i+1)_(x,ω) and a_i = a_i(x,ω), i = 0, …, s(x,ω), are T-invariant, i.e.

s(T(x,ω)) = s(x,ω),  m_i(T(x,ω)) = m_i(x,ω),  a_i(T(x,ω)) = a_i(x,ω)   (1.9)

for all i = 0, …, s.
The subspaces V^i_(x,ω) measurably depend on (x, ω) and satisfy

𝒥(x,ω) V^i_(x,ω) ⊂ V^i_{T(x,ω)}.   (1.10)

If η has an ergodic decomposition in the sense of Corollary 1.2.1 then the functions s, m_i and a_i are independent of ω. If η is ergodic then s, m_i and a_i are constants. The numbers a_i(x,ω) are called characteristic exponents at (x,ω) and m_i(x,ω) are their multiplicities. Furthermore, if 𝒥^(∧k)(x,ω) is the k-th exterior power of 𝒥(x,ω) then (1/n) log ‖𝒥^(∧k) ∘ T^(n-1)(x,ω) ⋯ 𝒥^(∧k)(x,ω)‖ converges as n → ∞ to the sum of the k biggest characteristic exponents counted with their multiplicities. In particular,

lim_{n→∞} (1/n) log |det(𝒥 ∘ T^(n-1)(x,ω) ⋯ 𝒥(x,ω))| = Σ_{i=0}^{s(x,ω)} m_i(x,ω) a_i(x,ω)   η × P-a.s.   (1.11)
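The identity (1.11) reflects the multiplicativity of the determinant: the log-determinant of a product of matrices is the sum of the individual log-determinants, so its time average is an ergodic average. A quick numerical sanity check (the matrix distribution is an assumption chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative i.i.d. 2x2 upper triangular matrices with positive determinant.
mats = [np.array([[1.0 + rng.random(), rng.random()],
                  [0.0, 0.5 + rng.random()]]) for _ in range(200)]

prod = np.eye(2)
for A in mats:
    prod = A @ prod

n = len(mats)
lhs = np.log(abs(np.linalg.det(prod))) / n                      # (1/n) log|det of the product|
rhs = np.mean([np.log(abs(np.linalg.det(A))) for A in mats])    # average of log|det A_i|
# The two agree exactly (up to rounding); for n -> infinity the right hand side
# converges to the sum of the exponents counted with multiplicities.
```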
The filtration (1.5) in Theorem 1.1 depends on ω, and so if one desires to obtain the limit a_i in (1.8) one must take the initial vector ξ depending on ω, i.e., random. In this section we shall formulate a "non-random" multiplicative ergodic theorem establishing P-a.s. limits for non-random initial vectors, which yields the existence of a certain non-random filtration of subspaces similar to (1.5).
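In the diagonal case the limits (1.7) and (1.8) can be checked directly, and the filtration happens to be non-random. A sketch with an assumed illustrative distribution (all numerical choices are mine, not the text's):

```python
import math
import random

random.seed(1)
n = 20000
# Illustrative i.i.d. diagonal cocycle diag(a_k, 0.5) with a_k in {2, 3}.
log_a = [math.log(random.choice([2.0, 3.0])) for _ in range(n)]

# Exponent of e1 = (1,0): (1/n) log ||product e1|| = average of the log a_k,
# which approaches (log 2 + log 3)/2 by the law of large numbers.
exp_e1 = sum(log_a) / n
# Exponent of e2 = (0,1): every factor multiplies it by 0.5 exactly.
exp_e2 = math.log(0.5)

# Here e1 realizes the biggest exponent a_0, e2 the smaller exponent a_1,
# and the subspace V^1 = span(e2) does not depend on omega.
```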
Let Π^(m-1) be the (m−1)-dimensional projective space, i.e., the space where any two non-zero vectors ξ, ζ ∈ ℝ^m satisfying ξ = const ζ represent the same point; this space is compact. We may identify points of Π^(m-1) with lines passing through the origin of ℝ^m, and since all matrices from the group GL(m) of real invertible matrices send such lines to lines, we have a natural action of GL(m) on Π^(m-1). In what follows, we always consider bundle maps F = (f, 𝒥_F) with 𝒥_F(x) ∈ GL(m) for all x ∈ M or, at least, for almost all x with respect to the measure in question. Such bundle maps F = (f, 𝒥_F) act on ΠE ≡ M × Π^(m-1) by F(x, u) = (fx, 𝒥_F(x)u). Still, we do not assume the invertibility of the transformations f on the base M.
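The projective action can be sketched numerically by representing a point of Π^(m-1) by a normalized vector with a sign convention (a minimal illustration; the representation is my choice):

```python
import numpy as np

def proj(v):
    """Representative of the direction of v in P^{m-1}: unit vector whose first
    nonzero coordinate is positive, so that v and -2v give the same point."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    i = np.flatnonzero(v)[0]
    return v if v[i] > 0 else -v

def act(J, u):
    """Natural action of an invertible matrix J on P^{m-1}."""
    return proj(J @ u)

J = np.array([[2.0, 1.0], [0.0, 0.5]])
u = proj([1.0, 3.0])
# The action is well defined: rescaling the vector does not change the image point.
same = np.allclose(act(J, proj([-2.0, -6.0])), act(J, u))
```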
A sequence F_1, F_2, … of independent random bundle maps with the common distribution n generates a Markov chain y_n = ⁿF y_0, n = 1, 2, …, on ΠE with a transition probability

R(v, Γ) = n{F : Fv ∈ Γ} = ∫ χ_Γ(Fv) dn(F).   (1.12)

We shall call a measure ν ∈ 𝒫(ΠE) n-stationary or R*-invariant if

ν(Γ) = ∫ R(v, Γ) dν(v)   (1.13)

for any measurable Γ ⊂ ΠE. Clearly, this definition would be the same if we employed the adjoint operator R* defined by (1.2.9) with the transition probability P(·,·) replaced by R(·,·) from (1.12).
Throughout the remaining part of this chapter we assume that M is a Borel subset of a Polish space (i.e., of a complete separable metric space), which according to §§36-37 of Kuratowski [31] means that M is Borel measurably isomorphic to a Borel subset of the unit interval. In fact, we shall need only that M can be considered as a Borel subset of a compact space. As in the previous section we suppose that the map (F, u) → Fu, F ∈ 𝒰, u ∈ E, is measurable with respect to the product measurable structure in 𝒰 × E, where E is considered with its Borel measurable structure. Now we are able to formulate the main result of this chapter, which will be proved in the next two sections.
Theorem 1.2. Let F_1, F_2, … be a sequence of independent random bundle maps with the common distribution n acting on ΠE = M × Π^(m-1). Assume that n and a P*-invariant ergodic measure ρ ∈ 𝒫(M) satisfy the condition

∫∫ (log⁺‖𝒥_F(x)‖ + log⁺‖𝒥_F(x)^(-1)‖) dρ(x) dn(F) < ∞.   (1.14)

Then one can choose a Borel set M_ρ ⊂ M with ρ(M_ρ) = 1 so that for any x ∈ M_ρ there exists a sequence of (non-random) linear subspaces

0 ⊂ L^(r(ρ))_x ⊂ ⋯ ⊂ L^1_x ⊂ L^0_x = ℝ^m   (1.15)

and a sequence of (non-random) values

−∞ < β_(r(ρ))(ρ) < ⋯ < β_1(ρ) < β_0(ρ) < ∞   (1.16)

such that for P-almost all ω ∈ Ω,

lim_{n→∞} (1/n) log ‖ⁿ𝒥(x,ω)‖ = β_0(ρ)   (1.17)

and if ξ ∈ L^i_x \ L^(i+1)_x, where L^i_x ≡ 0 for all i > r(ρ), then

lim_{n→∞} (1/n) log ‖ⁿ𝒥(x,ω)ξ‖ = β_i(ρ)   P-a.s.   (1.18)

The numbers β_i(ρ) are the values which the integrals

γ(ν) ≡ ∫∫ log (‖𝒥_F(x)ū‖ / ‖ū‖) dν(x,u) dn(F)   (1.19)

take on for the different ergodic measures ν ∈ 𝔑_ρ ≡ {ν ∈ 𝒫(ΠE) : ν is n-stationary and πν = ρ}, where ū denotes a nonzero vector on the line corresponding to u ∈ Π^(m-1) and π : ΠE → M is the natural projection, which acts on measures by πν(C) = ν(π^(-1)C) for any Borel C ⊂ M. Furthermore, the dimensions of the L^i_x, i = 1, …, r(ρ), do not depend on x provided x ∈ M_ρ, and x → L^i_x determines a Borel map of M_ρ into the corresponding Grassmann manifold of subspaces of ℝ^m, i.e., L^i = {L^i_x} form Borel measurable subbundles of M_ρ × ℝ^m. These subbundles are F-invariant in the sense that

𝒥_F(x) L^i_x = L^i_{fx}   ρ × n-a.s.,   (1.20)

where f = πFπ^(-1).
Remark 1.1. Since M is a Borel subset of a Polish space, by Proposition 1.2.1 any P*-invariant measure η has an ergodic decomposition. This enables one to reformulate Theorem 1.2 for a non-ergodic measure η. Then the limits in (1.17) and (1.18) will be some functions β_i(x) independent of ω (in view of Corollary 1.2.1) and satisfying β_i(fx) = β_i(x) η × n-a.s.

Remark 1.2. From (1.7) and (1.17) one sees that a_0(x,ω) = β_0(ρ) ρ × P-a.s., and so β_0(ρ) is the biggest characteristic exponent corresponding to ρ.

Remark 1.3. In the case of a single (i.e., non-random) transformation the representation of characteristic exponents by means of the integrals (1.19) was noted by Ledrappier [33].

Remark 1.4. In view of the representation (1.19) the numbers β_i(ρ) depend only on ρ and the distribution n but not on the specific choice of the sequence F_1, F_2, … of independent random bundle maps. Furthermore, the same can be said about the filtration {L^i}. This is clear from the construction in Section 3.3 below, which is based on supports of n-stationary measures and their linear spans and is not connected with specific actions of the F_i.

The deterministic case, when n is concentrated at one point, is, of course, a particular case of our situation. In this case Theorem 1.2 coincides with the first part of Theorem 1.1 provided all 𝒥_F(x) ∈ GL(m), which is the standard version of Oseledec's multiplicative ergodic theorem. An important feature of our proof of Theorem 1.2, which we present in the following two sections, is the fact that we employ neither Kingman's subadditive ergodic theorem nor Oseledec's multiplicative ergodic theorem.
Next, we shall compare the filtrations (1.5) and (1.15). Let ρ be a P*-invariant ergodic measure; then by Lemma 1.2.2 and Theorem 1.2.1 the product measure ρ × P is T-invariant and ergodic. By (1.9) the functions a_i are T-invariant, and so a_i ≡ a_i(ρ) = const ρ × P-a.s. By Theorem 1.1 the limits (1.17) and (1.18) can take on only the values {a_i(ρ)}, and so the numbers {β_i(ρ)} must be among the {a_i(ρ)}. Now let i_1 < i_2 < ⋯ < i_(r(ρ)) be such that

β_j(ρ) = a_(i_j)(ρ),  j = 1, …, r(ρ).   (1.21)

The connection between the filtrations of Theorems 1.1 and 1.2 is given by

Theorem 1.3. ρ × P-almost surely L^j_x ⊂ V^(i_j)_(x,ω), where i_j is defined by (1.21). Moreover, ρ × P-a.s. L^j_x is the maximal non-random (i.e., independent of ω) subspace of V^i_(x,ω) for any i = i_j, i_j + 1, …, i_(j+1) − 1, in the sense that for ρ-almost all x there exists no fixed vector ξ satisfying P{ω : ξ ∈ V^i_(x,ω) \ L^j_x} > 0.

Proof. Since, for any ξ ∈ L^j_x, the limit (1.18) exists and P-a.s. is less than or equal to β_j(ρ), then, by Theorem 1.1, ξ ∈ V^(i_j)_(x,ω) P-a.s. On the other hand, if ξ ∉ L^j_x, then the limit (1.18) also exists but P-a.s. is not less than β_(j-1)(ρ), and so by (1.8), P-a.s. ξ cannot belong to V^i_(x,ω) for i ≥ i_(j-1) + 1. □

Remark 1.5. By induction it is easy to see from Theorem 1.3 that ρ × P^k-a.s.

dim (⋂_{ℓ=1}^k V^i_(x,ω_ℓ)) ≤ max (dim L^j_x, dim V^i_(x,ω) − k + 1),

where P^k = P × ⋯ × P is the direct product of k copies of P, and so it is a probability measure on the product Ω^k = Ω × ⋯ × Ω of k copies of Ω. Since L^j_x ⊂ V^i_(x,ω) ρ × P-a.s., then for k > dim V^i_(x,ω) − dim L^j_x one has ⋂_{ℓ=1}^k V^i_(x,ω_ℓ) = L^j_x ρ × P^k-a.s.
We shall discuss the situation in the following particular case. Suppose that n is concentrated on the set of vector bundle maps such that all matrices 𝒥_F(x) are upper triangular, i.e. we can write

𝒥_F(x) = ( a_F^(1)(x)   *    ⋯   *
            0    a_F^(2)(x)  ⋯   *
            ⋮               ⋱    ⋮
            0        ⋯    0   a_F^(m)(x) )

where 0 stands for the zero entries below the diagonal, * denotes the remaining entries and {a_F^(i)(x)} are the diagonal entries. Let ρ be an ergodic P*-invariant measure and

ā(i) = ∫∫ log |a_F^(i)(x)| dρ(x) dn(F).

Notice that the triangular matrices have a family of invariant subspaces Γ_i consisting of the vector-columns whose last (m − i) coordinates equal zero. It follows from the second part of Theorem 1.1 that the sum of the k characteristic exponents related to Γ_k coincides with ā(1) + ⋯ + ā(k) for k = 1, …, m. This implies that the set of characteristic exponents {a_i} corresponding to the measure ρ coincides with the set of numbers ā(i) taken in the appropriate order. If

ā(1) < ā(2) < ⋯ < ā(m)   (1.22)

then it is easy to see that the set of the ā(i) coincides with the set of values β_j(ρ) given by Theorem 1.2, and the set of subbundles L^j coincides with the set of products M × Γ_i. On the other hand, if (1.22) is not true then, in general, not all of the ā(i) can be realized as β_j(ρ). The situation is especially simple in the two dimensional case.
Example 2.1. Let

𝒥_F(x) = ( a_F(x)  c_F(x)
            0      b_F(x) ).

Put a = ∫∫ log|a_F(x)| dρ(x) dn(F) and b = ∫∫ log|b_F(x)| dρ(x) dn(F). The numbers a and b are the characteristic exponents in this case and according to Theorem 1.2 they have corresponding directions with approximately e^(na) and e^(nb) rates of expansion (contraction). The matrices 𝒥_F(x) have an invariant subspace Γ of vectors having second coordinate zero. Clearly,

𝒥_(F_n)(^(n-1)f x) ⋯ 𝒥_(F_1)(x) = ( a_n ⋯ a_1   Σ_{k=1}^n a_n ⋯ a_(k+1) c_k b_(k-1) ⋯ b_1
                                     0           b_n ⋯ b_1 )   (1.23)

where we put a_i = a_(F_i)(^(i-1)f x), b_i = b_(F_i)(^(i-1)f x), c_i = c_(F_i)(^(i-1)f x), and F_1, F_2, … are independent random bundle maps with the distribution n. If a < b then for ρ-almost all initial points x any non-zero vector from Γ grows with the speed e^(na) and any non-zero vector from ℝ² \ Γ grows as e^(nb). In this case the filtrations from (1.5) and (1.15) coincide. If a > b then one direction in Oseledec's theorem is Γ and it is not random. This direction corresponds to the growth rate e^(na). From (1.23) it is clear that the direction corresponding to the growth rate e^(nb) is determined by the vector (ξ_1, 1) with

ξ_1 = − Σ_{k=1}^∞ c_k b_(k-1) ⋯ b_1 (a_k ⋯ a_1)^(-1),

and, in general, it is random. In the latter case b is not realizable in the sense of Theorem 1.2 and the limit in (1.18) will always equal a. Of course, if c_F(x) = 0 ρ × n-a.s. then we have diagonal matrices and both coordinate directions are invariant and non-random.
3.2. The biggest characteristic exponent.

We shall start with the following useful result from Furstenberg [16].

Lemma 2.1. Let Z_n be a Markov chain on a topological space M having a transition probability P(x, ·). If P is the corresponding transition operator and g is a bounded Borel measurable function then with probability one

(1/n) Σ_{k=0}^{n-1} (Pg(Z_k) − g(Z_k)) → 0 as n → ∞.   (2.1)

Proof. Put W_(n+1) = Σ_{k=0}^{n} (1/(k+1)) (Pg(Z_k) − g(Z_(k+1))). Employing the conditional expectations we have

E(W_(n+1) | Z_0, …, Z_n) = W_n   (2.2)

since by the definition of a Markov chain

E(g(Z_(k+1)) | Z_0, …, Z_k) = Pg(Z_k).   (2.3)

But g(x) is bounded, say |g| < C, and so

sup_n E W_n² ≤ 4C² Σ_{k=1}^∞ k^(-2) < ∞,   (2.4)

since by (2.3),

E((Pg(Z_k) − g(Z_(k+1)))(Pg(Z_j) − g(Z_(j+1)))) = 0 for k ≠ j.   (2.5)

Now (2.2) together with (2.4) imply that {W_n} forms a martingale satisfying sup_n E W_n² < ∞. Hence by the martingale convergence theorem (see Neveu [37] or Martin and England [36]) W_n converges with probability one. Thus by Kronecker's lemma it follows that with probability one

(1/n) Σ_{k=0}^{n-1} (Pg(Z_k) − g(Z_(k+1))) → 0

as n → ∞, and rearranging terms we get the assertion of the lemma. □
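Lemma 2.1 is easy to observe in a simulation: along a trajectory of a Markov chain the averages of Pg − g vanish. A sketch with an assumed two-state chain (transition matrix and g are illustrative choices):

```python
import random
random.seed(7)

# Two-state chain: from 0 go to 1 with prob 0.1, from 1 go to 0 with prob 0.5.
P = {0: [(0, 0.9), (1, 0.1)], 1: [(0, 0.5), (1, 0.5)]}
g = {0: 1.0, 1: -2.0}
Pg = {x: sum(p * g[y] for y, p in P[x]) for x in P}   # Pg(x) = sum_y p(x, y) g(y)

def step(x):
    r, acc = random.random(), 0.0
    for y, p in P[x]:
        acc += p
        if r < acc:
            return y
    return y

n, x, total = 100000, 0, 0.0
for _ in range(n):
    total += Pg[x] - g[x]
    x = step(x)
avg = total / n   # close to 0, as (2.1) asserts
```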
Now we can establish the following key result.

Proposition 2.1. Let σ : K → M be a Borel map of a compact Hausdorff space K into a topological space M such that the image of any Borel set in K is a Borel set in M. Suppose that X_n is a Markov chain on M with a transition probability P(x, ·) which forms a Borel map of M into 𝒫(M). Let η ∈ 𝒫(M) be a P*-invariant ergodic measure. For functions h on K define the following semi-norm:

‖h‖_η = ∫ sup_{u : σu = y} |h(u)| dη(y).   (2.6)

Let 𝒢(K) be the closure of the space 𝒞(K) of continuous functions on K with respect to the semi-norm (2.6). Suppose that Y_n is a Markov chain on K such that σY_n = X_n, the transition probability R(u, ·) of Y_n determines a Borel map of K into 𝒫(K), and the corresponding transition operator R, acting by the formula Rh(u) = ∫ R(u, dv) h(v), maps the space 𝒢(K) into itself. Then for any g ∈ 𝒢(K) there exists a Borel set V^(g)_η ⊂ M such that η(V^(g)_η) = 1 and if σY_0 ∈ V^(g)_η then with probability one

limsup_{n→∞} (1/(n+1)) Σ_{k=0}^n g(Y_k) ≤ sup_{ν ∈ 𝔐_η} ∫ g dν,   (2.7)

where 𝔐_η = {ν ∈ 𝒫(K) : ν is R*-invariant and σν = η, i.e., η(Γ) = ν(σ^(-1)Γ) for any Borel Γ ⊂ M}. In particular, 𝔐_η is not empty.
Proof. Define 𝔏 = {g : there exists a continuous function h on K such that g = Rh − h}. If g ∈ 𝔏 then by Lemma 2.1 it follows that the left hand side of (2.7) is equal to zero. The right hand side of (2.7) in this case is zero too.

Notice that the definition (2.6) makes sense provided h_σ(y) = sup_{u : σu = y} |h(u)| is a Borel function on M as soon as h is a Borel function on K. But {y : h_σ(y) > α} = σ{u : |h(u)| > α} is a Borel set, since σ transforms any Borel set to a Borel set, and so h_σ(y) is a Borel function. Next, since η is P*-invariant and ergodic, then by Birkhoff's ergodic theorem applied to the stationary Markov chain {X_k} (see, for instance, Section 2 in Chapter IV of Rosenblatt [41] or Corollary 1.2.2 of this book, which can be adapted to our situation) one has

limsup_{n→∞} (1/(n+1)) Σ_{k=0}^n |h(Y_k)| ≤ lim_{n→∞} (1/(n+1)) Σ_{k=0}^n h_σ(X_k) = ‖h‖_η   (2.8)

for η-almost all initial points σY_0.

Consider now a non-negative function g ∈ 𝒢(K) not belonging to 𝔏 and let

δ = inf_{h ∈ 𝔏} ‖h − g‖_η.   (2.9)

Then for any ε > 0 one can represent g = h + q_ε, where h ∈ 𝔏 and ‖q_ε‖_η ≤ δ + ε. Since ε can be taken arbitrarily small and lim_{n→∞} (1/(n+1)) Σ_{k=0}^n h(Y_k) = 0 for η-almost all σY_0, it follows from (2.8) that

limsup_{n→∞} (1/(n+1)) Σ_{k=0}^n g(Y_k) ≤ δ for η-almost all σY_0.   (2.10)

Define a linear functional ℓ on the direct sum of 𝔏 and the one-dimensional space {g} generated by g by setting ℓ|𝔏 = 0 and ℓ(g) = δ. Since 0 ∈ 𝔏, then by (2.9) one has ‖g‖_η ≥ δ, and so for all h ∈ 𝔏 ⊕ {g}

|ℓ(h)| ≤ ‖h‖_η.   (2.11)

Hence by the Hahn-Banach theorem (see Hewitt and Stromberg [19]) there exists a linear functional ℓ on the space of functions with finite semi-norm (2.6) such that ℓ vanishes on 𝔏, ℓ has norm not exceeding 1, i.e., (2.11) remains true, and ℓ takes on the value δ at g.
Clearly, ‖h‖_η ≤ ‖h‖, where ‖h‖ = sup_{u∈K} |h(u)|. Hence ℓ is also a continuous linear functional on the space 𝒞(K) of continuous functions on K with the supremum norm, and with respect to this norm ℓ also has norm not exceeding 1. Thus by the Riesz representation theorem (see [19], Theorems 12.36 and 20.48) there exists a signed measure λ with full variation not exceeding 1 and representing ℓ as an integral

ℓ(h) = ∫_K h dλ   (2.12)

for any h ∈ 𝒞(K). If λ is decomposed into its positive and negative parts λ = λ⁺ − λ⁻ then

λ⁺(K) + λ⁻(K) ≤ 1.   (2.13)

Since |∫ h dλ| = |ℓ(h)| ≤ ‖h‖_η for any continuous function h on K, then λ(Γ) = 0 for each Borel set Γ ⊂ K satisfying η(σΓ) = 0. But λ⁺ and λ⁻ are mutually singular and so λ⁺(Γ) = λ⁻(Γ) = 0 if η(σΓ) = 0. Hence by the Radon-Nikodym theorem (see [19]),

σλ⁺ = φ⁺η and σλ⁻ = φ⁻η   (2.14)

for some Borel functions φ⁺ and φ⁻.

Let now q ∈ 𝒢(K) be a bounded function. Then one can choose a sequence of continuous functions h_k ∈ 𝒞(K) such that sup_k ‖h_k‖ < ∞ and ‖h_k − q‖_η → 0 as k → ∞. Here, recall, ‖·‖ is the supremum norm. It follows from the above that there exists a subsequence {h_(k_i)} such that h_(k_i)(u) → q(u) as i → ∞ provided σu does not belong to some exceptional set Γ ⊂ M satisfying η(Γ) = 0. By (2.14) this means also that

λ^± {u : h_(k_i)(u) does not converge to q(u)} = 0.   (2.15)

Since the functional ℓ is continuous in the semi-norm ‖·‖_η, then by (2.12), (2.13), (2.15) and the Lebesgue bounded convergence theorem one obtains

ℓ(q) = ∫_K q dλ,   (2.16)

i.e., (2.12) remains true for any bounded q ∈ 𝒢(K).

The transition operator R maps 𝒢(K) into itself and, as any Markov operator, preserves boundedness (it even has supremum norm not exceeding one); hence

∫ (Rq − q) dλ = ℓ(Rq − q) = 0   (2.17)

for any bounded q ∈ 𝒢(K), and so λ is R*-invariant. Therefore R*λ⁺ − R*λ⁻ = λ⁺ − λ⁻. Since λ⁺ and λ⁻ are mutually singular it follows from here that R*λ⁺ ≥ λ⁺ and R*λ⁻ ≥ λ⁻. This together with R*λ^±(K) = λ^±(K) implies that R*λ^± = λ^±. We have proved that λ⁺ is R*-invariant, hence σλ⁺ is P*-invariant. Since η is ergodic and P*-invariant, one concludes from (2.14) that φ⁺ = c = const η-a.s. Define ν = c^(-1)λ⁺. Clearly, c = λ⁺(K) ≤ 1, σν = η and ν is R*-invariant, i.e., ν ∈ 𝔐_η.

Let now g ∈ 𝒢(K) be a bounded non-negative function satisfying ℓ(g) = δ. Then

∫_K g dν = c^(-1) ∫_K g dλ⁺ ≥ ∫_K g dλ = ℓ(g) = δ   (2.18)
and by (2.10) we obtain (2.7). To derive (2.7) for g not necessarily non-negative but still bounded, one adds to g a constant making it non-negative, while the left and the right hand sides of (2.7) increase by the same constant.

Take now an arbitrary g ∈ 𝒢(K) and for r = 1, 2, … define g^(r) = max(−r, min(g, r)). Then, clearly, g^(r) ∈ 𝒢(K) and we conclude from the above that (2.7) is true with g^(r) in place of g. Since ‖g − g^(r)‖_η → 0 as r → ∞, then by (2.8) it follows that the limsup in question taken for g and for g^(r) is almost the same provided r is big enough. On the other hand,

sup_{ν ∈ 𝔐_η} |∫ g dν − ∫ g^(r) dν| ≤ ‖g − g^(r)‖_η,   (2.19)

and so the left hand side of (2.19) is small provided r is big enough. The above arguments taken together yield (2.7) for any g ∈ 𝒢(K). This completes the proof of Proposition 2.1. □
Applying Proposition 2.1 to g and −g one obtains

Corollary 2.1. If under the conditions of Proposition 2.1 the integral ∫ g dν takes on the same value β̄ for all measures from 𝔐_η, with η ∈ 𝒫(M) ergodic and P*-invariant, then there exists a Borel set V^(g)_η ⊂ M such that η(V^(g)_η) = 1, with probability one the limit in (2.7) exists and

lim_{n→∞} (1/(n+1)) Σ_{k=0}^n g(Y_k) = β̄   (2.20)

provided σY_0 ∈ V^(g)_η.
We shall need Proposition 2.1 only in the case when K is a direct product of M and another compact metric space. In this case we are able to specify the structure of 𝒢(K).

Lemma 2.2. Suppose that under the circumstances of Proposition 2.1 one has K = M × B, where B is a compact metric space, M is a Borel subset of a compact space and σ : M × B → M is the natural projection on the first factor. Then the space 𝒢(K) defined in Proposition 2.1 is exactly the set of Borel functions g(y, u) on M × B with finite semi-norm ‖·‖_η which are continuous in u ∈ B for η-almost all y. Furthermore, for each g ∈ 𝒢(M × B) the assertion (2.7) remains true.
Proof. In our circumstances K = M × B is not necessarily compact, but its closure K̄ = M̄ × B is already compact and we can use the fact that η(M) = 1. Next, we can extend the Markov chains X_n and Y_n from K to K̄ by setting X_n ≡ y and Y_n ≡ (y, u) for all n provided X_0 = y ∈ M̄ \ M and Y_0 = (y, u). Since η is P*-invariant, then for η-almost all points x ∈ M the process X_n remains in M and the process Y_n remains in K with probability one for all n provided X_0 = x. Now, applying Proposition 2.1 to the Markov processes X_n and Y_n on the compact space K̄ and taking into account that η(M) = 1, we shall actually obtain an assertion about Markov processes and measures on K only.

It remains to prove the statement about the structure of 𝒢(K). Let h_i ∈ 𝒞(K̄) be a sequence of continuous functions on K̄ with ‖h_i − q‖_η → 0 as i → ∞. Then along a subsequence (h_(i_j) − q)_σ(y) → 0 as j → ∞ for η-almost all y. Hence for these y,

lim_{j→∞} sup_{u∈B} |h_(i_j)(y, u) − q(y, u)| = 0,   (2.21)

and so q(y, u) is continuous in u.
To prove the result in the other direction take in B a sequence of points {u_i} such that u_1, …, u_(k(n)) form a (1/n)-net in B, i.e., the union of the balls B_(1/n)(u_i) = {u : dist(u, u_i) ≤ 1/n} covers B. Choose continuous functions φ_i^(n) with 0 ≤ φ_i^(n)(u) ≤ 1 for all u, φ_i^(n)(u) = 1 if dist(u, u_i) ≤ 1/n and φ_i^(n)(u) = 0 if dist(u, u_i) ≥ 2/n. Put ψ_i^(n)(u) = φ_i^(n)(u) (Σ_{j=1}^{k(n)} φ_j^(n)(u))^(-1). Let g(y, u) be a Borel function on M × B with ‖g‖_η < ∞ and continuous in u for η-almost all y. Define

g_n(y, u) = Σ_{1≤i≤k(n)} g(y, u_i) ψ_i^(n)(u).   (2.22)

Since Σ_i ψ_i^(n) = 1, it is easy to see that for those y where g(y, u) is continuous in u one has

sup_{u∈B} |g_n(y, u) − g(y, u)| → 0 as n → ∞.

Since ‖g_n‖_η ≤ ‖g‖_η < ∞, then by the Lebesgue convergence theorem

‖g_n − g‖_η = ∫_M sup_{u∈B} |g_n(y, u) − g(y, u)| dη(y) → 0 as n → ∞.   (2.23)

On the other hand, g_n is a finite sum of functions of the form g^(i)(y) ψ_i^(n)(u), where the g^(i)(y) are Borel functions on M with ‖g^(i)‖_η < ∞ and the ψ_i^(n) are continuous. But for any Borel function q on M satisfying ∫|q| dη < ∞ one can find a sequence of continuous functions h_n on M such that ∫|q − h_n| dη → 0 as n → ∞. Collecting all these facts together one can construct a sequence of continuous functions on M × B which converge to g in the semi-norm ‖·‖_η, proving Lemma 2.2. □
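The partition-of-unity approximation (2.22) can be sketched concretely with B = [0, 1] (the choice of B, of g, and of the bump functions is an illustrative assumption):

```python
import numpy as np

def phi(u, ui, n):
    """Continuous bump: 1 on dist <= 1/n, 0 on dist >= 2/n, linear in between."""
    return float(np.clip(2.0 - n * abs(u - ui), 0.0, 1.0))

def approx(g, u, n):
    """The sum (2.22): values of g at net points weighted by psi_i^(n)(u)."""
    centers = np.linspace(0.0, 1.0, n + 1)      # a 1/n-net in B = [0, 1]
    w = np.array([phi(u, c, n) for c in centers])
    psi = w / w.sum()                            # psi_i^(n), summing to 1
    return sum(g(c) * p for c, p in zip(centers, psi))

g = lambda u: np.sin(3 * u) + u ** 2
grid = np.linspace(0, 1, 200)
err = max(abs(approx(g, u, 50) - g(u)) for u in grid)   # sup_u |g_n - g|
```

Since ψ_i^(n) is supported within 2/n of u, the error is bounded by the modulus of continuity of g at scale 2/n, which is the content of the convergence claim before (2.23).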
Next we shall go back to the circumstances of Theorem 1.2. Define on 𝒵 ≡ M × Π^(m-1) × 𝒰 the following function:

q(x, u, F) ≡ log (‖𝒥_F(x)ū‖ / ‖ū‖)   (2.24)

where, again, ū ∈ ℝ^m is a non-zero vector on the line corresponding to u ∈ Π^(m-1). Sometimes we set w = (x, u) ∈ ΠE and then simply write q(w, F). Since 𝒥_F(x) acts linearly and the norm of vectors is a continuous function on ℝ^m, the function q(x, u, F) is continuous in u. Notice that

−log ‖𝒥_F(x)^(-1)‖ ≤ q(x, u, F) ≤ log ‖𝒥_F(x)‖   (2.25)

and so

sup_{u ∈ Π^(m-1)} |q(x, u, F)| ≤ log⁺ ‖𝒥_F(x)‖ + log⁺ ‖𝒥_F(x)^(-1)‖.   (2.26)

Hence if ρ and n satisfy (1.14) then

∫ sup_{u ∈ Π^(m-1)} |q(x, u, F)| dρ(x) dn(F) < ∞.   (2.27)

From the definition (2.24) it follows that

(1/(n+1)) Σ_{k=0}^{n} q(ᵏF w, F_(k+1)) = (1/(n+1)) log (‖𝒥_(F_(n+1))(ⁿf x) ⋯ 𝒥_(F_1)(x) ū‖ / ‖ū‖)   (2.28)

provided w = (x, u) ∈ ΠE. Denote q̄(w, ω) ≡ q(w, F_1(ω)), so that q̄ is a function on ΠE × Ω. Employing the action F_1(ω) on ΠE we introduce, similarly to (1.2), the skew product transformation

τ(w, ω) = (F_1(ω)w, ϑω)   (2.29)

acting on ΠE × Ω. Then one can write

(1/(n+1)) Σ_{k=0}^{n} q(ᵏF w, F_(k+1)) = (1/(n+1)) Σ_{k=0}^{n} q̄ ∘ τ^k(w, ω).   (2.30)
Next, assume that ν ∈ 𝒫(ΠE) is an ergodic n-stationary measure. Then, replacing M, P and T by ΠE, R and τ in the random ergodic theorem (Corollary 1.2.2), we obtain from (2.27), (2.28) and (2.30) that for ν-almost all w = (x, u),

lim_{n→∞} (1/(n+1)) log (‖𝒥_(F_(n+1))(ⁿf x) ⋯ 𝒥_(F_1)(x) ū‖ / ‖ū‖) = γ(ν)   P-a.s.,   (2.31)

where γ(ν) is defined by (1.19). If πν = ρ ∈ 𝒫(M) then (2.31) implies that for ρ-almost all x ∈ M,

liminf_{n→∞} (1/(n+1)) log ‖𝒥_(F_(n+1))(ⁿf x) ⋯ 𝒥_(F_1)(x)‖ ≥ γ(ν)   P-a.s.

But then also

liminf_{n→∞} (1/(n+1)) log ‖𝒥_(F_(n+1))(ⁿf x) ⋯ 𝒥_(F_1)(x)‖ ≥ sup_{ν ∈ 𝔑_ρ} γ(ν)   ρ × P-a.s.,   (2.32)

where 𝔑_ρ was introduced in Theorem 1.2. Indeed, 𝔑_ρ is a compact set in the weak topology and each measure from 𝔑_ρ, according to Proposition 1.2.1, has an ergodic decomposition. Thus one can choose a sequence of ergodic measures ν_i ∈ 𝔑_ρ such that ν_i → ν̄ in the weak sense and lim_{i→∞} γ(ν_i) = sup_{ν ∈ 𝔑_ρ} γ(ν). Moreover, by (2.27) one obtains γ(ν̄) = sup_{ν ∈ 𝔑_ρ} γ(ν). Since ν̄ also has an ergodic decomposition, one concludes from here that there exists an ergodic measure ν̂ ∈ 𝔑_ρ such that γ(ν̂) = sup_{ν ∈ 𝔑_ρ} γ(ν). This implies (2.32).
Now we are going to show that, in fact, the limit in (2.32) exists and is equal to the right hand side of (2.32). Define

q_N = max(−N, min(N, q)) and Q_N = ∫ q_N dn,   (2.33)

so that Q_N is a function on ΠE. First, we shall see, in the same way as in Lemma 2.1, that ν × P-a.s.

lim_{n→∞} (1/(n+1)) Σ_{k=0}^{n} (Q_N(ᵏF Y_0) − q_N(ᵏF Y_0, F_(k+1))) = 0.   (2.34)

Indeed, let Y_0 be a ΠE-valued ν-distributed random variable independent of all F_1, F_2, …. Put

W_(n+1) = Σ_{k=0}^{n} (1/(k+1)) (Q_N(ᵏF Y_0) − q_N(ᵏF Y_0, F_(k+1))).

Taking the conditional expectations one has

E(W_(n+1) | Y_0, F_1, …, F_n) = W_n   (2.35)

because F_(n+1) is independent of Y_0, F_1, …, F_n, and so the last conditional expectation is equal to ∫ q_N(ⁿF Y_0, F) dn(F) = Q_N(ⁿF Y_0). Hence {W_n} forms a martingale. In the same way as in (2.4) we can see that |q_N| ≤ N implies sup_n E W_n² ≤ 4N² Σ_{k=1}^∞ k^(-2) < ∞. Now the same arguments as in Lemma 2.1 concerning the martingale convergence theorem and Kronecker's lemma yield

lim_{n→∞} (1/(n+1)) Σ_{k=0}^{n} (Q_N(ᵏF Y_0) − q_N(ᵏF Y_0, F_(k+1))) = 0   (2.36)

with probability one. If we consider the sequence Y_0, F_1, F_2, … on the probability space (ΠE × Ω, ν × P) then (2.36) implies (2.34).
We already noticed that the function q(x, u, F) is continuous in u. Then, clearly, Q_N(x, u) = ∫ q_N(x, u, F) dn(F) is also continuous in u. We intend to apply (2.7) to the Markov chain Y_n = ⁿF Y_0 on the space ΠE = M × Π^(m-1) and the bounded function Q_N in place of g in (2.7). Since M is a Borel subset of a Polish space, and so can be treated as a Borel subset of a compact space (see §§36-37 of Kuratowski [31]), and since Π^(m-1) is compact, we can employ Lemma 2.2 provided the transition operator R of Y_n preserves the space of functions h(x, u) on M × Π^(m-1) which are continuous in u and have finite semi-norm ‖·‖_ρ.

To show this, notice that each 𝒥_F(x) acts continuously on Π^(m-1), and so if h(x, u) is continuous in u then so is Rh(x, u). Furthermore, by (1.12),

∫ sup_u |Rh(x, u)| dρ(x) ≤ ∫ sup_u |h(fx, u)| dρ(x) dm(f) = ∫ sup_u |h(x, u)| dρ(x) ≡ ‖h‖_ρ,

since ρ is P*-invariant, where, recall, m is the distribution of f_i = πF_iπ^(-1) and P is the transition operator of x_n = πY_n.
This says that R transforms the closure of the space 𝒞(ΠE) with respect to the semi-norm ‖·‖_ρ into itself. The other conditions of Proposition 2.1 are, clearly, satisfied as well, and its application yields that for ρ-almost all initial points x_0 = πY_0,

limsup_{n→∞} (1/(n+1)) Σ_{k=0}^n Q_N(Y_k) ≤ sup_{ν ∈ 𝔑_ρ} ∫ Q_N dν   P-a.s.   (2.37)

In other words, for ρ-almost all x ∈ M and all u ∈ Π^(m-1),

limsup_{n→∞} (1/(n+1)) Σ_{k=0}^n q_N(ᵏF w, F_(k+1)) ≤ sup_{ν ∈ 𝔑_ρ} ∫∫ q_N(w, F) dν(w) dn(F)   P-a.s.   (2.38)

From (2.26) and (2.33) it follows that

|q(x, u, F) − q_N(x, u, F)| ≤ χ_(B_N)(x, F) (log⁺‖𝒥_F(x)‖ + log⁺‖𝒥_F(x)^(-1)‖)   (2.39)

where

B_N = {(x, F) : log⁺‖𝒥_F(x)‖ + log⁺‖𝒥_F(x)^(-1)‖ ≥ N}.

Hence, for w = (x, u),

(1/(n+1)) |Σ_{k=0}^n (q(ᵏF w, F_(k+1)) − q_N(ᵏF w, F_(k+1)))| ≤ (1/(n+1)) Σ_{k=0}^n χ_(B_N)(ᵏf x, F_(k+1)) (log⁺‖𝒥_(F_(k+1))(ᵏf x)‖ + log⁺‖𝒥_(F_(k+1))^(-1)(ᵏf x)‖).   (2.40)
On the right hand side of (2.40) we have expressions of the form g(ᵏf x, F_(k+1)), where g(x, F) is a function on M × 𝒰. Setting ḡ(x, ω) ≡ g(x, F_1(ω)) we get

g(ᵏf x, F_(k+1)) = ḡ ∘ T^k(x, ω),

where T is given by (1.2). This together with (1.14) enables us to employ the random ergodic theorem (Corollary 1.2.2) to conclude that the right hand side of (2.40) converges ρ × P-a.s. to the limit

∫_(B_N) (log⁺‖𝒥_F(x)‖ + log⁺‖𝒥_F(x)^(-1)‖) dρ(x) dn(F).   (2.41)

By (1.14) the last expression tends to zero as N → ∞. This together with (2.28), (2.34) and (2.39)-(2.41) yields, for ρ-almost all x and all u ∈ Π^(m-1),

limsup_{n→∞} (1/(n+1)) log (‖𝒥_(F_(n+1))(ⁿf x) ⋯ 𝒥_(F_1)(x) ū‖ / ‖ū‖) ≤ sup_{ν ∈ 𝔑_ρ} γ(ν)   P-a.s.   (2.42)

All matrix norms in ℝ^m are equivalent. Hence the limiting behavior of (1/n) log ‖𝒥_(F_(n+1))(ⁿf x) ⋯ 𝒥_(F_1)(x)‖ will not depend on the choice of the norm. If {ξ_i} form an orthonormal basis of ℝ^m we can identify the norm of any matrix 𝔤 with max_i ‖𝔤ξ_i‖. Fix some x ∈ M for which (2.42) holds. For any sequence n_ℓ → ∞ there exist a number j and a subsequence n_(ℓ_i) → ∞ such that

‖𝒥_(F_(n+1))(ⁿf x) ⋯ 𝒥_(F_1)(x)‖ = ‖𝒥_(F_(n+1))(ⁿf x) ⋯ 𝒥_(F_1)(x) ξ_j‖ for n = n_(ℓ_i)

for a set of ω having P-measure not less than 1/m. Since the inequality (2.42) holds P-a.s. with any ξ_j in place of u, it follows from here that ρ × P-a.s.

limsup_{n→∞} (1/(n+1)) log ‖𝒥_(F_(n+1))(ⁿf x) ⋯ 𝒥_(F_1)(x)‖ ≤ sup_{ν ∈ 𝔑_ρ} γ(ν).   (2.43)

Combining (2.32) and (2.43) together with Corollary 2.1 one obtains
Theorem 2.1. Let ρ ∈ 𝒫(M) be an ergodic P*-invariant measure satisfying (1.14). Then there exists a Borel set U_ρ ⊂ M with ρ(U_ρ) = 1 such that P-a.s.

lim_{n→∞} (1/(n+1)) log ‖𝒥_(F_(n+1))(ⁿf x) ⋯ 𝒥_(F_1)(x)‖ = sup_{ν ∈ 𝔑_ρ} γ(ν) ≡ β_0(ρ)   (2.44)

provided x ∈ U_ρ. Furthermore, if for all n-stationary measures ν ∈ 𝔑_ρ the expression γ(ν) takes on the same value β then

lim_{n→∞} (1/(n+1)) log ‖𝒥_(F_(n+1))(ⁿf x) ⋯ 𝒥_(F_1)(x) ξ‖ = β   P-a.s.   (2.45)

for any nonzero ξ ∈ ℝ^m provided x ∈ U_ρ.

Remark 2.1. As we already pointed out in the proof of (2.32), there exists an ergodic n-stationary measure ν_ρ ∈ 𝒫(ΠE) such that γ(ν_ρ) = sup_{ν ∈ 𝔑_ρ} γ(ν), i.e., the limit (2.44) can be represented as an integral.

Remark 2.2. The ergodic decomposition yields that if the integrals γ(ν) are the same for all ergodic ν ∈ 𝔑_ρ then they are the same for all measures from 𝔑_ρ. Therefore to have (2.45) it suffices to require that γ(ν) takes on the same value β for all ergodic n-stationary measures.
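Remark 2.1 says that the limit (2.44) is the integral γ(ν_ρ), and this is easy to probe numerically for an i.i.d. GL(2) cocycle. A sketch under assumptions of my own choosing: the matrix law is a Haar-random rotation times diag(2, 1/2), so the uniform measure on directions is n-stationary and a standard computation gives γ(ν) = 2·log((2 + 1/2)/2)/2 = log 1.25 in closed form, while the sum of the two exponents is E log|det 𝒥| = 0.

```python
import numpy as np

rng = np.random.default_rng(3)

def rand_mat():
    # Illustrative cocycle: Haar rotation times a fixed diagonal matrix.
    t = rng.uniform(0, 2 * np.pi)
    R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
    return R @ np.diag([2.0, 0.5])

n = 20000
mats = [rand_mat() for _ in range(n)]

# Monte Carlo version of (1.19): iterate the chain on directions
# u -> J u / ||J u|| and average the integrand log(||J u|| / ||u||).
u, s = np.array([1.0, 0.0]), 0.0
for J in mats:
    v = J @ u
    s += np.log(np.linalg.norm(v))
    u = v / np.linalg.norm(v)
beta0_est = s / n   # approaches beta_0(rho) = log 1.25 here

# Both exponents at once, via QR factorization of the growing product.
Q, top, bot = np.eye(2), 0.0, 0.0
for J in mats:
    Q, R = np.linalg.qr(J @ Q)
    top += np.log(abs(R[0, 0]))
    bot += np.log(abs(R[1, 1]))
# (top + bot)/n recovers E log|det J| = 0, as in (1.11).
```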
3.3. Filtration of invariant subbundles.

In this section the σ-algebra of Borel subsets of M is completed by the sets of ρ-measure zero, where ρ is the same as in Theorem 1.2. Each object which is measurable with respect to this completed σ-algebra can be made Borel measurable by changing it on a set of ρ-measure zero. For this reason the difference between "Borel" and "measurable" ("ρ-measurable") will not be important here and we shall not pay attention to it.
In what follows we shall need the notion of Borel measurable subbundles of E and ΠE. Let m(x) be a positive integer-valued Borel function on M. Set U_k = {x : m(x) = k}. We shall say that ℒ is a Borel measurable subbundle of E = M × ℝ^m corresponding to the function m(x) if ℒ = ⋃_(x∈M) (x, ℒ_x) and the map x → ℒ_x restricted to each U_k is a Borel map of U_k into the Grassmann manifold of k-dimensional subspaces of ℝ^m (see Hirsch [20]). In other words, this means that the k-dimensional subspaces ℒ_x, x ∈ U_k, depend measurably on x. This actually says that there exist k Borel measurable vector fields ξ¹, …, ξᵏ such that ξ¹_x, …, ξᵏ_x form an orthonormal basis of ℒ_x for each x ∈ U_k. Indeed, for each k-dimensional subspace ℒ⁰ one can choose a neighborhood W of ℒ⁰ in the corresponding Grassmann manifold and k vector-valued functions ξ¹(ℒ), …, ξᵏ(ℒ) defined and continuous for ℒ ∈ W such that the vectors ξ¹(ℒ), …, ξᵏ(ℒ) are orthonormal for each ℒ ∈ W. To do this pick an orthonormal basis ξ¹₀, …, ξᵏ₀ of ℒ⁰ and take its orthogonal projections to every ℒ ∈ W. If W is a small neighborhood of ℒ⁰ then for each ℒ ∈ W we shall get a basis of ℒ (not necessarily orthogonal). Then the Gram-Schmidt orthonormalization process will lead us to the desired functions ξ¹(ℒ), …, ξᵏ(ℒ). Since the Grassmann manifold is compact we shall need only a finite number of such neighborhoods W, which enables us to construct orthonormal vectors ξ¹(ℒ), …, ξᵏ(ℒ) depending on ℒ measurably and defined already for any k-dimensional subspace ℒ. Hence if ℒ_x depends on x measurably then so do ξ¹_x ≡ ξ¹(ℒ_x), …, ξᵏ_x ≡ ξᵏ(ℒ_x), and we obtain the measurable vector fields in question.
Notice right away that the vector fields ξ¹, …, ξᵏ give a Borel isomorphism of ℒ restricted to U_k with the direct product U_k × ℝᵏ. To obtain this isomorphism one chooses first some orthonormal basis ζ¹, …, ζᵏ of ℝᵏ. Then any point (x, ξ) ∈ ℒ with ξ ∈ ℒ_x corresponds to the point (x, η) ∈ U_k × ℝᵏ such that η ∈ ℝᵏ has the same coordinates with respect to the basis {ζⁱ} as ξ has with respect to {ξⁱ_x}. This isomorphism preserves the length of all vectors, and so all the limits we are interested in here will remain the same. This is the reason why we can restrict ourselves to the case of trivial vector bundles.
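The projection-plus-Gram-Schmidt construction above can be sketched directly (the concrete subspace and reference basis are illustrative; as the text requires, the reference basis is chosen close enough to the subspace that its projections stay linearly independent):

```python
import numpy as np

def frame_from_subspace(basis0, span_vectors):
    """Project a reference orthonormal basis onto the subspace spanned by
    span_vectors and re-orthonormalize by Gram-Schmidt, as in the text's
    construction of measurable orthonormal vector fields."""
    S = np.array(span_vectors, dtype=float).T   # columns span the subspace L
    P = S @ np.linalg.pinv(S)                   # orthogonal projector onto L
    out = []
    for b in basis0:
        v = P @ np.asarray(b, dtype=float)      # projection of the reference vector
        for w in out:                           # Gram-Schmidt step
            v = v - (v @ w) * w
        v = v / np.linalg.norm(v)
        out.append(v)
    return out

# A 2-dimensional subspace of R^3 spanned by (1,1,0) and (0,0,1).
frame = frame_from_subspace([[1, 0, 0], [0, 0, 1]],
                            [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
```

The output is an orthonormal basis of the subspace depending continuously on the spanning data near the reference configuration, which is exactly what the measurable-selection argument patches together over finitely many neighborhoods.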
Let {𝒦_x, x ∈ M} be a family of Borel subsets of Π^(m-1). We shall call 𝒦 = ⋃_x (x, 𝒦_x) a Borel measurable subbundle of ΠE if 𝒦̂ = ⋃_x (x, 𝒦̂_x) with 𝒦̂_x ≡ ⋃_(u ∈ 𝒦_x) ū is a Borel measurable subbundle of E, where, recall, ū ∈ ℝ^m is a vector in the direction of u ∈ Π^(m-1).

Next we shall need the following construction. Let ν ∈ 𝒫(ΠE). The natural projection π : ΠE = M × Π^(m-1) → M is, clearly, a Borel map. Since M is a Borel subset of a metrizable compact space and Π^(m-1) is a metrizable compact space, by the disintegration theorem (see Bourbaki [8], Ch. 6, §3, no. 1, Theorem 1) one can write

ν = ∫ ν_x dρ(x)   (3.1)

where ρ = πν ∈ 𝒫(M) and the family of probability measures {ν_x, x ∈ M} on Π^(m-1) is determined by (3.1) uniquely for ρ-almost all x. Moreover, the map ν̃ : M → 𝒫(Π^(m-1)) given by the formula ν̃(x) = ν_x is a Borel map provided 𝒫(Π^(m-1)) is considered with the topology of weak convergence of measures. The representation (3.1) is connected also with the theory of measurable partitions (see Rohlin [40]), since π generates the partition of ΠE into the elements (x, Π^(m-1)).
For any set of non-zero vectors $\Gamma$ in $\mathbb{R}^m$ denote by $\hat\Gamma$ the corresponding set of points in $\Pi^{m-1}$, i.e., the set of all directions represented by $\Gamma$. For any measure $\nu\in\mathcal{P}(\Pi^{m-1})$ denote by $W(\nu)$ the minimal linear subspace $W$ of $\mathbb{R}^m$ satisfying $\nu(\hat W)=1$.
Lemma 3.1. Let $\nu\in\mathcal{P}(\Pi E)$, $W(\nu)=\bigcup_x(x,W_x(\nu))$ and $W_x(\nu)=W(\nu_x)$, where the family $\{\nu_x\}$ is defined by (3.1). Then $W(\nu)$ forms a Borel measurable subbundle of $E=M\times\mathbb{R}^m$.

Proof. Since $\nu_x$ depends measurably on $x$, it remains to show that $n(\nu)=\dim W(\nu)$ is a Borel function on $\mathcal{P}(\Pi^{m-1})$ considered with the topology of weak convergence, and that on each set $\mathcal{P}_k=\{\nu\in\mathcal{P}(\Pi^{m-1}):n(\nu)=k\}$ the map $\nu\to W(\nu)$ of $\mathcal{P}(\Pi^{m-1})$ into the Grassmann manifold is measurable.

We shall show even more. Namely, $n(\nu)$ turns out to be lower semi-continuous, i.e., each set $\{\nu:n(\nu)\le k\}$ is closed, and, besides, $W(\nu)$ depends continuously on $\nu$ on each $\mathcal{P}_k$.
To verify this, consider $\nu,\nu_\ell\in\mathcal{P}(\Pi^{m-1})$, $\ell=1,2,\ldots$, such that $\nu_\ell\to\nu$ in the weak sense. Then it is easy to see that
$$\operatorname{supp}\nu\subset\bigcap_{i=1}^{\infty}\overline{\bigcup_{j\ge i}\operatorname{supp}\nu_j}. \tag{3.2}$$
Indeed, if $Q$ is a closed subset then it is a standard fact that $\limsup_{\ell\to\infty}\nu_\ell(Q)\le\nu(Q)$ provided $\nu_\ell\xrightarrow{w}\nu$. Applying this to the closed sets $Q_i=\overline{\bigcup_{j\ge i}\operatorname{supp}\nu_j}$, on which $\nu_\ell(Q_i)=1$ for all $\ell\ge i$, gives $\nu(Q_i)=1$, which yields (3.2).

Since $\hat W(\nu)\supset\operatorname{supp}\nu$ for any $\nu\in\mathcal{P}(\Pi^{m-1})$, it follows from (3.2) that for any $u\in\operatorname{supp}\nu$ there exist a subsequence $\ell_i\to\infty$ and points $u_i\in\operatorname{supp}\nu_{\ell_i}$ such that $u_i\to u$ as $i\to\infty$. Hence if $\xi_1,\ldots,\xi_r$ is a basis of $W(\nu)$ then there exist a subsequence $\ell_i\to\infty$ and vectors $\xi_1^{(i)},\ldots,\xi_r^{(i)}\in W(\nu_{\ell_i})$ such that $\xi_j^{(i)}\to\xi_j$ as $i\to\infty$. This already implies that $\liminf_{i\to\infty}n(\nu_{\ell_i})\ge n(\nu)$ and that the set of all limit points of the subsequence $W(\nu_{\ell_i})$ contains $W(\nu)$. These arguments applied to each subsequence $\nu_{\ell_j}$ in place of the whole sequence $\nu_\ell$ yield that
$$\liminf_{\ell\to\infty}n(\nu_\ell)\ge n(\nu) \tag{3.3}$$
and the set of all limit points of the sequence $W(\nu_\ell)$ contains $W(\nu)$. But (3.3) means lower semi-continuity, and if all $n(\nu_\ell)$ are the same as $n(\nu)$ then $W(\nu_\ell)\to W(\nu)$ as $\ell\to\infty$ in the natural sense. This gives the continuity of $W(\nu)$ on each $\mathcal{P}_k$ and completes the proof of Lemma 3.1. $\blacksquare$
Lemma 3.2. Let $\rho\in\mathcal{P}(M)$ be an ergodic $P^*$-invariant measure and $\nu\in\mathcal{N}_\rho$, where $\mathcal{N}_\rho$ is defined in Theorem 1.2. Then there exists a Borel set $V(\nu)\subset M$ such that $\rho(V(\nu))=1$ and
$$\mathcal{J}_F(x)W_x(\nu)=W_{fx}(\nu) \tag{3.4}$$
for $n$-almost all $F$ provided $x\in V(\nu)$, where $W_x(\nu)$ is defined in Lemma 3.1 and, recall, $f=\pi F\pi^{-1}$. Furthermore, the dimension $n_x(\nu)=n(\nu_x)$ of $W_x(\nu)$ is a $P$-invariant function, and so it is equal to a constant for $\rho$-almost all $x$.
Proof. Put $\hat W(\nu)=\bigcup_x(x,\hat W_x(\nu))$; then
$$1=\nu(\hat W(\nu))=n*\nu(\hat W(\nu))=\int\int\nu_x\big(\mathcal{J}_F^{-1}(x)\hat W_{fx}(\nu)\big)\,d\rho(x)\,dn(F). \tag{3.5}$$
It follows from Lemma 3.1 that in the last integral we integrate a measurable function, and so it makes sense.

Next, (3.5) yields that
$$\nu_x\big(\mathcal{J}_F^{-1}(x)\hat W_{fx}(\nu)\big)=1\quad\rho\times n\text{-a.s.}, \tag{3.6}$$
i.e., for $\rho$-almost all $x$ and $n$-almost all $F$. Then by the Fubini theorem one can choose a Borel set $V(\nu)$ with $\rho(V(\nu))=1$ such that (3.6) holds for any $x\in V(\nu)$ and $n$-almost all $F$. By the minimality of $W_x(\nu)$ we conclude from here that for these $x$,
$$W_x(\nu)\subset\mathcal{J}_F^{-1}(x)W_{fx}(\nu)\quad n\text{-a.s.} \tag{3.7}$$
Then the dimension $n_x(\nu)$ of $W_x(\nu)$ satisfies
$$n_x(\nu)\le n_{fx}(\nu)\quad n\text{-a.s.} \tag{3.8}$$
Since $Pn_x(\nu)=\int n_{fx}(\nu)\,dn(F)$ and $\rho$ is $P^*$-invariant, one derives from (3.8) that $n_{fx}(\nu)=n_x(\nu)$ $\rho\times n$-a.s. This implies
$$Pn_x(\nu)=n_x(\nu)\quad\rho\text{-a.s.} \tag{3.9}$$
Since $\rho$ is ergodic, (3.9) yields
$$n_x(\nu)=\text{const}\quad\rho\text{-a.s.} \tag{3.10}$$
This together with (3.7) gives (3.4) provided $x\in V(\nu)$, proving Lemma 3.2. $\blacksquare$
As in the statement of Theorem 1.2 we shall say that a measurable subbundle $\mathcal{L}=\bigcup_x(x,\mathcal{L}_x)$ is $F$-invariant $\rho\times n$-a.s. if
$$\mathcal{J}_F(x)\mathcal{L}_x=\mathcal{L}_{fx}\quad\rho\times n\text{-a.s.} \tag{3.11}$$
If $\mathcal{L}=\bigcup_x(x,\mathcal{L}_x)$ is $F$-invariant $\rho\times n$-a.s. and $\rho$ is an ergodic $P^*$-invariant measure, then the dimension $d_x(\mathcal{L})=\dim\mathcal{L}_x$ satisfies $Pd_x(\mathcal{L})=d_x(\mathcal{L})$ $\rho$-a.s., and so $d_x(\mathcal{L})=d(\rho,\mathcal{L})=\text{const}$ $\rho$-a.s. Then, as we explained above, the subbundle $\mathcal{L}$ restricted to some Borel set $U(\mathcal{L})\subset M$ with $\rho(U(\mathcal{L}))=1$ is measurably isomorphic to the direct product $U(\mathcal{L})\times\mathbb{R}^{d(\rho,\mathcal{L})}$. This isomorphism is carried out by means of $d(\rho,\mathcal{L})$ measurable vector fields $\xi^1,\ldots,\xi^{d(\rho,\mathcal{L})}$ such that for each point $x$ the vectors $\xi_x^1,\ldots,\xi_x^{d(\rho,\mathcal{L})}$ form an orthonormal basis of $\mathcal{L}_x$. Choosing an orthonormal basis $\eta^1,\ldots,\eta^{d(\rho,\mathcal{L})}$ of $\mathbb{R}^{d(\rho,\mathcal{L})}$ one obtains an isomorphism by mapping points $(x,\xi)\in\mathcal{L}$, $x\in U(\mathcal{L})$, to $(x,\eta)\in U(\mathcal{L})\times\mathbb{R}^{d(\rho,\mathcal{L})}$ provided $\eta$ has the same coordinates with respect to $\{\eta^i\}$ as $\xi$ has with respect to $\{\xi_x^i\}$. This isomorphism can be represented by some family of linear maps $J(x):\mathcal{L}_x\to\mathbb{R}^{d(\rho,\mathcal{L})}$ defined for $x\in U(\mathcal{L})$ and such that $(x,\xi)\in\mathcal{L}$ corresponds to $(x,J(x)\xi)\in U(\mathcal{L})\times\mathbb{R}^{d(\rho,\mathcal{L})}$.
Next, for each $F\in\mathfrak{F}$ we shall define the corresponding action $F^{\mathcal{L}}$ on $E(\rho,\mathcal{L})=M\times\mathbb{R}^{d(\rho,\mathcal{L})}$ in the following way. If the pair $(x,F)$ satisfies (3.11) and $x\in U(\mathcal{L})$, then $F^{\mathcal{L}}(x,\eta)=(fx,\mathcal{J}_F^{\mathcal{L}}(x)\eta)$ where $f=\pi F\pi^{-1}$ and
$$\mathcal{J}_F^{\mathcal{L}}(x)=J(fx)\,\mathcal{J}_F(x)\,J^{-1}(x). \tag{3.12}$$
If the pair $(x,F)$ does not satisfy the above conditions then we also set $F^{\mathcal{L}}(x,\eta)=(fx,\mathcal{J}_F^{\mathcal{L}}(x)\eta)$, but in this case we take for $\mathcal{J}_F^{\mathcal{L}}(x)$ the identity matrix. As the outcome we obtain a random bundle map $F^{\mathcal{L}}$ on $E(\rho,\mathcal{L})$.
The following equality of Euclidean norms follows immediately from the construction, since each $J(x)$ is an isometry:
$$\|\mathcal{J}_F^{\mathcal{L}}(x)\eta\|=\|\mathcal{J}_F(x)\xi\|\quad\text{whenever }\eta=J(x)\xi,\ \xi\in\mathcal{L}_x,\ x\in U(\mathcal{L}). \tag{3.13}$$
Since $\rho$ is $P^*$-invariant one derives from (3.12) and (3.13) that $\rho$-a.s.
$$\|\mathcal{J}^{\mathcal{L}}_{F_n}(f^{n-1}x)\cdots\mathcal{J}^{\mathcal{L}}_{F_1}(x)\,J(x)\xi\|=\|\mathcal{J}_{F_n}(f^{n-1}x)\cdots\mathcal{J}_{F_1}(x)\xi\| \tag{3.14}$$
provided $x\in U(\mathcal{L})$ and $\xi\in\mathcal{L}_x$, where, recall, $F_1,F_2,\ldots$ are mutually independent with the common distribution $n$ and $f^i=\pi F_i\circ\cdots\circ F_1\pi^{-1}$. Indeed, with probability one $f^ix\in U(\mathcal{L})$ for all $i=1,2,\ldots$ provided $x\in U(\mathcal{L})$.
Consider now the Markov chain $y_n^{\mathcal{L}}=F_n^{\mathcal{L}}\circ\cdots\circ F_1^{\mathcal{L}}\,y_0^{\mathcal{L}}$ on $\Pi E(\rho,\mathcal{L})=M\times\Pi^{d(\rho,\mathcal{L})-1}$, where $y_0^{\mathcal{L}}$ is a $\Pi E(\rho,\mathcal{L})$-valued random variable independent of all $F_1,F_2,\ldots$. Here $F^{\mathcal{L}}$ acts on $\Pi E(\rho,\mathcal{L})$ by the formula $F^{\mathcal{L}}(x,u)=(fx,\mathcal{J}_F^{\mathcal{L}}(x)u)$, where the linear transformation $\mathcal{J}_F^{\mathcal{L}}(x)$ naturally acts on the projective space as well. Next, we can apply to the Markov chain $y_n^{\mathcal{L}}$ the same arguments as we did in the previous section with respect to the Markov chain $y_n$ in order to obtain Theorem 2.1. These together with (3.14) yield that with probability one
$$\lim_{n\to\infty}\frac1n\log\|\mathcal{J}_{F_n}(f^{n-1}x)\cdots\mathcal{J}_{F_1}(x)\xi\|=\beta(\rho,\mathcal{L})\quad\text{for all }\xi\in\mathcal{L}_x\setminus\{0\} \tag{3.15}$$
provided $x\in U(\mathcal{L})$, where $\beta(\rho,\mathcal{L})$ is some (non-random) number. The value $\beta(\rho,\mathcal{L})$ characterizes the rate of growth of compositions of random bundle maps along an $F$-invariant $\rho\times n$-a.s. measurable subbundle $\mathcal{L}$.
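The limit in (3.15) is exactly the quantity one estimates in practice: iterate random matrices applied to a vector, accumulate the logarithm of the norm while renormalizing, and divide by the number of steps. A minimal Monte Carlo sketch (NumPy; the two-matrix distribution below is a made-up example, not one from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical i.i.d. distribution on two matrices (illustration only).
mats = [np.array([[2.0, 1.0], [0.0, 0.5]]),
        np.array([[0.5, 0.0], [1.0, 2.0]])]

def growth_rate(n_steps=20000):
    """Estimate lim (1/n) log ||J_{F_n} ... J_{F_1} xi|| for a fixed vector xi."""
    xi = np.array([1.0, 0.0])
    total = 0.0
    for _ in range(n_steps):
        xi = mats[rng.integers(2)] @ xi
        norm = np.linalg.norm(xi)
        total += np.log(norm)   # telescopes to log ||product . xi||
        xi /= norm              # renormalize to avoid overflow
    return total / n_steps

beta = growth_rate()
```

The renormalization does not change the accumulated sum, which telescopes exactly to $\log\|{}_n\mathcal{J}\,\xi\|$; it only keeps the iterates in floating-point range.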
Let $\mathcal{L}'=\bigcup_x(x,\mathcal{L}'_x)$ and $\mathcal{L}''=\bigcup_x(x,\mathcal{L}''_x)$ be two $F$-invariant $\rho\times n$-a.s. measurable subbundles. Then we can introduce the following partial order:
$$\mathcal{L}'\succ\mathcal{L}''\quad\text{iff}\quad\mathcal{L}'_x\supset\mathcal{L}''_x\quad\rho\text{-a.s.} \tag{3.16}$$
This yields also an equivalence relation:
$$\mathcal{L}'\sim\mathcal{L}''\quad\text{iff}\quad\mathcal{L}'\succ\mathcal{L}''\text{ and }\mathcal{L}''\succ\mathcal{L}'. \tag{3.17}$$
Clearly,
$$\text{if }\mathcal{L}'\succ\mathcal{L}''\text{ then }\beta(\rho,\mathcal{L}')\ge\beta(\rho,\mathcal{L}'')\text{ and }d(\rho,\mathcal{L}')\ge d(\rho,\mathcal{L}''), \tag{3.18}$$
where, recall, $d(\rho,\mathcal{L})$ is the common dimension of $\mathcal{L}_x$ for $\rho$-almost all $x$ ($\rho$ is ergodic!).
Let $\mathcal{L}'+\mathcal{L}''=\bigcup_x(x,\mathcal{L}'_x+\mathcal{L}''_x)$ be the measurable subbundle with fibres $\mathcal{L}'_x+\mathcal{L}''_x$, where $\mathcal{L}'_x+\mathcal{L}''_x$ denotes the minimal linear subspace of $\mathbb{R}^m$ containing both $\mathcal{L}'_x$ and $\mathcal{L}''_x$. Then it is easy to see that if $\mathcal{L}'$ and $\mathcal{L}''$ are $F$-invariant $\rho\times n$-a.s. then $\mathcal{L}'+\mathcal{L}''$ is $F$-invariant $\rho\times n$-a.s., as well. Moreover,
$$d(\rho,\mathcal{L}'+\mathcal{L}'')\ge\max\big(d(\rho,\mathcal{L}'),d(\rho,\mathcal{L}'')\big)\quad\text{and}\quad\beta(\rho,\mathcal{L}'+\mathcal{L}'')=\max\big(\beta(\rho,\mathcal{L}'),\beta(\rho,\mathcal{L}'')\big). \tag{3.19}$$
If neither $\mathcal{L}'\succ\mathcal{L}''$ nor $\mathcal{L}''\succ\mathcal{L}'$ then the first inequality in (3.19) is strict.
Let $\nu\in\mathcal{N}_\rho$ and $\rho\in\mathcal{P}(M)$ both be ergodic measures. By Lemmas 3.1 and 3.2 the subbundle $W(\nu)$ is measurable and $F$-invariant $\rho\times n$-a.s. Since $\nu(\hat W(\nu))=1$ it follows from (2.30) that
$$\beta(\rho,W(\nu))=\gamma(\nu). \tag{3.20}$$
Denote by $\Sigma$ the collection of all $F$-invariant $\rho\times n$-a.s. measurable subbundles $\mathcal{L}$ satisfying $\beta(\rho,\mathcal{L})<\beta_0(\rho)$, where $\beta_0(\rho)$ is defined in (2.38). If $\Sigma$ is empty then by (3.20) and Remark 2.2 it follows that $\gamma(\nu)$ takes on the same value for all $\nu\in\mathcal{N}_\rho$. Therefore by Theorem 2.1 we obtain that the limit (1.18) is always equal to $\beta_0(\rho)$, and so Theorem 1.2 follows with $r(\rho)=0$, i.e., the filtration (1.15) is trivial.
Suppose now that $\Sigma$ is not empty. Then it follows from Theorem 3.1 that there exists $\nu\in\mathcal{N}_\rho$ with $W(\nu)\in\Sigma$. Notice that if $\mathcal{L}'$ and $\mathcal{L}''$ are $F$-invariant $\rho\times n$-a.s., $\mathcal{L}'\succ\mathcal{L}''$ and $d(\rho,\mathcal{L}')=d(\rho,\mathcal{L}'')$, then $\mathcal{L}'\sim\mathcal{L}''$. Since $d(\rho,\mathcal{L})\le m$, one can see from (3.16)–(3.19) that $\Sigma$ has a maximal element $\mathcal{L}_{\max}$ which is uniquely determined up to the equivalence.

It is clear that in each linearly ordered chain of subbundles $\mathcal{L}$ from $\Sigma$ the dimension $d(\rho,\mathcal{L})$ can jump no more than $m$ times. Hence such a chain can contain no more than $m$ subbundles which are different up to the equivalence. Therefore
$$\beta_1(\rho)=\beta(\rho,\mathcal{L}_{\max})=\max_{\mathcal{L}\in\Sigma}\beta(\rho,\mathcal{L})<\beta_0(\rho). \tag{3.21}$$
Now we are going to show that $\mathcal{L}_{\max}$ can be taken as $\mathcal{L}_1$ in (1.15), i.e., for $\rho$-almost all $x$, if $\xi\in\mathbb{R}^m\setminus\mathcal{L}_{\max,x}$ then with probability one (cf. Proposition 3.8 of Furstenberg and Kifer [17])
$$\lim_{n\to\infty}\frac1n\log\|\mathcal{J}_{F_n}(f^{n-1}x)\cdots\mathcal{J}_{F_1}(x)\xi\|=\beta_0(\rho). \tag{3.22}$$
To prove (3.22) we shall need the following construction. Suppose that $\mathcal{L}$ is an $F$-invariant $\rho\times n$-a.s. measurable subbundle. Consider the factor $E/\mathcal{L}$ where two points $(x,\xi)$ and $(x,\zeta)$ of $E$ are identified if $\xi-\zeta\in\mathcal{L}_x$. In this way we obtain a vector bundle over the base $M$ with the fibres $\mathbb{R}^m/\mathcal{L}_x$. If $\mathcal{J}_F(x)\mathcal{L}_x=\mathcal{L}_{fx}$, $f=\pi F\pi^{-1}$, then $\mathcal{J}_F(x)$ naturally acts on $\mathbb{R}^m/\mathcal{L}_x$ as well, transforming it into $\mathbb{R}^m/\mathcal{L}_{fx}$. Since $\mathcal{L}$ is $F$-invariant $\rho\times n$-a.s., then
$$\mathcal{J}_F(x)\big(\mathbb{R}^m/\mathcal{L}_x\big)=\mathbb{R}^m/\mathcal{L}_{fx}\quad\rho\times n\text{-a.s.} \tag{3.23}$$
The elements of $\mathbb{R}^m/\mathcal{L}_x$ can be written symbolically in the form $\xi+\mathcal{L}_x$, where $\xi\in\mathbb{R}^m$. Then the above action is given by the formula $\mathcal{J}_F(x)(\xi+\mathcal{L}_x)=\mathcal{J}_F(x)\xi+\mathcal{L}_{fx}$.
Since $\mathcal{L}$ is measurable one can choose $m$ measurable vector fields $\xi^1,\ldots,\xi^m$ on $M$ such that $\xi_x^1,\ldots,\xi_x^m$ for each $x$ form an orthonormal basis of $\mathbb{R}^m$ and $\xi_x^1,\ldots,\xi_x^{d(\rho,\mathcal{L})}$ form an orthonormal basis of $\mathcal{L}_x$ for $\rho$-almost all $x$. In the same way as we obtained above a measurable isomorphism of $\mathcal{L}$ restricted to $U(\mathcal{L})\subset M$ with $U(\mathcal{L})\times\mathbb{R}^{d(\rho,\mathcal{L})}$, one can construct a measurable isomorphism of $E/\mathcal{L}$ restricted to $U(\mathcal{L})$ with $U(\mathcal{L})\times\mathbb{R}^{m-d(\rho,\mathcal{L})}$ by means of the vector fields $\xi^{d(\rho,\mathcal{L})+1},\ldots,\xi^m$.

Define the norm $|||\xi+\mathcal{L}_x|||$ of $\xi+\mathcal{L}_x\in\mathbb{R}^m/\mathcal{L}_x$ as the Euclidean distance of $\xi$ from $\mathcal{L}_x$, i.e., $|||\xi+\mathcal{L}_x|||=\inf_{\zeta\in\mathcal{L}_x}\|\xi+\zeta\|$.
This definition is correct, and if (3.23) holds one obtains also the norm of $\mathcal{J}_F(x)$ acting on the quotient as
$$|||\mathcal{J}_F(x)|||=\sup_{\xi:\,|||\xi+\mathcal{L}_x|||=1}|||\mathcal{J}_F(x)\xi+\mathcal{L}_{fx}|||. \tag{3.24}$$
Since $n$-a.s. $f^ix=\pi F_i\circ\cdots\circ F_1\pi^{-1}x\in U(\mathcal{L})$ for all $i=1,2,\ldots$ provided $x\in U(\mathcal{L})$, then in the same way as above we can apply the arguments of the previous section to obtain that with probability one the limit
$$\beta(\rho,E/\mathcal{L})=\lim_{n\to\infty}\frac1n\log|||\mathcal{J}_{F_n}(f^{n-1}x)\cdots\mathcal{J}_{F_1}(x)||| \tag{3.25}$$
exists and is non-random.
In the same way as Lemma 3.6 of Furstenberg and Kifer [17] one proves

Lemma 3.3. Let $\mathcal{L}$ be an $F$-invariant $\rho\times n$-a.s. measurable subbundle. Then
$$\beta_0(\rho)=\max\{\beta(\rho,\mathcal{L}),\ \beta(\rho,E/\mathcal{L})\}. \tag{3.26}$$

Proof. Choose, as above, $m$ measurable vector fields $\xi^1,\ldots,\xi^m$ on $M$ such that $\xi_x^1,\ldots,\xi_x^m$ for each $x$ form an orthonormal basis of $\mathbb{R}^m$ and $\xi_x^1,\ldots,\xi_x^{d(\rho,\mathcal{L})}$ form an orthonormal basis of $\mathcal{L}_x$ for all $x\in U(\mathcal{L})$. Here $U(\mathcal{L})$ is a measurable subset of $M$ such that $\rho(U(\mathcal{L}))=1$, for any $x\in U(\mathcal{L})$ the relation (3.11) holds $n$-a.s., and $d_x(\mathcal{L})=d(\rho,\mathcal{L})$. For those $x$ and $n$-almost all $F$ the matrices $\mathcal{J}_F(x)$ have in the above basis the block-triangular form
$$\mathcal{J}_F(x)=\begin{pmatrix}\mathcal{J}_F^{11}(x)&\mathcal{J}_F^{12}(x)\\0&\mathcal{J}_F^{22}(x)\end{pmatrix},$$
where the $\mathcal{J}_F^{ij}(x)$ are submatrices, $\mathcal{J}_F^{11}(x)$ corresponds to the restriction of $\mathcal{J}_F(x)$ to $\mathcal{L}_x$ and $\mathcal{J}_F^{22}(x)$ corresponds to the action of $\mathcal{J}_F(x)$ on $\mathbb{R}^m/\mathcal{L}_x$.
It is easy to see that
$$\beta(\rho,\mathcal{L})\le\beta_0(\rho)\quad\text{and}\quad\beta(\rho,E/\mathcal{L})\le\beta_0(\rho). \tag{3.27}$$
Now suppose that both inequalities in (3.27) are strict. Since $\rho$ is $P^*$-invariant, $f^ix\in U(\mathcal{L})$ $\rho$-a.s. for all $i=1,2,\ldots$ provided $x\in U(\mathcal{L})$. Therefore for any $x\in U(\mathcal{L})$ with probability one
$$\mathcal{J}_{F_N\circ\cdots\circ F_1}(x)=\begin{pmatrix}\mathcal{J}^{11}_{F_N\circ\cdots\circ F_1}(x)&\mathcal{J}^{12}_{F_N\circ\cdots\circ F_1}(x)\\0&\mathcal{J}^{22}_{F_N\circ\cdots\circ F_1}(x)\end{pmatrix}, \tag{3.28}$$
where the diagonal blocks are the products $\mathcal{J}^{11}_{F_N}(f^{N-1}x)\cdots\mathcal{J}^{11}_{F_1}(x)$ and $\mathcal{J}^{22}_{F_N}(f^{N-1}x)\cdots\mathcal{J}^{22}_{F_1}(x)$, respectively.
Let $\varepsilon>0$. When $N$ is large, then with probability close to one, for all $x$ and $y$ belonging to some set $Q_\varepsilon^{(N)}$ satisfying $\rho(M\setminus Q_\varepsilon^{(N)})\le\delta_N\to0$ as $N\to\infty$ one will have
$$\|\mathcal{J}^{11}_{F_N\circ\cdots\circ F_1}(x)\|\le e^{(1+\varepsilon)N\beta(\rho,\mathcal{L})}\quad\text{and}\quad\|\mathcal{J}^{22}_{F_N\circ\cdots\circ F_1}(y)\|\le e^{(1+\varepsilon)N\beta(\rho,E/\mathcal{L})}. \tag{3.29}$$
Since $\rho$ is $P^*$-invariant, then
$$\rho(M\setminus Q_\varepsilon^{(N)})=\int\cdots\int\rho\big((f_N\circ\cdots\circ f_1)^{-1}(M\setminus Q_\varepsilon^{(N)})\big)\,dn(F_1)\cdots dn(F_N) \tag{3.30}$$
and so
$$n\times\cdots\times n\big\{(F_1,\ldots,F_N):\rho\big((f_N\circ\cdots\circ f_1)^{-1}(M\setminus Q_\varepsilon^{(N)})\big)>\sqrt{\delta_N}\big\}\le\sqrt{\delta_N}. \tag{3.31}$$
Hence there exists a measurable subset $\tilde Q_\varepsilon^{(N)}\subset Q_\varepsilon^{(N)}$ such that $\rho(M\setminus\tilde Q_\varepsilon^{(N)})\le\delta_N+\sqrt{\delta_N}$ and, with probability close to one, $f_N\circ\cdots\circ f_1x\in Q_\varepsilon^{(N)}$ provided $x\in\tilde Q_\varepsilon^{(N)}$. Thus by (3.28) and (3.29) it follows that for $x\in\tilde Q_\varepsilon^{(N)}$ with probability close to one
$$\|\mathcal{J}_{F_{2N}\circ\cdots\circ F_1}(x)\|\le e^{(1+\varepsilon)N(\beta_0(\rho)+\beta(\rho,\mathcal{L}))}+e^{(1+\varepsilon)N(\beta_0(\rho)+\beta(\rho,E/\mathcal{L}))}. \tag{3.32}$$
On the other hand, if $N$ is chosen big enough then by (2.44), with probability close to one, for all $x$ belonging to some set $Q_\varepsilon^{(2N)}$ satisfying $\rho(M\setminus Q_\varepsilon^{(2N)})\le\delta_{2N}\to0$ as $N\to\infty$ it follows that
$$\|\mathcal{J}_{F_{2N}\circ\cdots\circ F_1}(x)\|\ge e^{(1-\varepsilon)2N\beta_0(\rho)}. \tag{3.33}$$
Since $\rho(Q_\varepsilon^{(N)}\cap Q_\varepsilon^{(2N)})\ge1-\delta_N-\sqrt{\delta_N}-\delta_{2N}$, we conclude from (3.32) and (3.33) that both inequalities in (3.27) cannot be strict. This completes the proof of Lemma 3.3. $\blacksquare$
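The mechanism behind Lemma 3.3 can be probed numerically: for products of random block upper-triangular matrices, the top growth rate equals the larger of the growth rates of the two diagonal blocks. A toy sketch (made-up $2\times2$ matrices, not from the text) in which the first diagonal entry has exponent $\frac12\log2+\frac12\log\frac12=0$ and the second has exponent $\log1.5$:

```python
import numpy as np

rng = np.random.default_rng(3)

def block_triangular_exponent(n_steps=20000):
    """Top exponent of a product of random upper-triangular 2x2 matrices."""
    P = np.eye(2)
    total = 0.0
    for _ in range(n_steps):
        d1 = 2.0 if rng.random() < 0.5 else 0.5       # block acting on L_x: exponent 0
        M = np.array([[d1, rng.normal()],
                      [0.0, 1.5]])                    # block acting on E/L: exponent log 1.5
        P = M @ P
        norm = np.linalg.norm(P, 2)
        total += np.log(norm)
        P /= norm                                     # renormalize to avoid overflow
    return total / n_steps

beta0 = block_triangular_exponent()
```

The estimate should settle near $\max(0,\log1.5)=\log1.5$, illustrating (3.26) for this hypothetical example.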
Now we are able to prove (3.22). Since by (3.21) one has $\beta(\rho,\mathcal{L}_{\max})<\beta_0(\rho)$, then by (3.26), $\beta(\rho,E/\mathcal{L}_{\max})=\beta_0(\rho)$. Applying the arguments of Theorem 2.1 to the vector bundle $E/\mathcal{L}_{\max}$ we conclude that either (3.22) holds with probability one provided $\xi\in\mathbb{R}^m\setminus\mathcal{L}_{\max,x}$, or, in view of Lemma 3.2, there exists an $F$-invariant $\rho\times n$-a.s. non-trivial measurable subbundle $\mathcal{A}$ of $E/\mathcal{L}_{\max}$ with $\beta(\rho,\mathcal{A})<\beta_0(\rho)$. Hence (cf. Lemma 3.7 of Furstenberg and Kifer [17]) there exists an $F$-invariant $\rho\times n$-a.s. measurable subbundle $\tilde{\mathcal{L}}$ of $E$ such that $\tilde{\mathcal{L}}\succ\mathcal{L}_{\max}$ and $d(\rho,\tilde{\mathcal{L}})>d(\rho,\mathcal{L}_{\max})$. This contradicts the maximality of $\mathcal{L}_{\max}$ and finally proves (3.22).

Now we put $\mathcal{L}_1=\mathcal{L}_{\max}$. To get the next term in the filtration (1.15) we repeat the above arguments with $\mathcal{L}_1$ in place of the whole $E$. Since $\mathcal{L}_1$ is measurably isomorphic to the corresponding direct product and is $F$-invariant $\rho\times n$-a.s., this leads in the same way as above to the construction of $\mathcal{L}_2$, and so on.
Let $\mathcal{L}$ be an $F$-invariant $\rho\times n$-a.s. measurable subbundle. Then it follows from Theorem 2.1 and Remark 2.1, applied to $\mathcal{L}$ in place of $E$, that there exists an ergodic $\nu\in\mathcal{N}_\rho$ such that $\beta(\rho,\mathcal{L})=\gamma(\nu)$. Taking into account also (3.20), one concludes that the numbers $\beta_i(\rho)$ constructed here are the values which the integrals $\gamma(\nu)$ take on for different ergodic measures $\nu\in\mathcal{N}_\rho$. This completes the proof of Theorem 1.2. $\blacksquare$
Chapter IV

Further study of invariant subbundles and characteristic exponents.

Under mild additional hypotheses we shall be able to prove the continuity of the invariant subbundles constructed in the previous chapter. Then we shall establish conditions providing stability and positivity of the biggest characteristic exponent.
4.1. Continuity of invariant subbundles.

It is always important in mathematics to get some regularity properties of the objects under consideration. In this section we shall establish certain conditions for the continuity of $F$-invariant subbundles. The first condition we introduce below relies only on the properties of the Markov chain $X_n$ in the base space $M$.
We shall use the notation of the previous chapter. Suppose that $M$ is a compact metric space and $F_i=(f_i,\mathcal{J}_{F_i})$, $i=1,2,\ldots$ are independent random bundle maps with a distribution $n$ such that $n$-almost surely $f=\pi F\pi^{-1}$ is a continuous map of $M$ and $\mathcal{J}_F(x)$ is a matrix-function continuous in $x$ with $\det\mathcal{J}_F(x)\ne0$. Under these circumstances we have the following result, which was obtained in a conversation with M. Brin.

Theorem 1.1. Suppose that $\rho\in\mathcal{P}(M)$ is a $P^*$-invariant ergodic measure and for each $x$ the transition probability $P(x,Q)=n\{F:fx\in Q\}$ has a density $p(x,y)$ with respect to some fixed Borel measure $\mu\in\mathcal{P}(M)$, i.e., $P(x,Q)=\int_Q p(x,y)\,d\mu(y)$; this density is continuous in both arguments and $p(x,y)=0$ for $y\notin\operatorname{supp}\mu$. Then any subbundle $L=\{L_x\}$ of $E=M\times\mathbb{R}^m$ which is $F$-invariant in the sense of (III.3.11) and defined for $\rho$-almost all $x$ is continuous on an open subset of $M$ having $\rho$-measure equal to one.
Remark 1.1. It is clear that instead of compactness of $M$ we can require the compactness of $\operatorname{supp}\rho$.

Remark 1.2. The assumption that $P(x,\cdot)$ has a density is natural for Markov processes, but it has nothing in common with the situation one usually encounters in the theory of deterministic dynamical systems, since in the latter case $P(x,\cdot)$ is the Dirac $\delta$-measure.
Proof of Theorem 1.1. Since $\rho$ is $P^*$-invariant, i.e., $\rho(Q)=\int P(x,Q)\,d\rho(x)$ for any Borel set $Q$, and $P(x,\cdot)$ has a continuous density $p(x,y)$ with respect to $\mu$, then $\rho$ also has a density $\bar p$ satisfying
$$\int\bar p(x)p(x,y)\,d\mu(x)=\bar p(y), \tag{1.1}$$
and so $\bar p$ is also continuous.

Put $Q_+=\{x:\bar p(x)>0\}$; then $Q_+$ is an open set and $\rho(Q_+)=1$. For any $x\in Q_+$ and a Borel set $\Gamma$ one has
$$P(x,\Gamma)=\int_\Gamma p(x,y)\,d\mu(y)=\int_\Gamma\frac{p(x,y)}{\bar p(y)}\,d\rho(y). \tag{1.2}$$
From (1.1), the definition of $Q_+$ and the continuity of $p(x,y)$ it follows that
$$\varkappa_1(\delta)=\sup_{\substack{x,y\in Q_+\\ \operatorname{dist}(y,M\setminus Q_+)\le\delta}}p(x,y)\to0\quad\text{as }\delta\to0. \tag{1.3}$$
On the other hand,
$$\varkappa_2(\delta)=\inf_{\substack{y\in Q_+\\ \operatorname{dist}(y,M\setminus Q_+)\ge\delta}}\bar p(y)>0. \tag{1.4}$$
Thus
$$P\big(x,\{y\in Q_+:\operatorname{dist}(y,M\setminus Q_+)\le\delta\}\big)\le\varkappa_1(\delta)\,\mu(Q_+)\quad\text{for any }x\in Q_+. \tag{1.5}$$
Now let $L=\{L_x\}$ be a Borel measurable $F$-invariant subbundle of $E$ defined for $\rho$-almost all $x$. Then by some modification of Lusin's theorem (see Hewitt and Stromberg [19]) one can derive that there exists a closed set $\Gamma_0$ on which $L=\{L_x\}$ is continuous and
$$\rho(\Gamma_0)>1-\varkappa_2(\delta)\Big/\Big(8\sup_{x,y}p(x,y)\Big),$$
where $\delta$ is chosen to satisfy $\varkappa_1(\delta)<(8\mu(Q_+))^{-1}$. Hence by (1.2)–(1.5) we have $P(x,Q_+\setminus\Gamma_0)\le\frac14$, and so
$$P(x,\Gamma_0)\ge\frac34\quad\text{for any }x\in Q_+. \tag{1.6}$$
Since we assume that for $n$-almost all $F=(f,\mathcal{J}_F)$ the map $f$ is continuous and $\mathcal{J}_F(x)$ is a matrix-function continuous in $x$ such that $\det\mathcal{J}_F(x)\ne0$, we have $n$-a.s.
$$d_F(\delta)=\sup_{\operatorname{dist}(x,y)\le\delta}\big(\operatorname{dist}(fx,fy)+\|\mathcal{J}_F^{-1}(x)-\mathcal{J}_F^{-1}(y)\|\big)<\infty$$
for any $\delta>0$, and
$$d_F(\delta)\to0\quad\text{as }\delta\to0\quad n\text{-a.s.} \tag{1.7}$$
Choose a sequence $\varepsilon_n\to0$ as $n\to\infty$. Then by (1.7) there exists another sequence $\delta_n\to0$ such that
$$n\{F:d_F(\delta_n)<\varepsilon_n\}\ge\tfrac34. \tag{1.8}$$
Denote the set in brackets in (1.8) by $G_n$. Then (1.6) together with (1.8) yields that for any two points $x,y\in Q_+$ there exists $F_{x,y}\in G_n$ such that $f_{x,y}x\in\Gamma_0$ and $f_{x,y}y\in\Gamma_0$, where $f_{x,y}=\pi F_{x,y}\pi^{-1}$. Now let $\operatorname{dist}(x,y)\le\delta_n$; then $\operatorname{dist}(f_{x,y}x,f_{x,y}y)\le\varepsilon_n$, and from the uniform continuity of $L$ on $\Gamma_0$ it follows that the distance between $L(f_{x,y}x)$ and $L(f_{x,y}y)$ in the corresponding Grassmann manifold does not exceed some number $\varkappa(\varepsilon_n)\to0$ as $\varepsilon_n\to0$. Since $\mathcal{J}_F(x)L_x=L_{fx}$ $\rho\times n$-a.s., without loss of generality we can assume that $\mathcal{J}_{F_{x,y}}^{-1}(x)L_{f_{x,y}x}=L_x$ and $\mathcal{J}_{F_{x,y}}^{-1}(y)L_{f_{x,y}y}=L_y$. By the definition of $d_{F_{x,y}}(\delta_n)$ this implies that the distance between $L_x$ and $L_y$ is no more than $\varepsilon_n+\varkappa(\varepsilon_n)$, which tends to zero as $n\to\infty$. This gives the continuity of $L=\{L_x\}$ for all $x\in Q_+$, and since $\rho(Q_+)=1$ the proof is complete. $\blacksquare$
Remark 1.3. It is important to understand which properties of the stochastic flows which we shall study in the next chapter depend only on the diffusion process in question and its generator, and are independent of its specific construction by means of stochastic differential equations. Theorem 1.1 says that all invariant subbundles are continuous provided the transition probability of the corresponding Markov process has a density with respect to some fixed measure. In the case of stochastic flows one can derive that the transition probability of the diffusion in question has a density with respect to the Riemannian volume if its generator is elliptic (see Ichihara and Kunita [21]).
Next, we shall give another sufficient condition for continuity of invariant subbundles, which is a little unwieldy but can be formulated in terms of random bundle maps and does not involve transition probabilities. This condition requires that for $\rho$-almost all points $x$ and $n$-almost all $F$ a small neighborhood of $fx$ can be covered by the images of $x$ under the actions of all $\tilde F$ close to $F$. To give the precise statement, define the distance between bundle maps at $x$ by
$$d_x(F,\tilde F)=\operatorname{dist}(\pi F\pi^{-1}x,\pi\tilde F\pi^{-1}x)+\|\mathcal{J}_F(x)-\mathcal{J}_{\tilde F}(x)\|.$$

Theorem 1.2. Suppose that $M$ is a compact metric space, $\rho\in\mathcal{P}(M)$ is an ergodic $P^*$-invariant measure and $L=\{L_x\}$ is an $F$-invariant (in the sense of (III.3.11)) Borel measurable subbundle. Denote by $\mathcal{F}_x$ the set of $F\in\mathfrak{F}$ which satisfy (III.3.11) for a fixed $x\in M$. Assume that $\operatorname{supp}n\subset\mathfrak{F}$ is large enough so that for $\rho$-almost all $x$ and for $n$-almost all $F\in\mathfrak{F}$ the following is true: for each $\alpha>0$ one can find $\delta>0$ (depending on $x$ and $F$) such that
$$\bigcup_{\tilde F\in\mathcal{F}_x,\ d_x(F,\tilde F)<\alpha}\{y:y=\pi\tilde F\pi^{-1}x\}\supset U_\delta(\pi F\pi^{-1}x),$$
where $U_\delta(y)=\{z:\operatorname{dist}(y,z)<\delta\}$. Then the subbundle $L=\{L_x\}$ is continuous on an open set having $\rho$-measure one.

Proof. It is easy to see that in our circumstances the subbundle $L=\{L_x\}$ is continuous in a neighborhood $V(fx)$ of $fx$ for $\rho$-almost all $x$ and $n$-almost all $F$. Take the union of these neighborhoods, $V=\bigcup V(fx)$. Then $L$ is continuous on $V$. On the other hand, $P(x,V)=1$ for $\rho$-almost all $x$, and so $\rho(V)=\int d\rho(x)P(x,V)=1$, proving the theorem. $\blacksquare$
4.2. Stability of the biggest exponent.

We have proved in Theorem III.2.1 (see also Remark III.2.1) that the maximal characteristic exponent $\beta_0(\rho)$ corresponding to $\rho$ can be represented as
$$\beta_0(\rho)=\gamma(\nu_\rho) \tag{2.1}$$
for some ergodic measure $\nu_\rho\in\mathcal{N}_\rho$, where $\gamma(\nu)$ is the integral (III.1.19). Up to this point we considered the distribution $n$ as fixed, and the dependence of $\beta_0(\rho)$ on $n$ was of no importance for us. Now we are going to change this point of view and to study the stability properties of $\beta_0(\rho)$ when $n$ is perturbed in the weak sense. We shall indicate the dependence of $\beta_0(\rho)$ on $n$ by writing $\beta_0(\rho,n)$. Similarly, the integral $\gamma(\nu)$ depends on $n$, which we shall express by $\gamma(\nu,n)$.

Actually, the existence of the limit (III.1.17) follows at once from Theorem I.2.2. Indeed, if $a_n(x,\omega)=\log\|{}_n\mathcal{J}(x,\omega)\|$, where $\omega=(F_1,F_2,\ldots)$ and ${}_n\mathcal{J}(x,\omega)=\mathcal{J}_{F_n}(f^{n-1}x)\cdots\mathcal{J}_{F_1}(x)$, then
$$a_{n+k}(x,\omega)\le a_k(f^nx,\theta^n\omega)+a_n(x,\omega),$$
where $\theta$ is the shift $\theta(F_1,F_2,\ldots)=(F_2,F_3,\ldots)$, and so $a_n(x,\omega)$ forms a subadditive process which will be stationary and ergodic provided we fix some $P^*$-invariant ergodic measure $\rho\in\mathcal{P}(M)$. Now applying Theorem I.2.2 we obtain another representation of the biggest exponent:
$$\beta_0(\rho,n)=\inf_n\frac1n\int\int\log\|{}_n\mathcal{J}(x,\omega)\|\,d\rho(x)\,dP(\omega)=\inf_n\frac1n\int\cdots\int\log\|\mathcal{J}_{F_n}(f^{n-1}x)\cdots\mathcal{J}_{F_1}(x)\|\,d\rho(x)\,dn(F_1)\cdots dn(F_n). \tag{2.2}$$
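The subadditivity used above is nothing but submultiplicativity of the operator norm, $\|AB\|\le\|A\|\,\|B\|$, applied to the two halves of a product. A quick numerical check (NumPy, with made-up Gaussian matrices; the spectral norm is submultiplicative):

```python
import numpy as np

rng = np.random.default_rng(1)

def prod_norm_log(mats):
    """log of the spectral norm of J_{F_n} ... J_{F_1} (composition on the left)."""
    P = np.eye(2)
    for A in mats:
        P = A @ P
    return np.log(np.linalg.norm(P, 2))

mats = [rng.normal(size=(2, 2)) for _ in range(12)]
n, k = 7, 5
lhs = prod_norm_log(mats)                                 # a_{n+k}(x, w)
rhs = prod_norm_log(mats[n:]) + prod_norm_log(mats[:n])   # a_k(f^n x, shifted w) + a_n(x, w)
```

Here `lhs <= rhs` always holds, which is exactly the subadditivity needed for Kingman-type theorems such as Theorem I.2.2.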
In what follows we shall deal with
$$\beta(n)=\sup_{\rho\in\mathfrak{m}(n)}\beta_0(\rho,n), \tag{2.3}$$
where $\mathfrak{m}(n)$ is the set of all $P_n^*$-invariant ergodic measures. Here the notation $P_n$ indicates the dependence on $n$ in (I.2.6) and in (III.1.3). If there exists a unique $P_n^*$-invariant measure then (2.3) gives the corresponding characteristic exponent. This will be the case when the transition probabilities $P_n(x,\cdot)$ have densities positive and bounded away from zero with respect to some fixed Borel measure. Actually, if $M$ is compact and the $P_n(x,\cdot)$ have continuous densities with respect to some fixed measure, then it is easy to see that the supports of different ergodic measures $\rho$ are disjoint, and so each $\beta_0(\rho,n)$ can be treated separately.
Denote
$$I_N(n)=\sup_{\rho\in\mathfrak{m}(n)}\int\int_{\|\mathcal{J}_F(x)\|+\|\mathcal{J}_F^{-1}(x)\|>N}\log\big(\|\mathcal{J}_F(x)\|+\|\mathcal{J}_F^{-1}(x)\|\big)\,d\rho(x)\,dn(F).$$
Suppose that $M$ is compact and $\mathfrak{F}$ is a topological space of continuous vector bundle maps such that the map $(u,F)\to Fu$ of $\Pi E\times\mathfrak{F}$ into $\Pi E$ is continuous in the product topology of $\Pi E\times\mathfrak{F}$.
Theorem 2.1. Let
$$I_0(n)<\infty,\qquad n_k\xrightarrow{w}n\text{ as }k\to\infty, \tag{2.4}$$
and
$$\lim_{N\to\infty}\sup_kI_N(n_k)=0. \tag{2.5}$$
Then
$$\limsup_{k\to\infty}\beta(n_k)\le\beta(n). \tag{2.6}$$
Proof. Suppose that $\rho_{k_i}\xrightarrow{w}\rho$, where $\rho_{k_i}$ is a $P_{n_{k_i}}^*$-invariant measure. Then it is easy to see that $\rho$ is a $P_n^*$-invariant measure. Denote the last multiple integral in (2.2) by $b_n(\rho,n)$. Then by (2.2), (2.4) and (2.5) it follows that
$$\beta_0(\rho_{k_i},n_{k_i})\le\frac1n b_n(\rho_{k_i},n_{k_i})\xrightarrow[i\to\infty]{}\frac1n b_n(\rho,n)\quad\text{for each }n, \tag{2.7}$$
and so
$$\limsup_{i\to\infty}\beta_0(\rho_{k_i},n_{k_i})\le\inf_n\frac1n b_n(\rho,n)=\beta_0(\rho,n)\le\beta(n).$$
Since $M$ is compact one concludes from Theorem III.2.1 and Remark III.2.1 that for each $n_k$ there exists $\rho_k^*$ such that
$$\beta(n_k)=\beta_0(\rho_k^*,n_k). \tag{2.8}$$
Again, by compactness of $M$, for each sequence $\rho_{k_i}^*$ there exists a converging subsequence $\rho_{k_{i_j}}^*$. Acting as in (2.7) we finally get the assertion (2.6). The interchange of limits and integrals was legitimate in view of the assumption (2.5). $\blacksquare$
Remark 2.1. Some assumption of the kind of (2.5) is necessary, since otherwise some decreasing mass of the measures $n_k$ can escape to infinity while still making a non-vanishing contribution to integrals of the type (III.1.19).
Remark 2.2. We may consider the dependence of $\beta_0(\rho,n_k)$ on $n_k$ instead of $\beta(n_k)$ if $\rho$ is $P_{n_k}^*$-invariant for all $k$. The results and the proofs will remain essentially the same.
Remark 2.3. There are simple cases when the inequality (2.6) is strict. This can be achieved already in the case of random matrices, i.e., when $M$ is just one point. Consider, for instance,
$$A=\begin{pmatrix}a&0\\0&a^{-1}\end{pmatrix},\ a>0,\qquad J=\begin{pmatrix}0&-1\\1&0\end{pmatrix}.$$
Let $n_k$ be the family of probability measures defined by $n_k(\{A\})=1-\frac1k$ and $n_k(\{J\})=\frac1k$. Clearly, $n_k\xrightarrow{w}n_0$, where $n_0$ is concentrated in the one point $A$. On the other hand, it is easy to see that $\beta(n_k)=0$ for all $k$ and $\beta(n_0)=|\log a|$ (see Kifer [25]).
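The example of Remark 2.3 is easy to simulate: the occasional rotation $J$ swaps the expanding and contracting directions of $A$, so the log-norm performs a sign-flipping random walk and the exponent vanishes, while under $n_0$ alone it equals $\log a$. A sketch (NumPy, taking $a=2$ and $k=50$ for concreteness; the finite-sample estimate of $\beta(n_k)$ is only approximately zero):

```python
import numpy as np

rng = np.random.default_rng(0)
a = 2.0
A = np.array([[a, 0.0], [0.0, 1.0 / a]])
J = np.array([[0.0, -1.0], [1.0, 0.0]])

def top_exponent(p_rot, n_steps=100000):
    """Estimate (1/n) log ||M_n ... M_1||, M_i = J with prob p_rot, else A."""
    P = np.eye(2)
    total = 0.0
    for _ in range(n_steps):
        M = J if rng.random() < p_rot else A
        P = M @ P
        norm = np.linalg.norm(P, 2)
        total += np.log(norm)
        P /= norm               # renormalize to keep entries in range
    return total / n_steps

est_k = top_exponent(1.0 / 50.0)   # n_k with k = 50: estimate near 0
est_0 = top_exponent(0.0)          # n_0 (A alone): exactly log a
```

This is why (2.6) can be strict: an arbitrarily small amount of mass on $J$ collapses the exponent from $\log a$ to $0$.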
Next, we shall give a condition providing equality in (2.6).

Theorem 2.2. Suppose that the assumptions of Theorem 2.1 are satisfied, and assume that $\gamma(\nu,n)$ is the same for all $n$-stationary $\nu\in\mathcal{P}(\Pi E)$. Then
$$\beta(n_k)\to\beta(n)\quad\text{as }k\to\infty. \tag{2.9}$$
Proof. By (2.1) and (2.3) for each $k$ there exists a sequence $\nu_i^{(k)}$, $i=1,2,\ldots$ of $n_k$-stationary measures such that
$$\gamma(\nu_i^{(k)},n_k)\to\beta(n_k)\quad\text{as }i\to\infty. \tag{2.10}$$
Since $\Pi E$ is compact one can choose a subsequence $\nu_{i_j}^{(k)}$ weakly converging to some $\nu^{(k)}$ as $j\to\infty$. Notice that $\nu^{(k)}$ is also $n_k$-stationary. Indeed, in the relation
$$\int\int\varphi(Fu)\,d\nu_{i_j}^{(k)}(u)\,dn_k(F)=\int\varphi(u)\,d\nu_{i_j}^{(k)}(u) \tag{2.11}$$
for a continuous function $\varphi$ on $\Pi E$ one can pass to the limit as $j\to\infty$ to obtain
$$\int\int\varphi(Fu)\,d\nu^{(k)}(u)\,dn_k(F)=\int\varphi(u)\,d\nu^{(k)}(u), \tag{2.12}$$
since we suppose that all $n_k$ are concentrated on continuous vector bundle maps. Now (2.12) says that $\nu^{(k)}$ is $n_k$-stationary.

From the uniform integrability property (2.5) and the inequality (III.2.25) it follows (see, for instance, Neveu [37]) from (2.10) and the convergence $\nu_{i_j}^{(k)}\xrightarrow{w}\nu^{(k)}$ that
$$\gamma(\nu^{(k)},n_k)=\beta(n_k). \tag{2.13}$$
Now suppose that (2.4) holds. Since $\Pi E$ is compact there exists a subsequence $k_\ell\to\infty$ such that $\nu^{(k_\ell)}$ weakly converges to some measure $\nu$. We may pass in (2.12) to the limit over the subsequence $k_\ell$ to conclude that $\nu$ is an $n$-stationary measure. Employing again the uniform integrability property (2.5) together with the inequality (III.2.25) we obtain
$$\gamma(\nu^{(k_\ell)},n_{k_\ell})\to\gamma(\nu,n)\quad\text{as }\ell\to\infty. \tag{2.14}$$
But we suppose that $\gamma(\nu,n)$ is the same for all $\nu$, i.e., $\gamma(\nu,n)=\beta(n)$ for any $n$-stationary $\nu$. Hence $\beta(n_{k_\ell})\to\beta(n)$. Applying this argument to each subsequence $\{n_{k_\ell}\}$ instead of the whole sequence $\{n_k\}$ we derive (2.9). $\blacksquare$
Remark 2.4. According to Theorem III.1.2 the lack of $F$-invariant proper subbundles of $E$ implies the condition of Theorem 2.2 here that $\gamma(\nu,n)$ is independent of $\nu$. It seems natural to believe that this situation is typical in some sense. In the case of random matrices, i.e., when $M$ is just one point, this was proved in Kifer [25]. It is easy to see that the set of $n$ having no $F$-invariant subbundles in the sense of (III.3.11) is everywhere dense in the weak topology of $\mathcal{P}(\mathfrak{F})$. Indeed, one can take the convolution of $n$ with $O_\varepsilon\in\mathcal{P}(\mathfrak{F})$, where an $O_\varepsilon$-distributed random variable $F_\varepsilon$ has the form $F_\varepsilon=(\mathrm{id},\mathcal{O}_\varepsilon)$, with $\mathcal{O}_\varepsilon$ being a random matrix uniformly distributed on an $\varepsilon$-neighborhood of the identity in the group of orthogonal matrices $SO(m)$. Then, clearly, $n*O_\varepsilon$ corresponds to the composition of independent actions of $F$ having the distribution $n$ and $F_\varepsilon$ having the distribution $O_\varepsilon$. Evidently, this composition has no invariant subbundles.
Remark 2.5. Other cases of stability of the biggest exponent for products of random matrices can be found by the reader in Furstenberg and Kifer [17], Kifer [25] and Kifer and Slud [26].
4.3. Exponential growth rates.

In this section we shall give some conditions which imply the positivity of the biggest exponent $\beta_0(\rho)$. Actually, we shall consider the case when there are no $F$-invariant subbundles, and so the biggest exponent will characterize the growth rates of all vectors. Thus if $\beta_0(\rho)>0$ then all norms $\|{}_n\mathcal{J}(x,\omega)\xi\|$ will grow exponentially. This fact turns out to be important in the study of Schrödinger operators with random potentials. Our exposition here follows the arguments of Furstenberg [16]. Recently, similar results in a more general situation were obtained by Ledrappier [34].

Let $\varkappa$ denote the unique measure on $\Pi^{m-1}$ which is invariant with respect to rotations, i.e., with respect to the natural action of the group $SO(m)$ on $\Pi^{m-1}$. This measure can be obtained from the Lebesgue measure on the sphere $S^{m-1}$ by the natural projection $S^{m-1}\to\Pi^{m-1}$.
Lemma 3.1. Let $g:\mathbb{R}^m\to\mathbb{R}^m$ be an invertible linear transformation. Then
$$\frac{dg^{-1}\varkappa}{d\varkappa}(u)=|\det g|\,\frac{\|\bar u\|^m}{\|g\bar u\|^m} \tag{3.1}$$
for any $u\in\Pi^{m-1}$, where $g^{-1}\varkappa(\Gamma)=\varkappa(g\Gamma)$ and, again, $\bar u$ is a vector on the line corresponding to $u\in\Pi^{m-1}$.

Proof. If $\lambda$ is the Lebesgue measure on $\mathbb{R}^m$ then
$$|\det g|=\frac{dg^{-1}\lambda}{d\lambda}(\bar u)=\frac{dg^{-1}\varkappa}{d\varkappa}(u)\cdot\Big(\frac{\|g\bar u\|^{m-1}}{\|\bar u\|^{m-1}}\Big)\cdot\frac{\|g\bar u\|}{\|\bar u\|}.$$
The term in brackets represents the ratio of the volume on the sphere of radius $\|g\bar u\|$ after the transformation $g$ to the volume on the sphere of radius $\|\bar u\|$ before the transformation $g$. The ratio $\frac{\|g\bar u\|}{\|\bar u\|}$ characterizes, of course, the stretching in the radius-vector direction. The proof is complete. $\blacksquare$
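In dimension $m=2$ the density (3.1) is simply the derivative of the induced map on the angle coordinate of $\Pi^1$, so it can be checked by finite differences. A sketch (NumPy; the matrix $g$ is an arbitrary made-up example with $\det g>0$, so the induced circle map is orientation-preserving):

```python
import numpy as np

g = np.array([[2.0, 1.0],
              [0.5, 1.5]])

def proj_angle(theta):
    """Angle (mod pi) of the image direction g*(cos theta, sin theta)."""
    v = g @ np.array([np.cos(theta), np.sin(theta)])
    return np.arctan2(v[1], v[0]) % np.pi

theta = 0.7
h = 1e-6
# finite-difference Jacobian of the induced map on the projective line
num = ((proj_angle(theta + h) - proj_angle(theta)) % np.pi) / h
u = np.array([np.cos(theta), np.sin(theta)])     # unit vector, so ||u|| = 1
formula = abs(np.linalg.det(g)) / np.linalg.norm(g @ u) ** 2
```

The two numbers agree to finite-difference accuracy, which is (3.1) with $m=2$ and $\|\bar u\|=1$.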
The lemma above yields another representation for the maximal exponent.

Lemma 3.2. Suppose that $n\in\mathcal{P}(\mathfrak{F})$ and $\rho\in\mathcal{P}(M)$ are the same as in Theorem III.1.2. Let $\nu\in\mathcal{N}_\rho$ have a disintegration $\nu=\int\nu_x\,d\rho(x)$ such that $\rho$-almost all measures $\nu_x\in\mathcal{P}(\Pi^{m-1})$ are equivalent to $\varkappa$ in the sense of absolute continuity ($\varkappa\prec\nu_x$, $\nu_x\prec\varkappa$). Assume that $\det\mathcal{J}_F(x)\ne0$ $\rho\times n$-a.s. Then
$$\gamma(\nu)=-\frac1m\int\int\log\frac{d\mathcal{J}_F^{-1}(x)\nu_{fx}}{d\nu_x}(u)\,d\nu(x,u)\,dn(F)+\frac1m\int\int\log|\det\mathcal{J}_F(x)|\,d\rho(x)\,dn(F). \tag{3.2}$$
Proof. By Lemma 3.1,
$$\log\frac{\|\mathcal{J}_F(x)\bar u\|}{\|\bar u\|}=-\frac1m\log\frac{d\mathcal{J}_F^{-1}(x)\varkappa}{d\varkappa}(u)+\frac1m\log|\det\mathcal{J}_F(x)|.$$
In view of the definition (III.1.19) the present lemma will follow if we can replace $\frac{d\mathcal{J}_F^{-1}(x)\varkappa}{d\varkappa}$ by $\frac{d\mathcal{J}_F^{-1}(x)\nu_{fx}}{d\nu_x}$ in the integral. The difference between the corresponding expressions is
$$-\frac1m\int\int\log\Big(\frac{d\mathcal{J}_F^{-1}(x)\varkappa}{d\varkappa}(u)\,\frac{d\nu_x}{d\mathcal{J}_F^{-1}(x)\nu_{fx}}(u)\Big)\,d\nu(x,u)\,dn(F)$$
$$=-\frac1m\int\int\log\frac{d\mathcal{J}_F^{-1}(x)\varkappa}{d\mathcal{J}_F^{-1}(x)\nu_{fx}}(u)\,d\nu(x,u)\,dn(F)+\frac1m\int\int\log\frac{d\varkappa}{d\nu_x}(u)\,d\nu(x,u)\,dn(F)$$
$$=-\frac1m\int\int\log\frac{d\varkappa}{d\nu_{fx}}\big(\mathcal{J}_F(x)u\big)\,d\nu(x,u)\,dn(F)+\frac1m\int\int\log\frac{d\varkappa}{d\nu_x}(u)\,d\nu(x,u)$$
$$=-\frac1m\int\log\frac{d\varkappa}{d\nu_x}(u)\,d(n*\nu)(x,u)+\frac1m\int\log\frac{d\varkappa}{d\nu_x}(u)\,d\nu(x,u)=0,$$
since $\nu$ is $n$-stationary, where $n*\nu\in\mathcal{P}(\Pi E)$ is defined by $n*\nu(\Gamma)=\int\nu(F^{-1}\Gamma)\,dn(F)$ for any Borel $\Gamma\subset\Pi E$. Here we have used the equality
$$\int\int g(Fv)\,d\nu(v)\,dn(F)=\int g(v)\,d(n*\nu)(v)=\int g(v)\,d\nu(v)$$
with $v=(x,u)$,
$$g(v)=\log\frac{d\varkappa}{d\nu_x}(u)\quad\text{and}\quad g(Fv)=\log\frac{d\varkappa}{d\nu_{fx}}\big(\mathcal{J}_F(x)u\big).\ \blacksquare$$
Corollary 3.1. If $\nu\in\mathcal{N}_\rho$ satisfies the conditions of Lemma 3.2, then
$$\gamma(\nu)>\frac1m\int\int\log|\det\mathcal{J}_F(x)|\,d\rho(x)\,dn(F) \tag{3.3}$$
unless
$$\mathcal{J}_F(x)\nu_x=\nu_{fx}\quad\rho\times n\text{-a.s.} \tag{3.4}$$
In particular, if $|\det\mathcal{J}_F(x)|\equiv1$ then $\gamma(\nu)>0$ unless (3.4) holds.

Proof. By Jensen's inequality,
$$-\int\log\frac{d\mathcal{J}_F^{-1}(x)\nu_{fx}}{d\nu_x}(u)\,d\nu_x(u)\ge-\log\int\frac{d\mathcal{J}_F^{-1}(x)\nu_{fx}}{d\nu_x}(u)\,d\nu_x(u)=0,$$
and equality can only hold if the integrand of the second integral is constant $\nu$-a.s. But in this case $\mathcal{J}_F^{-1}(x)\nu_{fx}=\nu_x$, or, what is the same, $\nu_{fx}=\mathcal{J}_F(x)\nu_x$ $\rho$-a.s. Hence by (3.2) the only case when (3.3) can fail is given by (3.4). $\blacksquare$
Remark 3.1. Since $\beta_0(\rho)=\sup_{\nu\in\mathcal{N}_\rho}\gamma(\nu)$, by Corollary 3.1 the existence of a measure $\nu$ satisfying the conditions of Lemma 3.2 but not (3.4) implies $\beta_0(\rho)>0$ provided $|\det\mathcal{J}_F(x)|=1$ $\rho\times n$-a.s.
Next, we shall need the following fact from Furstenberg [16].

Lemma 3.3. If $\nu_1$ and $\nu_2$ are Borel probability measures on $\Pi^{m-1}$, then
$$\|\nu_1-\nu_2\|\le2\sqrt2\Big(-\int\log\frac{d\nu_2}{d\nu_1}\,d\nu_1\Big)^{1/2}. \tag{3.5}$$
(If $\nu_2$ is not absolutely continuous with respect to $\nu_1$ then the right hand side is $\infty$, taking $\log0=-\infty$.)

Proof. Notice that $-\int\big(\log\frac{d\nu_2}{d\nu_1}\big)\,d\nu_1\ge-\log\int\frac{d\nu_2}{d\nu_1}\,d\nu_1\ge0$ by Jensen's inequality, and so the expression in brackets is non-negative. We may assume that $\nu_2\prec\nu_1$. Then by the Cauchy–Schwarz inequality
$$\|\nu_1-\nu_2\|\le\int\Big|1-\frac{d\nu_2}{d\nu_1}\Big|\,d\nu_1\le\Big(\int\Big(1-\sqrt{\tfrac{d\nu_2}{d\nu_1}}\Big)^2d\nu_1\Big)^{1/2}\Big(\int\Big(1+\sqrt{\tfrac{d\nu_2}{d\nu_1}}\Big)^2d\nu_1\Big)^{1/2}\le2\sqrt2\Big(1-\int\sqrt{\tfrac{d\nu_2}{d\nu_1}}\,d\nu_1\Big)^{1/2}.$$
Next, by Jensen's inequality,
$$\int\sqrt{\tfrac{d\nu_2}{d\nu_1}}\,d\nu_1\ge\exp\Big(\frac12\int\log\frac{d\nu_2}{d\nu_1}\,d\nu_1\Big)=e^{-a},\qquad a=-\frac12\int\log\frac{d\nu_2}{d\nu_1}\,d\nu_1\ge0.$$
Hence, since $1-e^{-a}\le a$,
$$\|\nu_1-\nu_2\|\le2\sqrt2\,(1-e^{-a})^{1/2}\le2\sqrt2\,\sqrt a\le2\sqrt2\Big(-\int\log\frac{d\nu_2}{d\nu_1}\,d\nu_1\Big)^{1/2},$$
which is (3.5). On the other hand, in any case $\|\nu_1-\nu_2\|\le2$, and so when $-\int\big(\log\frac{d\nu_2}{d\nu_1}\big)\,d\nu_1>\frac12$ we also obtain (3.5) directly. $\blacksquare$
Theorem 3.1. In the circumstances and notations of Theorem III.1.2, suppose that $n$-almost surely $f=\pi F\pi^{-1}$ is a homeomorphism, $\mathcal{J}_F$ is a continuous $GL(m)$-valued function on a compact space $M$, $f\rho\prec\rho$ and the density $g_f(x)=\frac{df\rho}{d\rho}(x)$ is continuous. Assume that there exists no measure $\nu\in\mathcal{N}_\rho$ for which (3.4) holds. Then
$$\beta_0(\rho)>\frac1m\int\int\log|\det\mathcal{J}_F(x)|\,d\rho(x)\,dn(F). \tag{3.6}$$
In particular, if $|\det\mathcal{J}_F(x)|=1$ $\rho\times n$-a.s. then $\beta_0(\rho)>0$.

Remark 3.2. According to (III.1.11) the integral in the right hand side of (3.6) represents the sum of all characteristic exponents. So if this sum equals zero (cf. Theorem V.1.3) then $\beta_0(\rho)>0$. If $f$ is a random diffeomorphism with a distribution $n$ and $\det Df=1$ $n$-a.s., then $f$ preserves the Riemannian volume and so the condition $f\rho\prec\rho$ is satisfied. Hence we get the assertion of Theorem 3.1, which in this case was stated independently by Carverhill.
Proof of Theorem 3.1. For each $n$ consider a sequence of independent random bundle maps $E_i^{(n)} = (\mathrm{id}, U_i^{(n)})$, where id is the identity map on $M$ and $U_i^{(n)}$ is a $GL(m)$-valued random variable with a distribution $\lambda_n$ having a positive density with respect to some Riemannian volume on $GL(m)$ compatible with the natural smooth structure there. We suppose that $\lambda_n$ converges weakly as $n \to \infty$ to the measure concentrated at the identity matrix $I$ of $GL(m)$,
$$\lambda_n \xrightarrow{w} \delta_I \quad \text{as } n \to \infty, \qquad (3.7)$$
and that
$$\sup_n \int_{\{\|U\| + \|U^{-1}\| > N\}} \big(\log^+\|U\| + \log^+\|U^{-1}\|\big)\,d\lambda_n(U) \to 0 \quad \text{as } N \to \infty. \qquad (3.8)$$
One can choose the bundle maps $E_i^{(n)}$, $i = 1,2,\dots$ to be independent of the initial sequence of random bundle maps $F_1, F_2, \dots$ with the distribution $n$.
Next, let $n_k$ be the distribution of the composition $E_1^{(k)} \circ F_1$, i.e.
$$\int \varphi(F)\,dn_k(F) = \int\int \varphi(f, U\mathcal{J}F)\,dn(F)\,d\lambda_k(U)$$
for any Borel function $\varphi$ on $\mathcal{T}$. Since the action of $E_i^{(k)} \circ F_i$ on the base $M$ is the same as for $F_i$ itself, all distributions $n_k$ generate Markov processes on $M$ with the same transition probability $P(x,\cdot)$. Hence for all of them the measure $\rho$ is $P^*$-invariant and ergodic. Let $\nu^{(k)}$ be an $n_k$-stationary measure with $\pi\nu^{(k)} = \rho$. It is easy to see that in the corresponding disintegrations $\nu^{(k)} = \int \nu_x^{(k)}\,d\rho(x)$, $\rho$-almost all $\nu_x^{(k)} \in \mathcal{P}(\mathbb{P}^{m-1})$ are equivalent to the Lebesgue measure on $\mathbb{P}^{m-1}$. By taking a subsequence we may assume that $\nu^{(k)} \to \nu$ as $k \to \infty$, since the space of measures $\{\nu \in \mathcal{P}(M\times\mathbb{P}^{m-1}) : \pi\nu = \rho\}$ is compact. From the construction $n_k \xrightarrow{w} n$, and so $n_k * \nu^{(k)} \to n * \nu$. From this it follows that $\nu$ is $n$-stationary.
Notice that
$$\beta_0(\rho) \ge \limsup_{k\to\infty} \beta_0(\rho, n_k). \qquad (3.9)$$
Indeed, applying the formula (2.2) and taking into account (3.7) and (3.8) we can use the same arguments as in (2.7) to obtain (3.9).
In view of (3.9) it suffices to prove that
$$\limsup_{k\to\infty} \beta_0(\rho, n_k) > \frac1m \int\int \log|\det \mathcal{J}F(x)|\,d\rho(x)\,dn(F). \qquad (3.10)$$
Since $\rho$-almost all $\nu_x^{(k)}$ are equivalent to the Lebesgue measure on $\mathbb{P}^{m-1}$, by Lemma 3.2 the inequality (3.10) will follow if
$$\limsup_{k\to\infty}\Big(-\int\int \log\frac{d\,\mathcal{J}F^{-1}(x)\nu_{fx}^{(k)}}{d\nu_x^{(k)}}(u)\,d\nu^{(k)}(x,u)\,dn_k(F)\Big) > 0. \qquad (3.11)$$
But, by Lemma 3.3 this will be the case unless
$$\lim_{k\to\infty} \int\int \|\mathcal{J}F(x)\nu_x^{(k)} - \nu_{fx}^{(k)}\|\,d\rho(x)\,dn_k(F) = 0. \qquad (3.12)$$
If (3.12) is true then there is a subsequence $k_i$ such that for $\rho$-almost all $x$,
$$\lim_{i\to\infty} \int \|\mathcal{J}F(x)\nu_x^{(k_i)} - \nu_{fx}^{(k_i)}\|\,dn_{k_i}(F) = 0. \qquad (3.13)$$
Suppose that the topology on $\mathcal{T}$ is given by the metric
$$d(F,G) = \sup_x\big(\operatorname{dist}(fx,gx) + d_X(\mathcal{J}F(x),\mathcal{J}G(x))\big), \qquad (3.14)$$
where $d_X$ is defined by (1.9). Since $n_k \xrightarrow{w} n$, it is easy to see that any neighborhood $U$ of each $F \in \operatorname{supp} n$ has $n_{k_i}$-measure bigger than some $\delta(F,U) > 0$ provided $i$ is large enough. This together with (3.13) yields that for $\rho$-almost all $x$ there exists a sequence $F_i \to F$ such that
$$\lim_{i\to\infty} \big\|\mathcal{J}F_i(x)\nu_x^{(k_i)} - \nu_{f_ix}^{(k_i)}\big\| = 0. \qquad (3.15)$$
Now let $\varphi(x,u)$ be a continuous function on $M \times \mathbb{P}^{m-1}$ with $|\varphi| \le 1$. Then
$$\Big|\int \varphi\,d\big(\mathcal{J}F(x)\nu_x^{(k_i)}\big) - \int \varphi\,d\nu_{f_ix}^{(k_i)}\Big| \le \Big|\int \varphi\,d\big(\mathcal{J}F(x)\nu_x^{(k_i)}\big) - \int \varphi\,d\big(\mathcal{J}F_i(x)\nu_x^{(k_i)}\big)\Big| + \Big|\int \varphi\,d\big(\mathcal{J}F_i(x)\nu_x^{(k_i)}\big) - \int \varphi\,d\nu_{f_ix}^{(k_i)}\Big|. \qquad (3.16)$$
Since $F_i \to F$ in the metric (3.14) and $\varphi$ is uniformly continuous in $u$, we see that the first term in the right hand side of (3.16) converges to zero as $i \to \infty$. By (3.15), the second term there converges to zero as well. Hence,
$$\lim_{i\to\infty}\Big|\int \varphi\,d\big(\mathcal{J}F(x)\nu_x^{(k_i)}\big) - \int \varphi\,d\nu_{f_ix}^{(k_i)}\Big| = 0. \qquad (3.17)$$
Since $\mathcal{J}F(x)$ depends continuously on $x$, integrating (3.17) with respect to $\rho$ and taking into account that $\nu^{(k)} \xrightarrow{w} \nu$ we obtain
$$\int\int \varphi(x,u)\,d\rho(x)\,d\big(\mathcal{J}F(x)\nu_x\big)(u) = \lim_{i\to\infty}\int\int \varphi(x,u)\,d\rho(x)\,d\nu_{f_ix}^{(k_i)}(u). \qquad (3.18)$$
Since $g_f(x) = \frac{df\rho}{d\rho}(x)$ is continuous,
$$\lim_{i\to\infty}\int\int \varphi(x,u)\,d\rho(x)\,d\nu_{f_ix}^{(k_i)}(u) = \int\int \varphi(x,u)\,d\rho(x)\,d\nu_{fx}(u). \qquad (3.19)$$
Now (3.18) and (3.19) give
$$\int\int \psi(x,u)\,d\rho(x)\,d\big(\mathcal{J}F(x)\nu_x\big)(u) = \int\int \psi(x,u)\,d\rho(x)\,d\nu_{fx}(u), \qquad (3.20)$$
which holds for any continuous function $\psi$. From the uniqueness of the disintegration (see Bourbaki [8], Ch. 6, §3, no. 1) we conclude that
$$\mathcal{J}F(x)\nu_x = \nu_{fx} \quad \rho\text{-a.s.} \qquad (3.21)$$
Since $F \in \operatorname{supp} n$ is arbitrary, we get (3.4), in contradiction to the assumption of Theorem 3.1. Therefore (3.11) is true, which implies (3.10), yielding (3.6) in view of (3.9). ∎
Until (3.19) we did not need the assumption that $f\rho \ll \rho$ and that the density $\frac{df\rho}{d\rho}$ is continuous. We can derive (3.4) from (3.12) under other conditions, as well.
Theorem 3.2. Suppose that all assumptions of Theorem 3.1 except for $f\rho \ll \rho$ are satisfied. Let $A_x = \{y :$ there exist a number $\ell = \ell(y)$ and $F_1,\dots,F_\ell \in \operatorname{supp} n$ such that $\pi F_1 \circ \cdots \circ F_\ell\, \pi^{-1} x = y\}$. If
$$\rho\{x : A_x \text{ is measurable and } \rho(A_x) > 0\} > 0 \qquad (3.22)$$
then (3.12) yields (3.4), and so the conclusion of Theorem 3.1 remains true.
Proof. First, notice that $fA_x \subset A_x$ provided $F \in \operatorname{supp} n$ and $f = \pi F \pi^{-1}$. Since $\rho$ is ergodic, this implies that either $\rho(A_x) = 1$ or $\rho(A_x) = 0$. By (3.13) and (3.22) one can choose $x_0 \in M$ such that $\rho(A_{x_0}) = 1$ and (3.13) holds true for $x = x_0$. Since $\mathbb{P}^{m-1}$ is compact, there is a subsequence $k_{i_j}$ such that $\nu_{x_0}^{(k_{i_j})}$ weakly converges to some measure $\bar\nu_{x_0}$. Then also $\mathcal{J}F(x_0)\nu_{x_0}^{(k_{i_j})}$ weakly converges to $\mathcal{J}F(x_0)\bar\nu_{x_0}$. Now we can employ (3.13) to conclude that
$$\mathcal{J}F(x_0)\bar\nu_{x_0} = \lim_{j\to\infty} \nu_{fx_0}^{(k_{i_j})}. \qquad (3.23)$$
Notice that the sequence $k_{i_j}$ does not depend on $F$, and so (3.23) also says that $\mathcal{J}F(x_0)\bar\nu_{x_0}$ actually depends only on $f = \pi F \pi^{-1}$. This enables us to denote
$$\bar\nu_{fx_0} = \mathcal{J}F(x_0)\bar\nu_{x_0}. \qquad (3.24)$$
Since $\rho(A_{x_0}) = 1$, we conclude that for $\rho$-almost all $y$ and each $F \in \operatorname{supp} n$,
$$\bar\nu_y = \lim_{j\to\infty} \nu_y^{(k_{i_j})} \quad\text{and}\quad \mathcal{J}F(y)\bar\nu_y = \bar\nu_{fy}. \qquad (3.25)$$
Then also
$$\nu^{(k_{i_j})} = \int \nu_y^{(k_{i_j})}\,d\rho(y) \xrightarrow{w} \int \bar\nu_y\,d\rho(y). \qquad (3.26)$$
On the other hand $\nu^{(k_{i_j})} \xrightarrow{w} \nu$, and so by the uniqueness of the disintegration $\bar\nu_y = \nu_y$ $\rho$-a.s., which together with the second equality in (3.25) gives (3.4). ∎
The assumption of Theorem 3.2 is not very elegant. We shall
give a sufficient condition for it.
Corollary 3.1. Suppose that all transition probabilities $P(x,\cdot)$ have densities $p(x,y)$ with respect to some fixed measure $m$ on $M$. Then (3.22) is satisfied, and so the conclusion of Theorem 3.1 is also true.
Proof. Let, on the contrary, $\rho(A_x) = 0$ for $\rho$-almost all $x$. Then for $\rho$-almost all $x$ the measures $P(x,\cdot)$ are singular with $\rho$. Since
$$\rho(\Gamma) = \int_M d\rho(x)\,P(x,\Gamma) = \int_\Gamma \Big(\int d\rho(x)\,p(x,y)\Big)\,dm(y), \qquad (3.27)$$
then $\rho \ll m$. Let $\bar\rho(y) = \frac{d\rho}{dm}(y)$; then
$$\bar\rho(y) > 0 \ \text{ implies } \ p(x,y) = 0 \qquad (3.28)$$
for $\rho$-almost all $x$ and $m$-almost all $y$, since $\rho$ and $P(x,\cdot)$ are singular. But
$$\bar\rho(y) = \int \bar\rho(x)\,p(x,y)\,dm(x), \qquad (3.29)$$
and so (3.28) is impossible for $\rho$-almost all $x$. This contradiction proves the assertion. ∎
Remark 3.3. One can formulate certain conditions which assure the non-existence of a measure $\nu$ satisfying (3.4) and therefore the positivity of $\beta_0(\rho)$. These conditions may be based on the simple fact that if $\nu_1,\nu_2 \in \mathcal{P}(\mathbb{P}^{m-1})$, $\mu \in \mathcal{P}(GL(m))$ and $g\nu_1 = \nu_2$ for $\mu$-almost all $g$, then $\operatorname{supp}\mu$ cannot be too large; in particular, it has no interior.
It is not easy to check that there is no $\nu \in \mathcal{N}_\rho$ satisfying (3.4). It is worthwhile to have more straightforward assumptions yielding (3.6). Consider the Markov chain $Z_n = {}_nF Z_0$ on $E = M \times \mathbb{R}^m$ with the transition probability $Q((x,\xi),\Gamma) = n\{F : F(x,\xi) \in \Gamma\}$. Then the Markov chain $Y_n$ with the transition probability $R(v,\cdot)$ defined by (1.12) describes the evolution of directions of $Z_n$. Using the notations of Section 3.3 we can write $Y_n = \check Z_n$.
Theorem 3.3. Suppose that the conditions of Theorem III.1.2 are satisfied and, in addition, $n$-almost surely $f = \pi F \pi^{-1}$ is a homeomorphism and $\mathcal{J}F$ is a continuous $GL(m)$-valued function on a compact space $M$. Assume that the transition probability $Q((x,\xi),\cdot)$ has a density $q((x,\xi),(x',\eta'))$ with respect to some measure $\tilde\mu \in \mathcal{P}(M\times\mathbb{R}^m)$ which is continuous in both arguments, equals zero if $(x',\eta') \notin \operatorname{supp}\tilde\mu$, and is positive when $(x,\xi)$ and $(x',\eta')$ belong to some neighborhood of $M \times S^{m-1}$, where $S^{m-1}$ is the unit $(m-1)$-dimensional sphere centered at the origin of $\mathbb{R}^m$. Then $\beta_0(\rho) > 0$ for any $P^*$-invariant ergodic $\rho \in \mathcal{P}(M)$ provided
$$|\det \mathcal{J}F(x)| = 1 \quad \rho\times n\text{-a.s.}$$
Proof. Since $Z_n$ has a continuous transition density with respect to $\tilde\mu$, the process $Y_n = \check Z_n$ also has a continuous transition density $r(w,\bar w)$ with respect to the measure $\mu \in \mathcal{P}(\Pi E)$ defined by $\mu(V) = \tilde\mu\{(x,\xi) : \check{(x,\xi)} \in V\}$, i.e. $\mu$ is obtained from $\tilde\mu$ by the natural projection of $E$ on $\Pi E$. Moreover, the Markov chain $X_n = \pi Y_n$ on $M$ with the transition probability $P(x,\cdot)$ also has a continuous transition density $p(x,y)$ with respect to the measure $\pi\mu$, where $\pi : \Pi E \to M$ is the natural projection. One argument proves both statements above, and we shall demonstrate it for the second case only. Notice that $R((x,u),\pi^{-1}C) = P(x,C)$ for any Borel $C \subset M$, where $P(x,\cdot)$ is the transition probability of the Markov chain $X_n = \pi Y_n$. According to the disintegration theorem (Bourbaki [8], Ch. 6, §3, no. 1, Theorem 1) the measure $\mu$ has a representation $\mu = \int_M \mu_x\,d\pi\mu(x)$. It follows from the above that the integral
$$p(x,y) = \int_{\mathbb{P}^{m-1}} r((x,u),(y,v))\,d\mu_y(v) \qquad (3.30)$$
is independent of $u$. It is easy to see that $p(x,y)$ is the density of $P(x,\cdot)$ with respect to the measure $\pi\mu$. Clearly, $p(x,y)$ is continuous in both arguments.
Next, we are going to prove that there exists no measure $\nu \in \mathcal{N}_\rho$ which satisfies (3.4). This will imply Theorem 3.3 by means of Theorem 3.1. Let, on the contrary, such a $\nu$ exist, with disintegration $\nu = \int \nu_x\,d\rho(x)$. In the same way as we proved the continuity of $F$-invariant subbundles in Theorem 1.1, one shows that $\nu_x$ must depend on $x$ continuously in the weak topology. To do this one introduces a metric $d$ on $\mathcal{P}(\mathbb{P}^{m-1})$ compatible with the topology of weak convergence on $\mathcal{P}(\mathbb{P}^{m-1})$ by taking a countable dense set of continuous functions $\varphi_i$ on $\mathbb{P}^{m-1}$ and setting
$$d(\nu^{(1)},\nu^{(2)}) = \sum_i 2^{-i}\Big|\int \varphi_i\,d\nu^{(1)} - \int \varphi_i\,d\nu^{(2)}\Big|. \qquad (3.31)$$
Other steps of the proof are the same as in Theorem 1.1. This leads to the conclusion that $\nu_x$ depends continuously on $x$.
Since the density $q$ is positive on some neighborhood of $M \times S^{m-1}$, both densities $r$ and $p$ are positive on $\Pi E$ and $M$, respectively. Thus for each $x \in M$ and $u,v \in \mathbb{P}^{m-1}$ one can choose a sequence of vector bundle maps $F^{(n)} \in \operatorname{supp} n$ such that
$$f^{(n)}x \to x, \qquad (3.32)$$
$$F^{(n)} \to F \in \operatorname{supp} n \quad \text{as } n \to \infty \qquad (3.33)$$
and
$$\mathcal{J}F^{(n)}(x)\,u \to v. \qquad (3.34)$$
By the continuity of the measures $\nu_x$ in $x$ we conclude from (3.32)-(3.34) that
$$fx = x, \quad \mathcal{J}F(x)u = v \quad\text{and}\quad \mathcal{J}F(x)\nu_x = \nu_x. \qquad (3.35)$$
Consider $e_x = \{F = (f,\mathcal{J}F) \in \operatorname{supp} n : fx = x \text{ and } \mathcal{J}F\nu_x = \nu_x\}$; then by (3.35)
$$\nu_x(W) > 0 \quad \text{for any open set } W \ni u \qquad (3.36)$$
for any $u \in \mathbb{P}^{m-1}$. On the other hand, we shall show that $e_x$ is non-compact, and so one can find two linear subvarieties $V_1, V_2 \subset \mathbb{P}^{m-1}$ and a sequence $F^{(n)} \in e_x$ such that $\mathcal{J}F^{(n)}v \to V_2$ if $v \notin V_1$ (cf. Furstenberg [16], p. 427). This implies that $\nu_x$ must be concentrated on $V_1 \cup V_2$, which contradicts (3.36). So it remains to prove that $e_x$ is non-compact. But since the transition density $q$ of the Markov chain $Z_n$ is positive on a neighborhood of $M \times S^{m-1}$, there exists a sequence $F_n \in \operatorname{supp} n$ satisfying (3.32)-(3.34) with $v = u$, where $u$ is an eigendirection of $\mathcal{J}F$ with a real eigenvalue bigger than one. Thus the powers $\{\mathcal{J}F^k,\ k = 1,2,\dots\}$ cannot belong to a compact group, and so $e_x$ is non-compact. ∎
V. Smooth random transformations.
In this chapter we shall discuss some applications of the general results considered in previous parts of this book to the case of random diffeomorphisms and stochastic flows.
5.1 Random diffeomorphisms.
We shall talk in this section mainly about diffeomorphisms, but all results remain true for smooth maps with non-degenerate differentials, i.e. local diffeomorphisms. Let $M$ be a compact $m$-dimensional Riemannian manifold and $\mathfrak{m}$ be a Borel probability measure on the space $\mathcal{H} = \mathcal{D}^1(M)$ of $C^1$-class diffeomorphisms of $M$ considered with the topology of $C^1$-convergence. This topology is given by the metric
$$d(f,g) = \sup_x \big(\operatorname{dist}(fx,gx) + \|Df - Dg\|_x\big) \qquad (1.1)$$
with
$$\|Df - Dg\|_x = \sup_{0\ne\xi\in T_xM} \frac{\|(Df - Dg)\xi\|}{\|\xi\|}, \qquad (1.2)$$
where $Df$ and $Dg$ are the differentials of $f$ and $g$, respectively, acting on the tangent bundle $TM$ of $M$, and $T_xM$ denotes the tangent space at $x$. We shall use the notation $\mathcal{T}$ for the space of differentials as bundle maps $TM \to TM$. Still remark that the difference between $\mathcal{H}$ and $\mathcal{T}$ is not of great importance here, since over each diffeomorphism $f$ there exists exactly one bundle map $Df$.
Clearly, the maps $(f,x) \mapsto fx$ and $(Df,\xi) \mapsto Df\,\xi$ of $\mathcal{H} \times M$ into $M$ and of $\mathcal{T} \times TM$ into $TM$ are continuous in the $C^1$-topology, and so the measurability conditions which enable us to consider independent random diffeomorphisms $\{f_i\}$ having a distribution $\mathfrak{m} \in \mathcal{P}(\mathcal{H})$ are satisfied. Moreover, according to Lemma I.2.2, in this situation there exists at least one $P^*$-invariant probability measure.
We have already obtained certain results concerning smooth random maps in Theorems I.3.3 and II.2.4. In order to apply Theorems III.1.1 and III.1.2 to random diffeomorphisms we have to explain the product structure of $TM$. By definition (see Hirsch [20]) the tangent bundle $TM$ has a local product structure. This means that there exists a cover of $M$ by a finite number of open subsets $U_i \subset M$, $i = 1,\dots,\ell$, called charts, which are diffeomorphic to the unit ball in $\mathbb{R}^m$, and the tangent bundle $TM$ restricted to $U_i$ can be identified with $U_i \times \Delta_i$, where $\Delta_i$ is linearly isomorphic to $\mathbb{R}^m$. Consider $\tilde U_i = U_i \setminus \bigcup_{j=i+1}^{\ell} U_j$ for $i = 1,\dots,\ell-1$ and $\tilde U_\ell = U_\ell$. Then the $\{\tilde U_i\}$ are disjoint and they cover $M$. In each $\Delta_i$ one can choose $m$ linearly independent vectors $\xi_j^i$, $j = 1,\dots,m$. Now we can define $m$ linearly independent vector fields $\xi_j$ by setting $\xi_j = \xi_j^i$ over $\tilde U_i$. This enables us to transform $TM$ into the direct product $M \times \mathbb{R}^m$ according to the following rule. Take a basis $\{\eta_j\}$ of $\mathbb{R}^m$; then any point $(x,\xi) \in TM$ with $x \in \tilde U_i$ corresponds to the point $(x,\eta) \in M \times \mathbb{R}^m$ such that $\xi$ has the same coordinates with respect to the basis $\{\xi_j^i\}$ as $\eta$ has with respect to $\{\eta_j\}$. This, together with the remark that differentials act linearly on tangent spaces, which are the fibres of the tangent bundle, leads to the set-up of Section 3.1.
As in the case of random bundle maps, a random diffeomorphism $f$ with the distribution $\mathfrak{m}$ generates a Markov chain $X_n$ with the transition probability $P(x,\cdot)$ given by (I.2.6). The differential $Df$ acting on $TM$ induces a natural action of $Df$ on the projective tangent bundle $\Pi TM$, where any two non-zero vectors $\xi$ and $\tilde\xi$ are identified if they belong to the same tangent space and $\tilde\xi = \mathrm{const}\cdot\xi$. For any measure $\nu$ on $\Pi TM$ one defines $\mathfrak{m} * \nu \in \mathcal{P}(\Pi TM)$ by the equality $\int \varphi\,d(\mathfrak{m}*\nu) = \int\int \varphi(Df\,w)\,d\mathfrak{m}(f)\,d\nu(w)$, which holds for any Borel function $\varphi$ on $\Pi TM$. Next, $\nu$ is called $\mathfrak{m}$-stationary if $\mathfrak{m} * \nu = \nu$. Using the operator $P^*$ constructed by means of the transition probability $P(x,\cdot)$ as in (I.2.9), we introduce the notion of $P^*$-invariant measures, as well. Now Theorems III.1.1 and III.1.2 can be reformulated under these circumstances without any alterations.
Theorem 1.1. Let $f_1, f_2, \dots$ be a sequence of independent random diffeomorphisms with the common distribution $\mathfrak{m}$ satisfying
$$\int\int \log^+\|Df\|_x\,d\mathfrak{m}(f)\,d\rho(x) < \infty,$$
where $\rho \in \mathcal{P}(M)$ is a $P^*$-invariant ergodic measure. Then for $\rho\times p$-almost all $(x,\omega)$ there exist a sequence of linear subspaces of the tangent space $T_xM$ at $x$,
$$0 \subset V^{s(\rho)}_{(x,\omega)} \subset \cdots \subset V^{0}_{(x,\omega)} = T_xM,$$
and a sequence of values
$$-\infty < a_{s(\rho)}(\rho) < \cdots < a_1(\rho) < a_0(\rho) < \infty$$
such that if $\xi \in V^{i}_{(x,\omega)} \setminus V^{i+1}_{(x,\omega)}$, where $V^{i}_{(x,\omega)} \equiv 0$ for all $i > s(\rho)$, then
$$\lim_{n\to\infty}\frac1n \log\|D_nf\,\xi\| = a_i(\rho), \qquad (1.3)$$
where $D_nf = Df_n \circ \cdots \circ Df_1$.
The numbers $m_i(\rho) = \dim V^{i}_{(x,\omega)} - \dim V^{i+1}_{(x,\omega)}$ are $\rho\times p$-a.s. constants and they are called the multiplicities of the characteristic exponents $a_i(\rho)$. Furthermore,
$$\lim_{n\to\infty}\frac1n \log|\det D_nf| = \sum_{i=0}^{s(\rho)} m_i(\rho)\,a_i(\rho). \qquad (1.4)$$
Let $\sigma_1(A) \ge \sigma_2(A) \ge \cdots \ge \sigma_m(A) > 0$ denote the diagonal elements of the diagonal matrix $\Delta$ which emerges in a decomposition of an $m\times m$ matrix $A$ into the product $A = K_1 \Delta K_2$, where $K_1$ and $K_2$ are unitary matrices. Then
$$\lim_{n\to\infty}\frac1n \log\sigma_i(D_nf) = a_j(\rho), \quad\text{where } j = \min\Big\{k : \sum_{\ell=0}^{k} m_\ell(\rho) \ge i\Big\}. \qquad (1.5)$$
Although we did not mention an assertion similar to (1.5) in the statement of Theorem III.1.1, it became a standard ingredient of the multiplicative ergodic theorem (see Ledrappier [33] and Ruelle [43]).
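The limits in (1.5) also suggest how characteristic exponents are computed in practice: follow an orthonormal frame under the products $D_nf$ and re-orthogonalize at every step. The sketch below is only an illustration under assumed data — the law of the random matrices (a uniform random rotation composed with a fixed stretch) is our arbitrary choice, not anything fixed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_matrix():
    # assumed model: uniform random rotation times the volume-preserving
    # stretch diag(2, 1/2), so every factor has determinant 1
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]]) @ np.diag([2.0, 0.5])

def characteristic_exponents(n_steps=20000, dim=2):
    """Estimate lim (1/n) log sigma_i of a random matrix product by QR
    re-orthogonalization, the growth rates appearing in (1.5)."""
    Q = np.eye(dim)
    sums = np.zeros(dim)
    for _ in range(n_steps):
        Q, R = np.linalg.qr(sample_matrix() @ Q)
        sums += np.log(np.abs(np.diag(R)))  # per-step growth along the frame
    return np.sort(sums / n_steps)[::-1]    # ordered a_0 >= a_1

exps = characteristic_exponents()
print(exps, exps.sum())
```

Since every factor has determinant one, the two estimated exponents are (up to rounding) negatives of each other, in agreement with (1.4).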
Theorem 1.2. Suppose that the conditions of Theorem 1.1 are satisfied. Then one can choose a Borel set $M_\rho \subset M$ with $\rho(M_\rho) = 1$ such that for any $x \in M_\rho$ there exists a sequence of linear subspaces of the tangent space $T_xM$ at $x$,
$$0 \subset J^{r(\rho)}_x \subset \cdots \subset J^{0}_x = T_xM, \qquad (1.7)$$
and a sequence of values
$$-\infty < \beta_{r(\rho)}(\rho) < \cdots < \beta_1(\rho) < \beta_0(\rho) < \infty$$
such that if $\xi \in J^{i}_x \setminus J^{i+1}_x$, where $J^{i}_x \equiv 0$ for $i > r(\rho)$, then
$$\lim_{n\to\infty}\frac1n \log\|D_nf\,\xi\| = \beta_i(\rho) \quad p\text{-a.s.} \qquad (1.8)$$
and
$$\lim_{n\to\infty}\frac1n \log\|D_nf\| = \beta_0(\rho) \quad p\text{-a.s.} \qquad (1.9)$$
The numbers $\beta_i(\rho)$ are the values which the integrals
$$\gamma(\nu) = \int\int \log\frac{\|Df\,\hat u\|}{\|\hat u\|}\,d\mathfrak{m}(f)\,d\nu(u) \qquad (1.10)$$
take on for different $\mathfrak{m}$-stationary ergodic measures $\nu \in \mathcal{P}(\Pi TM)$ satisfying $\pi\nu = \rho$, where $\hat u$ is the element of $TM$ corresponding to $u$ from $\Pi TM$.
The following result was proved in the case of stochastic flows by Baxendale [5].
Theorem 1.3. Let a $P^*$-invariant ergodic measure $\rho \in \mathcal{P}(M)$ have a density $q$ with respect to the Riemannian volume $m$ on $M$. If $a_i(\rho)$ are the characteristic exponents of a sequence of independent random diffeomorphisms $f_1, f_2, \dots$ given by Theorem 1.1 and $m_i(\rho)$ are their multiplicities, then
$$\sum_i m_i(\rho)\,a_i(\rho) \le 0 \qquad (1.11)$$
and the equality holds if and only if $\rho$ is $f$-invariant in the sense of (I.1.23), i.e. $\rho(f^{-1}\Gamma) = \rho(\Gamma)$ $\mathfrak{m}$-a.s. for each Borel $\Gamma \subset M$.
Proof. Since for any diffeomorphism $f$,
$$f^{-1}\rho(\Gamma) = \rho(f\Gamma) = \int_{f\Gamma} q(x)\,dm(x),$$
the change of variables formula gives
$$\int_M q(fy)\,|\det Df(y)|\,dm(y) = 1. \qquad (1.12)$$
Integrating (1.12) in $f$ we obtain
$$\int\int q(fy)\,|\det Df(y)|\,dm(y)\,d\mathfrak{m}(f) = 1. \qquad (1.13)$$
Thus by Jensen's inequality
$$0 = \log \int\int q(fy)\,|\det Df(y)|\,dm(y)\,d\mathfrak{m}(f) \qquad (1.14)$$
$$\ge \int\int \log\Big(\frac{q(fy)}{q(y)}\,|\det Df(y)|\Big)\,q(y)\,dm(y)\,d\mathfrak{m}(f)$$
$$= \int\int \log|\det Df(y)|\,d\rho(y)\,d\mathfrak{m}(f) + \int\int \log q(fy)\,d\rho(y)\,d\mathfrak{m}(f) - \int \log q(y)\,d\rho(y),$$
and the equality in (1.14) holds if and only if
$$\frac{q(fy)}{q(y)}\,|\det Df(y)| \equiv \text{const} \quad \rho\times\mathfrak{m}\text{-a.s.} \qquad (1.15)$$
In view of (1.12) the relation (1.15) is equivalent to $f^{-1}\rho(\Gamma) = \rho(\Gamma)$ $\mathfrak{m}$-a.s. for each Borel $\Gamma \subset M$, which says that $\rho$ is $f$-invariant.
Since $\rho$ is $P^*$-invariant,
$$\int\int_M \log q(fy)\,d\rho(y)\,d\mathfrak{m}(f) = \int_M \log q(y)\,d\rho(y). \qquad (1.16)$$
Besides, by (1.5) and the ergodic theorem,
$$\int\int \log|\det Df(y)|\,d\rho(y)\,d\mathfrak{m}(f) = \sum_i m_i(\rho)\,a_i(\rho). \qquad (1.17)$$
Finally, (1.14), (1.16) and (1.17) imply (1.11), and the equality in (1.11) holds if and only if (1.15) is satisfied, which is equivalent to the $f$-invariance of $\rho$. ∎
In the smooth situation the action on the tangent bundle is determined by the action on the manifold itself. This provides certain connections between the metric entropy and the characteristic exponents of random diffeomorphisms. We shall modify the proof of the deterministic Margulis-Ruelle inequality from Ledrappier [37], Ch. II, Theorem 2.2 to obtain the following result.
Theorem 1.4. In the circumstances of Theorem 1.1 one has
$$h_\rho(f) \le \sum_{i:\,a_i(\rho)>0} m_i(\rho)\,a_i(\rho). \qquad (1.18)$$
Proof. If $\operatorname{supp}\mathfrak{m}$ is compact in the $C^1$-topology then we may follow the proof of Theorem 2.2 from [37] almost verbatim. Since, in general, $\operatorname{supp}\mathfrak{m}$ is not compact we shall need some alterations. Let $\zeta > 0$ be small enough. We shall write $\omega \in \Omega_k$, $k \ge 1$ if $d(x,y) \le \zeta$ implies
(1.19)
where Exp is the exponential map, and
(1.20)
Put $\tilde\Omega_k = \Omega_k \setminus \Omega_{k-1}$, $k = 1,2,\dots$; then $\{\tilde\Omega_k,\ k = 1,2,\dots\}$ is a countable partition of $\Omega$.
Let $E_\zeta$ be a maximal $\zeta$-separated set, i.e. a maximal set with the distance between any pair of its points more than $\zeta$. Define a partition $\alpha_\zeta = \{\alpha_\zeta(x),\ x \in E_\zeta\}$ of $M$ such that for each $x \in E_\zeta$, $\alpha_\zeta(x)$ is contained in the closure of its interior, and the interior of $\alpha_\zeta(x)$ is the set of all $y$ satisfying $\operatorname{dist}(y,x) < \operatorname{dist}(y,x_i)$ for every $x_i \in E_\zeta$ with $x_i \ne x$. Denote
$$\alpha = \{\alpha_\zeta \times \tilde\Omega_k,\ k = 1,2,\dots\};$$
then $\alpha$ is a countable partition of $M \times \Omega$. The reader can easily check that the theory of Section 2.1 remains the same if we consider countable partitions in place of finite partitions. This leads to the same entropy, and we shall use here the corresponding results from Section 2.1 as if they were proved for countable partitions. By Theorem II.1.4 and Corollary II.1.2 (i) and (iii) one has
(1.21)
By Theorem II.1.1,
(1.22)
since by Lemma II.1.2 (i), (v) and (ix),
$$H_\rho\Big(T^{-(\ell-1)n}\alpha \,\Big|\, \bigvee_{i=0}^{\ell-2}\big(T^{-in}\alpha\big) \vee \big(M\times\mathcal{B}_\Omega\big)\Big) \le H_\rho(\alpha).$$
Next, by Lemma II.1.1, in the same way as in Corollary III.1.1,
$$H_\rho\big({}_nf^{-1}\alpha_\zeta \,\big|\, \alpha_\zeta\big) \le \sum_{x\in E_\zeta} \rho(\alpha_\zeta(x)) \log N^{n}_{x,\omega}, \qquad (1.23)$$
where $N^{n}_{x,\omega}$ is the number of elements of the partition ${}_nf^{-1}\alpha_\zeta$ which intersect $\alpha_\zeta(x)$.
By (1.19) and the maximality of $E_\zeta$, for any $\omega \in \tilde\Omega_k$ one has
(1.24)
where $B(Q,\delta)$ denotes the $\delta$-neighborhood of a set $Q$.
Remark that for any $m\times m$ matrix $A$ the number of disjoint balls of radius $\gamma/2$ which can intersect $B(A(B(0,\gamma)), 2\gamma)$ does not exceed $C \prod_{1\le i\le m} \max(\sigma_i(A),1)$, where $C$ depends only on the dimension $m$. Indeed, if $A = K_1 \Delta K_2$ is the decomposition of $A$ with unitary $K_1, K_2$ and the diagonal $\Delta$, then the number in question will be the same for $A$ and $\Delta$. Clearly, this number is independent of $\gamma$, and taking $\gamma = 2$ we conclude that this number does not exceed the volume of the parallelepiped with the sides $4(\sigma_i + 3)$, $i = 1,\dots,m$, divided by the volume of the unit ball in $\mathbb{R}^m$. These imply our estimate.
By (1.24), if $\omega \in \tilde\Omega_k$, $y \in E_\zeta$ and $\alpha_\zeta(y)$ intersects ${}_nf(\omega)\alpha_\zeta(x)$, then $B(y, 2\zeta k)$ intersects the set in the right hand side of (1.24). By the above remark the number of such balls $B(y, 2\zeta k)$ does not exceed $C \prod_{1\le i\le m} \max\big(\sigma_i(D_x\,{}_nf(\omega)), 1\big)$, since these balls are disjoint for different $y \in E_\zeta$. Thus
$$N^{n}_{x,\omega} \le C \prod_{1\le i\le m} \max\big(\sigma_i(D_x\,{}_nf(\omega)),\, 1\big). \qquad (1.25)$$
By (1.20) and (1.23) it follows from here that
$$\int H_\rho\big({}_nf^{-1}\alpha_\zeta \,\big|\, \alpha_\zeta\big)\,dp(\omega) \le \log C + \sum_i \int\int \log^+ \sigma_i\big(D_y\,{}_nf(\omega)\big)\,d\rho(y)\,dp(\omega). \qquad (1.26)$$
This together with (1.21) and (1.22) gives
(1.27)
Letting $n \to \infty$ and taking into account (1.5) we obtain (1.18). ∎
Remark 1.1. Theorem 1.4 says, in particular, that if $h_\rho(f) > 0$ then the biggest characteristic exponent is positive. If $\mathfrak{m}$-almost all diffeomorphisms preserve the same smooth measure then a modification of Mañé's proof [35] of the Pesin formula will give the equality between the metric entropy and the sum of positive characteristic exponents for a random diffeomorphism.
On the Riemannian manifold $M$ there is one special probability measure, the normalized Riemannian volume $m$. The measure $m$ is quasi-invariant with respect to each diffeomorphism $f$, i.e. the density
$$\frac{dfm}{dm}(x) = |\det Df_{f^{-1}x}|^{-1} \qquad (1.28)$$
exists and, moreover, it is continuous. Here $Df_y$ is the restriction of the differential $Df$ to $T_yM$, and some orthonormal bases are fixed in both $T_{f^{-1}x}M$ and $T_xM$. If there exists no measure $\nu$ satisfying (IV.3.4), we can apply Theorem IV.3.1 to obtain the inequality (IV.3.6). If, in addition, $\mathfrak{m}$-almost all $f$ preserve the Riemannian volume $m$, i.e. $|\det Df_x| = 1$ $\mathfrak{m}$-a.s., then $\beta_0(\rho) > 0$.
The quasi-invariance of the measure $m$ will enable us to prove the following ergodic theorem. We shall use again the notations of Section I.2: $\Omega = \mathcal{H}^\infty$, $p = \mathfrak{m}^\infty$, $\vartheta$ is the shift, $T$ is the skew product operator and ${}_kf = f_k \circ \cdots \circ f_1$, where $\{f_i\}$ are independent random diffeomorphisms with the distribution $\mathfrak{m}$.
Theorem 1.5. Define
$$C = \Big\{x \in M : \sum_{k=1}^{\infty} |\det D\,{}_kf_x| = \infty \ \ p\text{-a.s.}\Big\}. \qquad (1.29)$$
Then for any function $g \in L^1(M,m)$ and $m$-almost all $x \in C$ one has
$$\lim_{n\to\infty} \frac{\sum_{k=1}^{n} g({}_kf x)\,|\det D\,{}_kf_x|}{\sum_{k=1}^{n} |\det D\,{}_kf_x|} = \bar g(x) \quad p\text{-a.s.,} \qquad (1.30)$$
where $\bar g$ is a function defined on $C$,
$$\bar g \circ f = \bar g \quad m\times\mathfrak{m}\text{-a.s.} \quad\text{and}\quad \int_C \bar g\,dm = \int_C g\,dm. \qquad (1.31)$$
Furthermore, for $m$-almost all $x \in M \setminus C$ both the numerator and the denominator in (1.30) tend to some limits $p$-a.s.
Proof. Put
$$d_{{}_kf(\omega)}(x) = |\det D\,{}_kf(\omega)_x|; \qquad (1.32)$$
then
$$d_{{}_{k+1}f(\omega)}(x) = d_{f_1(\omega)}(x)\, d_{{}_kf(\vartheta\omega)}(f_1(\omega)x). \qquad (1.33)$$
Consider the operator
$$(Vh)(x,\omega) = d_{f_1(\omega)}(x)\, h(f_1(\omega)x,\, \vartheta\omega) \qquad (1.34)$$
acting on functions $h$ from $L^1(M\times\Omega,\, m\times p)$. Then
$$\int |Vh|\,d(m\times p) = \int\int |h(y,\vartheta\omega)|\,dm(y)\,dp(\omega) = \int |h|\,d(m\times p) \qquad (1.35)$$
since $|\det Df_x|\,dm(x) = d\,f^{-1}m(x)$. Hence $V$ preserves the norm in $L^1(M\times\Omega,\, m\times p)$. Since $V$ is also positive, it is a sub-Markov operator (see Neveu [37], Section V.4). Remark also that
$$\sum_{k=1}^{n} d_{{}_kf(\omega)}(x) = \sum_{k=1}^{n} V^k 1\,(x,\omega) \qquad (1.36)$$
and
$$\sum_{k=1}^{n} g({}_kf(\omega)x)\, d_{{}_kf(\omega)}(x) = \sum_{k=1}^{n} V^k g\,(x,\omega). \qquad (1.37)$$
Next, we intend to employ the Chacon-Ornstein theorem (see Neveu [37], Section V.6), but before doing this we must specify the notion of invariant sets. First, consider
$$\tilde C = \Big\{(x,\omega) : \sum_{k=1}^{\infty} d_{{}_kf(\omega)}(x) = \infty\Big\};$$
then by (1.33),
$$\sum_{k=1}^{\infty} d_{{}_kf(\omega)}(x) = d_{f_1(\omega)}(x)\Big(1 + \sum_{k=1}^{\infty} d_{{}_kf(\vartheta\omega)}(f_1(\omega)x)\Big). \qquad (1.38)$$
This implies that
$$\chi_{\tilde C} \circ T = \chi_{\tilde C}. \qquad (1.39)$$
The invariant sets are defined in Proposition V.5.2 of [37] as the subsets of $\tilde C$ having the form
$$G_h = \Big\{(x,\omega) : \sum_{k=1}^{\infty} V^k h(x,\omega) = \infty\Big\} \qquad (1.40)$$
for different functions $h \in L^1(M\times\Omega,\, m\times p)$. From (1.38) one can easily see that a set $A \subset \tilde C$ is invariant if and only if
$$\chi_A \circ T = \chi_A. \qquad (1.41)$$
Indeed,
$$\sum_{k=1}^{\infty} V^k h(x,\omega) = Vh(x,\omega) + d_{f_1(\omega)}(x) \sum_{k=1}^{\infty} V^k h\,(T(x,\omega)),$$
and so $\chi_{G_h} \circ T = \chi_{G_h}$. On the other hand, if $A \subset \tilde C$ and (1.41) is true, then
$$\sum_{k=1}^{\infty} V^k \chi_A(x,\omega) = \chi_A(x,\omega) \sum_{k=1}^{\infty} d_{{}_kf(\omega)}(x)$$
and so $G_{\chi_A} = A$.
Now we can apply the Chacon-Ornstein theorem, which asserts in our case that for any function $h \in L^1(M\times\Omega,\, m\times p)$ and $m\times p$-almost all $(x,\omega) \in \tilde C$,
$$\lim_{n\to\infty} \frac{\sum_{k=1}^{n} V^k h(x,\omega)}{\sum_{k=1}^{n} V^k 1(x,\omega)} = \bar h(x,\omega), \qquad (1.42)$$
where
$$\bar h \circ T = \bar h \quad m\times p\text{-a.s.} \quad\text{and}\quad \int_{\tilde C} \bar h\,d(m\times p) = \int_{\tilde C} h\,d(m\times p). \qquad (1.43)$$
The reader can review the proofs of Lemma I.2.2, Theorem I.2.1 and Corollary I.2.1 to conclude that they go through in the case of a quasi-invariant measure, as well. Proposition I.2.1 is also true (see Kifer and Pirogov [24]). This enables us to derive from (1.39), (1.42), (1.43) and the definitions of $C$ and $\tilde C$ that $\chi_{\tilde C} = \chi_{C\times\Omega}$ $m\times p$-a.s., i.e. the symmetric difference between $\tilde C$ and $C \times \Omega$ has $m\times p$-measure zero, and $\bar h$ in (1.42) depends only on $x$ $p$-a.s. If $h$ itself depends only on $x$ then we obtain the assertions (1.30) and (1.31) of Theorem 1.5. The remaining part of Theorem 1.5 follows from Proposition V.6.4 in Neveu [37]. ∎
Remark 1.2. Theorem 1.5 can easily be extended to the case of general random transformations with a quasi-invariant measure. Actually, one needs only that there exists a $P^*$-quasi-invariant measure $m$, i.e. $P^*m \sim m$; then it follows similarly to Lemma I.2.2 that $m\times p$ is $T$-quasi-invariant.
Another issue we are going to discuss in this section is a version of the stable manifold theorem. We shall assume that $\mathfrak{m}$ is concentrated on the space $\mathcal{D}^{1+\vartheta}(M)$ of $C^{1+\vartheta}$-diffeomorphisms, i.e. diffeomorphisms whose differentials are Hölder continuous with an exponent $\vartheta > 0$. Let $\eta$ be a $P^*$-invariant probability measure. Since $M$ is compact, by Proposition I.2.1 $\eta$ has an ergodic decomposition, and so by (III.1.9) the characteristic exponents $a_i(x,\omega)$ from Theorem III.1.1 depend $p$-a.s. only on $x$, which we express by writing $a_i(x)$. From Theorem 5.1 of Ruelle [43] one can derive the following result.
Theorem 1.6. Let
$$\int \log^+\|Df\|_{x,\vartheta}\,d\eta(x)\,d\mathfrak{m}(f) < \infty, \qquad (1.44)$$
where $\|Df\|_{x,\vartheta}$ is the corresponding Hölder norm of the differential at $x$. Suppose that $\lambda < 0$ is different from all characteristic exponents $\{a_i(x)\}$ at $x$ and all of them are bigger than $-\infty$. Then there exist some measurable functions $q(x,\omega) > r(x,\omega) > 0$ such that for $\eta\times p$-almost all $(x,\omega)$ the set
$$V^{\lambda}_{(x,\omega)}(r(x,\omega)) = \big\{y \in B(x, r(x,\omega)) : \operatorname{dist}({}_nf(\omega)x,\, {}_nf(\omega)y) \le q(x,\omega)\,e^{\lambda n} \text{ for all } n \ge 0\big\} \qquad (1.45)$$
is a $C^{1+\vartheta}$ submanifold of $B(x, r(x,\omega))$ (called the stable manifold at $x$) tangent at $x$ to $V^{\lambda}_{(x,\omega)} = \bigcup\{V^{i}_{(x,\omega)} : a_i \le \lambda\}$, where $B(x,\delta) = \{y : \operatorname{dist}(y,x) \le \delta\}$.
The proof of this theorem can be obtained by adapting to our situation the arguments of Sections 5 and 6 from Ruelle [43]. Some details can be found in Carverhill [9].
Notice that Theorem 1.6 claims the existence of a submanifold $V^{\lambda}_{(x,\omega)}$ depending on $\omega$. We have seen in Theorems III.2.1 and III.2.2 that in certain situations $V^{\lambda}_{(x,\omega)}$ may have a non-random part. In these circumstances it is natural to have a non-random stable manifold tangent to $V^{\lambda}$. This question was studied recently by Brin and Kifer. Let $L^{\lambda} = \{L^{\lambda}_x\}$ be the maximal non-random subbundle of $\rho$-almost all $V^{\lambda}_{(x,\omega)}$ in the sense of Theorem III.2.2. If $L^{\lambda}_x$ is continuous, then Brin and Kifer have proved that there exists a non-random submanifold $W^{\lambda}_x$ tangent to $L^{\lambda}_x$ such that $p$-almost every intersection $W^{\lambda}_x \cap V^{\lambda}_{(x,\omega)}$ contains an open neighborhood of $x$ in $W^{\lambda}_x$, i.e. $p$-a.s. $V^{\lambda}_{(x,\omega)}$ contains a piece of a non-random submanifold.
Remark 1.3. If $\operatorname{dist}({}_nfx,\, {}_nfy) \to 0$ as $n \to \infty$ $p$-a.s., then for any continuous function $g$ on $M$,
$$\lim_{n\to\infty} \frac{1}{n+1} \sum_{k=0}^{n} \big(g({}_kfx) - g({}_kfy)\big) = 0.$$
Thus if $\tilde g = \lim_{n\to\infty} \frac{1}{n+1} \sum_{k=0}^{n} g \circ {}_kf$, which exists $\eta\times p$-a.s. by the ergodic theorem, then $\tilde g$ must be constant along the stable manifolds. This remark was essential in Anosov and Sinai's proof [3] of ergodicity of Anosov diffeomorphisms preserving a smooth measure. Similarly, if random diffeomorphisms have an invariant family of stable manifolds satisfying certain conditions, then one can prove ergodicity of a smooth $P^*$-invariant measure (provided it exists).
Notice that in Theorems 1.1, 1.2, 1.4 and 1.5 we did not actually need $f$ to be a diffeomorphism $\mathfrak{m}$-a.s. The application of Theorem III.2.1 requires only that $Df$ is regular at $\rho$-almost all $x$, i.e. it maps $T_xM$ onto the whole of $T_{fx}M$. Then if
$$\int\int \log^+\|(Df)^{-1}\|_x\,d\mathfrak{m}(f)\,d\rho(x) < \infty \qquad (1.46)$$
one obtains the assertion of Theorem 1.2. Here $(Df)^{-1}$ means the inverse of $Df$, which exists if $Df$ is regular whether or not $f$ is one-to-one.
The calculations of characteristic exponents become especially simple in the one-dimensional case, i.e. when $\mathfrak{m}$ is concentrated on the space of smooth maps of the circle $S^1$. In this case $Df$ is just the derivative, and if $\rho$ is a $P^*$-invariant ergodic measure, then by the ergodic theorem (Corollary I.2.2)
$$\beta(\rho) = \lim_{n\to\infty} \frac1n \log|D\,{}_nf(x)| = \lim_{n\to\infty} \frac1n \sum_{k=1}^{n} \log|Df_k({}_{k-1}fx)| = \int\int \log|Df(x)|\,d\rho(x)\,d\mathfrak{m}(f) \quad \rho\times p\text{-a.s.} \qquad (1.47)$$
So as soon as a $P^*$-invariant measure is specified, we can obtain the characteristic exponent $\beta(\rho)$ by (1.47) as an integral.
Example 1.1. Let $\mathfrak{m}$ have a mass $p > 0$ at the non-random map $f(z) = z^2$ of the circle $S^1$, and let the remaining mass $1-p$ be distributed on the set of rotations $f_\varphi(z) = e^{i\varphi}z$ of $S^1$. Since $|Df_\varphi| \equiv 1$ and $|Df| \equiv 2$, we see from (1.47) that $\beta(\rho) = p\log 2$, where $\rho$ is the Lebesgue measure.
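The integral (1.47) can also be sampled directly in Example 1.1: each step contributes $\log|Df| = \log 2$ with probability $p$ (the map $z \to z^2$) and $\log 1 = 0$ otherwise (a rotation). A minimal Monte Carlo sketch; the particular values of $p$ and of the step count are our arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(2)

def exponent_estimate(p, n=200_000):
    """Sample beta(rho) = lim (1/n) sum_k log|Df_k| for the random circle
    map of Example 1.1: z -> z^2 with probability p, a rotation otherwise."""
    doubling = rng.random(n) < p                        # steps using z -> z^2
    return np.where(doubling, np.log(2.0), 0.0).mean()  # |Df| is 2 or 1

p = 0.3
print(exponent_estimate(p), p * np.log(2.0))  # estimate vs exact value p log 2
```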
Remark 1.4. Clearly, if $\mathfrak{m}$ is concentrated on rotations of $S^1$ then the corresponding characteristic exponent is zero. This together with Example 1.1.1 and Example 1.1 produces representations of the same family of transition probabilities by means of different random transformations with different characteristic exponents.
Example 1.2. Let $f$ be a diffeomorphism of $S^1$ having exactly two fixed points $O_1$ and $O_2$ such that $|Df(O_1)| > 1$ and $|Df(O_2)| < 1$, i.e. $O_1$ is a source and $O_2$ is a sink. Consider a random diffeomorphism $f_\varepsilon$ given by $f_\varepsilon x = fx$ with probability $p > 0$ and $f_\varepsilon z = e^{i\varphi_\varepsilon}z$ with probability $1-p$, where $\varphi_\varepsilon$ is a random variable uniformly distributed on $[-\varepsilon,\varepsilon]$. It is easy to see that in this situation there is a unique $P^*_\varepsilon$-invariant measure $\rho_\varepsilon$, which converges weakly as $\varepsilon \to 0$ to the point measure $\delta_{O_2}$ concentrated at $O_2$. This can be proved by the same arguments as in Section 4 of Kifer and Slud [26], comparing the times which the process spends near $O_1$ and near $O_2$. Now by (1.47) we can see that $\beta(\rho_\varepsilon) \to \log|Df(O_2)|$ as $\varepsilon \to 0$. The question about stable manifolds is even simpler here. Indeed, the stable manifold for $f$ of each point on $S^1$ except for $O_1$ is $S^1 \setminus O_1$. But all rotations are isometries and so they do not change the distances. Hence the stable manifolds defined by (1.45) will remain the same as for the deterministic transformation $f$. A similar example concerning one-dimensional stochastic flows (the "noisy North-South flow") was considered by Carverhill [9].
Our next example is the multidimensional analog of Example
1.1.
Example 1.3. Let $A$ be an automorphism of the $m$-dimensional torus $\mathbb{T}^m$, i.e. it can be represented by a matrix $(a_{ij})$ with integer entries and $\det A = 1$. Suppose that $\mathfrak{m}$ has a mass $p > 0$ at $A$ and the remaining mass $1-p$ is distributed on the set of rotations of $\mathbb{T}^m$. Since $Df \equiv (a_{ij})$, and all rotations are isometries and induce the identity transformation of the tangent bundle, the characteristic exponents, the subbundles $J^{i}_x$ from (1.7) and the stable manifolds of the corresponding random diffeomorphism will be exactly the same as for $A$. The characteristic exponents of $A$ are the numbers $\log|\lambda_i|$, where the $\lambda_i$ are eigenvalues of $(a_{ij})$. The subspaces $J^{i}_x$ are spanned by the corresponding eigendirections of $(a_{ij})$. The stable manifolds $V^{\lambda}$ defined by (1.45) are the linear spans of the eigenspaces corresponding to $\lambda_i$ with $\log|\lambda_i| \le \lambda$.
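A concrete instance of Example 1.3 (our choice of matrix, not one fixed by the text) is the automorphism of the 2-torus given by the matrix with rows $(2,1)$ and $(1,1)$. Its characteristic exponents are $\log|\lambda_i|$, and since $\det A = 1$ they sum to zero:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])   # integer entries, det A = 1
lam = np.linalg.eigvals(A)
exponents = np.sort(np.log(np.abs(lam)))[::-1]

# the eigenvalues are (3 +/- sqrt(5))/2, so the exponents are
# +/- log((3 + sqrt(5))/2), one positive and one negative
print(exponents, exponents.sum())
```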
5.2 Stochastic flows.
We shall start this section with the notion of a general continuous stochastic flow, and then we shall pass to stochastic flows generated by stochastic differential equations. This last topic is discussed in many recent papers from different points of view. We shall not pretend to give the full bibliography on the subject and we shall not discuss questions of priority. So the main reason for specific references will be the convenience of the reader.
Let $M$ be a Polish space, $\mathbb{R}_+ = [0,\infty)$ and $f : \mathbb{R}_+ \times M \to M$ be a random map defined on some probability space $(\Omega, p)$ such that $f(t,x) = f^t x$ is continuous in $(t,x)$ $p$-a.s. If $\mathcal{H}$ is the space of continuous maps of $M$ into $M$ with some fixed measurable structure such that the map $\varphi : \mathbb{R}_+ \times \Omega \to \mathcal{H}$ acting by the formula $\varphi(t,\omega) = f^t(\omega)$ is measurable, then we can define a family of measures $\mathfrak{m}^t \in \mathcal{P}(\mathcal{H})$ by $\mathfrak{m}^t = \varphi(t,\cdot)p$, i.e.
$$\mathfrak{m}^t(\Phi) = p\{\omega : f^t(\omega) \in \Phi\}$$
for any measurable subset $\Phi \subset \mathcal{H}$. We shall call $f^t$ a stochastic flow if for all $t,s \ge 0$,
$$\int g(f^{t+s})\,dp = \int\int g(\tilde f \circ f^s)\,d\mathfrak{m}^t(\tilde f)\,dp \qquad (2.1)$$
for any measurable function $g$ on $\mathcal{H}$. The relation (2.1) means that $f^{t+s}$ can be represented as $f^{t+s} = \tilde f^t \circ f^s$, where $\tilde f^t$ is a random map independent of $f^s$ and having the distribution $\mathfrak{m}^t$. Putting
$$P(t,x,\Gamma) = \mathfrak{m}^t\{f : fx \in \Gamma\},$$
we have by (2.1) the Chapman-Kolmogorov equality

    P(t+s, x, Γ) = ∫ P(t, x, dy) P(s, y, Γ).

Thus x_t = f^t x₀ is a Markov process provided x₀ is independent of all f^t. Similarly to Section 1.2 we introduce the operators P^t corresponding to the transition probabilities P(t,x,·) and their adjoints P^{t*}. We shall say that a measure ρ ∈ P(M) is P*-invariant if ρ is P^{t*}-invariant for every t ≥ 0.

Lemma 2.1. If M is compact then there exists at least one P*-invariant measure ρ ∈ P(M).
Proof. Under our continuity assumptions Lemma 1.2.2 implies that for any n there exists a P^{1/n*}-invariant measure ρ_n. Now take a subsequence n_i → ∞ such that ρ_{n_i} weakly converges to some probability measure ρ. If g is a continuous function on M then

    ∫ P^t g dρ = lim_{i→∞} ∫ P^{t−[tn_i]/n_i} P^{[tn_i]/n_i} g dρ_{n_i}                    (2.2)
              = lim_{i→∞} ∫ P^{t−[tn_i]/n_i} g dρ_{n_i} = ∫ g dρ

since ρ_{n_i} is P^{1/n_i*}-invariant and sup_x |P^s g(x) − g(x)| → 0 as s → 0. The equality (2.2) is true for each continuous function g, which says that ρ is P^{t*}-invariant. ∎
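As a quick sanity check (an illustration, not part of the text), the Chapman-Kolmogorov equality P(t+s,x,·) = ∫ P(t,x,dy)P(s,y,·) can be verified for the explicit transition matrices of a two-state continuous-time Markov chain; the generator below is chosen arbitrarily.

```python
import numpy as np

def transition_matrix(t):
    # Transition probabilities for the two-state chain with generator
    # Q = [[-1, 1], [1, -1]]:
    # P(t) = (1/2) [[1 + e^{-2t}, 1 - e^{-2t}], [1 - e^{-2t}, 1 + e^{-2t}]].
    e = np.exp(-2.0 * t)
    return 0.5 * np.array([[1 + e, 1 - e],
                           [1 - e, 1 + e]])

t, s = 0.7, 1.3
P_ts = transition_matrix(t + s)
# Integration over the intermediate state y reduces to matrix
# multiplication in the finite-state case.
P_composed = transition_matrix(t) @ transition_matrix(s)
```

Each row of these matrices sums to one, and P_ts agrees with P_composed up to rounding.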
We shall define the metric and the topological entropies h_ρ(f) and h(f) of the stochastic flow as h_ρ(f¹) and h(f¹), respectively. This
definition makes sense in view of the following result.
Lemma 2.2. For any t ≥ 0,

    h_ρ(f^t) = t h_ρ(f¹)                    (2.3)

and

    h(f^t) = t h(f¹).                    (2.4)
Proof. From Corollary II.1.2(i) it follows that both (2.3) and (2.4) are true for all rational t. To prove (2.3) choose an increasing family of finite partitions ξ₁ ≺ ξ₂ ≺ ⋯ such that ⋁_i ξ_i generates the Borel σ-algebra on M and the boundaries ∂A = Ā \ int A of the elements A of the partitions ξ_i have ρ-measure zero. By Corollary II.1.2(iii),

    lim_{i→∞} h_ρ(f^t, ξ_i) = h_ρ(f^t)                    (2.5)

for any t ≥ 0. Since

    b_n(t,ξ) = ∫ H_ρ( ⋁_{j=0}^{n−1} ((f^t)^j)^{−1} ξ ) dP,                    (2.6)

where (f^t)^j denotes the composition f̃_j^t ∘ ⋯ ∘ f̃_1^t of j independent copies of f^t, is a subadditive sequence (see the proof of Theorem II.1.1), then

    h_ρ(f^t, ξ) = lim_{n→∞} (1/n) b_n(t,ξ) = inf_n (1/n) b_n(t,ξ).                    (2.7)

The boundaries of all elements of the partitions ξ_i have ρ-measure zero and so the boundaries of all elements of the partitions (f^t(ω))^{−1} ξ_i also have ρ-measure zero for P-almost all ω. Indeed, if ρ(G) = 0 then by the P*-invariance of ρ one has

    0 = ρ(G) = ∫ ρ(f^{−1}G) dm_t(f),

i.e., ρ(f^{−1}G) = 0 for m_t-almost all f, proving the above assertion. These imply that the boundaries of all elements of ⋁_{j=0}^{n−1} ((f^t)^j)^{−1} ξ_i have ρ-measure zero. This together with the continuity of the stochastic flow f^t in t yields that b_n(t,ξ_i) is continuous in t. Thus, by (2.5)-(2.7),

    limsup_{t→t₀} h_ρ(f^t) = limsup_{t→t₀} lim_{i→∞} h_ρ(f^t, ξ_i)                    (2.8)
        ≤ liminf_{i→∞} inf_n (1/n) lim_{t→t₀} b_n(t, ξ_i) = h_ρ(f^{t₀}).
Consider the function ψ(t) = (1/t) h_ρ(f^t). By Corollary II.1.2(i) it is easy to see that ψ(rt) = ψ(t) for any rational number r > 0. On the other hand, from (2.8) it follows that ψ(t) is upper semi-continuous. Since the rational numbers are dense, these two conditions can only be satisfied if ψ(t) = const, proving (2.3). The equality (2.4) can be established in the same way by employing the subadditivity argument to prove the upper semi-continuity of (1/t) h(f^t). ∎
Next we shall pass to the smooth case where we shall study stochastic flows generated by stochastic differential equations on a compact m-dimensional Riemannian manifold M of C³-class. We assume that the reader is familiar with the standard machinery of stochastic differential equations, which can be found in Friedman [15]. For a more advanced exposition connected with stochastic differential equations and stochastic flows on manifolds, and for other references on this subject, we refer the reader to Ikeda and Watanabe [22] and Kunita [30].
Consider a diffusion Markov process x_t on M which has continuous trajectories and solves a stochastic differential equation of the form

    dx_t = Σ_{1≤i≤m} v_i(x_t) δw_t^i + v_0(x_t) dt                    (2.9)

where v_0, …, v_m are smooth C³-class vector fields on M, w_t = (w_t^1, …, w_t^m) is a standard Brownian motion and the differential δw_t is taken in the Stratonovitch form (see Kunita [30]). One can understand (2.9) in the sense that for any smooth function g on M,

    g(x_t) = g(x₀) + ∫₀^t v_0 g(x_s) ds + Σ_i ∫₀^t v_i g(x_s) δw_s^i.                    (2.10)

Define f^t x = x_t provided x₀ = x. Then almost surely f^t is a C¹-class diffeomorphism of M for each t ≥ 0 (Kunita [30]). Actually, this is true under milder assumptions on the vector fields {v_i}: only Hölder continuity of the second derivatives of v_1, …, v_m and of the first derivatives of v_0 is required.
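To make the derivative flow concrete, here is a small numerical sketch (an illustration under assumed coefficients, not from the text): for a scalar equation dx_t = b(x_t)dt + a dw_t with constant diffusion (so the Itô and Stratonovitch forms coincide), the derivative of the flow map satisfies Df^T = exp ∫₀^T b'(f^s x) ds along the trajectory, which we check against a finite difference of two trajectories driven by the same noise.

```python
import numpy as np

rng = np.random.default_rng(0)
a, T, dt = 1.0, 1.0, 1e-3
n = int(T / dt)
b = lambda x: 2.0 + np.sin(x)    # assumed smooth drift
db = lambda x: np.cos(x)         # its derivative

dw = rng.normal(0.0, np.sqrt(dt), n)   # one Brownian path shared by both runs

def euler_path(x0):
    # Euler-Maruyama discretization of dx = b(x)dt + a dW.
    xs = np.empty(n + 1)
    xs[0] = x0
    for k in range(n):
        xs[k + 1] = xs[k] + b(xs[k]) * dt + a * dw[k]
    return xs

x0, h = 0.3, 1e-6
path = euler_path(x0)
# Derivative of the flow map along the path: Df^T = exp(int_0^T b'(f^s x) ds).
v_exact = np.exp(np.sum(db(path[:-1])) * dt)
# Finite difference of two flows driven by the same noise realization.
v_fd = (euler_path(x0 + h)[-1] - path[-1]) / h
```

The two values agree up to the discretization and finite-difference errors, illustrating that the derivative flow solves the linearized (variational) equation along the trajectory.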
We can rewrite (2.10) in the form

    g(f^t x) = g(x) + ∫₀^t v_0 g(f^s x) ds + Σ_i ∫₀^t v_i g(f^s x) δw_s^i                    (2.11)

which must be satisfied for any smooth function g. From the Markov property it is easy to see that f^{t+s} = f̃_1^t ∘ f̃_2^s for some mutually
independent random diffeomorphisms f̃_1^t and f̃_2^s having the same distributions as f^t and f^s, respectively. Therefore f^n can be represented as the composition f^n = f̃_n^1 ∘ ⋯ ∘ f̃_1^1 of mutually independent random diffeomorphisms f̃_1^1, …, f̃_n^1 having the same distribution as f¹, which we denote again by m. We are going to show that the condition (1.6) holds here, and so we can apply Theorem 1.2 in this case. Actually, we shall prove more than (1.6), which will enable us to obtain a genuine continuous time version of Theorem 1.2.
Embed the manifold M together with the diffusion process x_t into some Euclidean space ℝ^m̃ with m̃ ≥ m and extend the coefficients of the equation (2.9) to the whole of ℝ^m̃ so that they remain of C³-class and are equal to zero outside of some ball containing M. The extended diffusion process and the extended coefficients we denote by x̃_t and ṽ_i, i = 0, …, m, respectively. Thus we obtain the equation

    dx̃_t = Σ_{1≤i≤m} ṽ_i(x̃_t) δw_t^i + ṽ_0(x̃_t) dt.                    (2.12)

Notice that we have here, possibly, fewer vector fields ṽ_i than the dimension m̃ of ℝ^m̃. Moreover, we do not require that they be non-zero. Define f̃^t x = x̃_t provided x̃₀ = x. Then again with probability one f̃^t forms a family of C¹-class diffeomorphisms of ℝ^m̃. Since we did not change the coefficients of (2.9) on M itself, M remains invariant for the process x̃_t, i.e., once x̃_t starts in M it never leaves M. This means that M is invariant with respect to the diffeomorphisms f̃^t as well. Moreover, f̃^t x = f^t x for any x ∈ M. Then it follows that

    ‖Df̃^t‖_x = ‖Df^t‖_x   if x ∈ M.                    (2.13)

Now we can restrict our attention to the case of the Euclidean space ℝ^m̃. It is convenient to pass from the Stratonovitch form to
the Ito form of the stochastic differential equation (2.12). In the Ito form (2.12) looks as follows:

    dx̃_t = Σ_{1≤i≤m} ṽ_i(x̃_t) dw_t^i + v̂_0(x̃_t) dt                    (2.14)

where dw_t is the Ito differential and

    v̂_0 = ṽ_0 + (1/2) Σ_{1≤i≤m} (∂ṽ_i) ṽ_i.                    (2.15)
Again, by the Markov property we can write f̃^t = f̃_u^{t−u} ∘ f̃^u where f̃^u and f̃_u^{t−u} are independent random diffeomorphisms, with f̃_u^{t−u} having the same distribution as f̃^{t−u}, u ≤ t. The differentials Df̃_u^{t−u} and (Df̃^t)^{−1} of the random diffeomorphisms f̃_u^{t−u} and (f̃^t)^{−1} satisfy for any t ≥ u ≥ 0 the following Ito stochastic integral equations (see Kunita [30]):

    (Df̃_u^{t−u})_x = I + Σ_{1≤i≤m} ∫_u^t ∂ṽ_i(f̃_u^{s−u} x)(Df̃_u^{s−u})_x dw_s^i                    (2.16)
        + ∫_u^t ∂v̂_0(f̃_u^{s−u} x)(Df̃_u^{s−u})_x ds

and

    (Df̃^t)_x^{−1} = I − Σ_{1≤i≤m} ∫₀^t (Df̃^s)_x^{−1} ∂ṽ_i(f̃^s x) dw_s^i                    (2.17)
        − ∫₀^t (Df̃^s)_x^{−1} ( ∂v̂_0(f̃^s x) + Σ_{1≤i,j≤m} ∂ṽ_i(f̃^s x) ∂ṽ_j(f̃^s x) ) ds,

where I is the identity matrix, ∂ṽ = (∂ṽ^k/∂x_j), and (Df)_x is the restriction of the differential Df to T_x M or, more precisely, since we are dealing with ℝ^m̃, (Df)_x is the Jacobian matrix of f at x, and (Df)_x^{−1} is the inverse of (Df)_x.
Employing standard martingale estimates for moments of Ito stochastic integrals (see, for instance, Friedman [15], Ch. 4, or Ikeda and Watanabe [22], Section 3 of Ch. III) and the Cauchy-Schwartz inequality, one obtains from (2.16) with u = 0 and from (2.17), in view of the uniform boundedness of all components of ∂ṽ_i, that

    E sup_{0≤t≤T} ‖(Df̃^t)_x^{±1}‖² ≤ C₁ + C₂ ∫₀^T E sup_{0≤s≤t} ‖(Df̃^s)_x^{±1}‖² dt                    (2.18)

where C₁, C₂ > 0 are some constants and E is the expectation on the natural probability space connected with the stochastic differential equations considered above. Now Gronwall's inequality applied to (2.18) gives that for any x ∈ M,

    E sup_{0≤t≤1} ‖(Df̃^t)_x^{±1}‖² ≤ C₁ e^{C₂}.                    (2.19)

This together with (2.13) implies

    E sup_{0≤t≤1} ( ‖(Df^t)_x‖² + ‖(Df^t)_x^{−1}‖² ) ≤ C₃                    (2.20)

where C₃ = 2C₁e^{C₂}. Since a² ≥ log⁺ a for each a > 0, then

    E sup_{0≤t≤1} ( log⁺‖Df^t‖_x + log⁺‖(Df^t)^{−1}‖_x ) ≤ C₃.                    (2.21)

In particular, sup_x E( log⁺‖Df¹‖_x + log⁺‖D(f¹)^{−1}‖_x ) ≤ C₃. The last expectation is simply the integral with respect to the measure m which is the distribution of f¹, and so (1.6) follows for any measure ρ ∈ P(M).
Since f¹ = f̃_u^{1−u} ∘ f^u, then

    (Df^u)_x = ((Df̃_u^{1−u})_{f^u x})^{−1} (Df¹)_x

and so

    log⁺‖Df^u‖_x ≤ log⁺‖(Df̃_u^{1−u})^{−1}‖_{f^u x} + log⁺‖Df¹‖_x.

Hence by (2.21),

    sup_x E sup_{0≤u≤1} log⁺‖Df^u‖_x ≤ 2C₃.                    (2.22)

Next, for any ξ ∈ T_x M one has

    log‖Df^{[t]} ξ‖ − B([t],x) ≤ log‖Df^t ξ‖ ≤ log‖Df^{[t]} ξ‖ + A([t],x)                    (2.23)

where [b] denotes the integral part of a number b. One concludes from (2.23) that

    lim_{t→∞} (1/t) log‖Df^t ξ‖ = lim_{n→∞} (1/n) log‖Df^n ξ‖                    (2.24)

provided that with probability one
    lim_{n→∞} (1/n) A(n,x) = 0   and   lim_{n→∞} (1/n) B(n,x) = 0                    (2.25)

for ρ-almost all x, where

    A(n,x) = log⁺ sup_{0≤u≤1} ‖Df̃_n^u‖_{f^n x}   and   B(n,x) = log⁺ sup_{0≤u≤1} ‖(Df̃_n^u)^{−1}‖_{f^n x},                    (2.26)

ρ ∈ P(M) is a P^{t*}-invariant measure and P_t(x,Γ) = P{f^t x ∈ Γ} is the transition probability of the Markov process x_t on M. Notice that by Theorem 1.2 the second limit in (2.24) exists ρ-a.s. and with probability one, since (2.21) implies (1.6). The proof of (2.25) is
standard. Indeed, by (2.21),

    a(n) ≡ ∫ E A(n,x) dρ(x) = ∫ E A(0,x) dρ(x) ≤ C₃                    (2.27)

and by (2.22),

    b(n) ≡ ∫ E B(n,x) dρ(x) = ∫ E B(0,x) dρ(x) ≤ 2C₃                    (2.28)

since ρ is P^{t*}-invariant. Then for any ε > 0,

    C₃ ≥ a(n) ≥ ε Σ_{n≥1} ∫ P{A(n,x) ≥ εn} dρ(x)                    (2.29)

and

    2C₃ ≥ b(n) ≥ ε Σ_{n≥1} ∫ P{B(n,x) ≥ εn} dρ(x).                    (2.30)

By the Borel-Cantelli lemma (Neveu [37]) it follows from (2.29) and (2.30) that for ρ × P-almost all (x,ω) there exists N_ε(x,ω) such that (1/n)A(n,x) < ε and (1/n)B(n,x) < ε when n ≥ N_ε(x,ω).
Taking some sequence ε_k ↓ 0 one obtains (2.25).
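The summability step used above — for an integrable quantity with a fixed distribution, Σ_n P{A ≥ εn} ≤ ε⁻¹ E A — can be illustrated numerically (an illustration with an assumed exponential distribution, not from the text):

```python
import numpy as np

eps = 0.25
# For A ~ Exp(1): P{A >= eps * n} = exp(-eps * n), and E A = 1.
n = np.arange(1, 10_000)
tail_sum = np.exp(-eps * n).sum()   # sum_n P{A >= eps n}, a geometric series
layer_cake_bound = 1.0 / eps        # eps^{-1} * E A

# Exact value of the geometric series: 1 / (e^{eps} - 1).
exact = 1.0 / np.expm1(eps)
```

Finiteness of this sum is exactly what the Borel-Cantelli lemma needs to conclude that A(n,x)/n < ε eventually.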
We shall summarize the above results in the following statement.
Theorem 2.1. Let f^t be a stochastic flow on a compact Riemannian C³-class manifold M satisfying (2.11). If ρ ∈ P(M) is a P^{t*}-invariant measure then one can choose a Borel subset M_ρ ⊂ M with ρ(M_ρ) = 1 such that for any x ∈ M_ρ there exist a sequence of linear subspaces of the tangent space T_x M,

    0 ⊂ L_x^{r(x)} ⊂ ⋯ ⊂ L_x^1 ⊂ L_x^0 = T_x M                    (2.31)

and a sequence of numbers

    −∞ < β_{r(x)}(x) < ⋯ < β_1(x) < β_0(x) < ∞                    (2.32)

such that if ξ ∈ L_x^i \ L_x^{i+1}, where L_x^i = 0 for i > r(x), then with probability one

    lim_{t→∞} (1/t) log‖Df^t ξ‖ = β_i(x)                    (2.33)

and

    lim_{t→∞} (1/t) log‖Df^t‖_x = β_0(x).                    (2.34)

The functions r, β_i, i = 0, …, r, and the subbundles L^i are measurable and f-invariant, i.e., for ρ-almost all x with probability one

    r(f^t x) = r(x),   β_i(f^t x) = β_i(x),   Df^t L_x^i = L_{f^t x}^i.                    (2.35)
Proof. Theorem 2.1 follows from Theorem 1.2 together with (2.24) in the following way. First, we obtain the result for each ergodic measure ρ and then apply the ergodic decomposition (see Appendix). The invariance properties (2.35) follow from the application of Theorem 1.2 to the compositions f^{tn}, n = 1,2, …, which give f^t-invariant functions r_t, β_i^t and subbundles L_t^i. But in view of (2.24) these functions and subbundles do not depend on t, i.e., they coincide with r, β_i and L^i, which proves the assertion. ∎
Remark 2.1. If we would like to apply Theorem IV.3.2 to stochastic flows in order to check whether the maximal characteristic exponent is positive or not, it would be important to know that with probability one |det Df_x^t| = 1. This means that the stochastic flow f^t preserves the Riemannian volume on M. One can see that this will be the case if all vector fields v_i, i = 0, …, m, generate deterministic flows preserving the Riemannian volume. This becomes clear when one constructs solutions of stochastic differential equations by means of successive approximations. Hence if all v_i are divergence free, i.e., the corresponding Lie derivatives of the volume element are zero, then |det Df_x^t| = 1 with probability 1 (cf. Kunita [30], Example 5.4). Using Theorem IV.3.3 one can formulate some conditions which assure the non-existence of a measure ν satisfying (IV.3.4) and so the positivity of β₀(ρ). These conditions can be given in terms of the Lie algebras generated by the vector fields v_i and their derivatives, in the spirit of Hörmander's theorem on hypoelliptic operators (see Ikeda and Watanabe [22], Section 8 of Ch. V).
At the present time there exists a rather big bibliography concerning stochastic flows. The reader can find corresponding references in Kunita [30]. Almost all of these works deal with the differential geometric aspect of the theory. Still there is growing interest in applications of characteristic exponents to stochastic flows. We shall consider here the following example producing different characteristic exponents for two stochastic flows generated by diffusion processes with the same generator.
Example 2.1 (Carverhill, Chappel and Elworthy [11]). Consider stochastic flows f^t and f̂^t solving the following Ito integral equations:

    f^t x = x + ∫₀^t b(f^s x) ds + a w_t^1                    (2.36)

and

    f̂^t x = x + ∫₀^t b(f̂^s x) ds − a ∫₀^t sin f̂^s x dw_s^1 + a ∫₀^t cos f̂^s x dw_s^2                    (2.37)

where a is a constant, b is a periodic smooth non-vanishing function and w_t^1, w_t^2 are independent one-dimensional Wiener processes.

Since all coefficients in (2.36) and (2.37) are periodic, we may view f^t and f̂^t as stochastic flows on the unit circle S¹. It is easy to see that the diffusion processes x_t = f^t x and y_t = f̂^t y have the same generator A = (1/2) a² d²/dx² + b(x) d/dx, since the diffusion coefficient of (2.36) is σ(x) = (a, 0) while that of (2.37) is σ̂(x) = (−a sin x, a cos x), so that σσ*(x) = σ̂σ̂*(x) = a². According to (2.16), if v_t(x) = (d/dx) f^t x then

    v_t(x) = v₀(x) exp ∫₀^t (db/dx)(f^s x) ds.                    (2.38)
Therefore the characteristic exponent corresponding to an invariant measure ρ is given by

    β(ρ) = lim_{t→∞} (1/t) log( v₀ exp ∫₀^t (db/dx)(f^s x) ds )                    (2.39)
         = lim_{t→∞} (1/t) ∫₀^t (db/dx)(f^s x) ds = ∫ (db/dx)(x) dρ(x)   a.s.

by the ergodic theorem. It is well known that the invariant measure ρ in this situation is unique and its density q solves the equation (1/2) a² d²q/dx² − d(bq)/dx = 0. The explicit form of q will not be important for us here.
Next, if v̂_t(x) = (d/dx) f̂^t x then

    dv̂_t(x) = ( (db/dx)(f̂^t x) dt − a cos f̂^t x dw_t^1 − a sin f̂^t x dw_t^2 ) v̂_t(x)

and so by Ito's formula

    log v̂_t(x) = log v̂₀(x) + ∫₀^t (db/dx)(f̂^s x) ds − a M_t − (1/2) a² t.

Thus the new exponent β̂(ρ) is given by

    β̂(ρ) = ∫ (db/dx)(x) dρ(x) − (1/2) a² − lim_{t→∞} (a/t) M_t

where M_t = ∫₀^t cos f̂^s x dw_s^1 + ∫₀^t sin f̂^s x dw_s^2 is a Wiener process, and so lim_{t→∞} (1/t) M_t = 0 a.s. Finally, we get β̂(ρ) = β(ρ) − (1/2) a².
This example shows that the use of characteristic exponents in
the theory of diffusion processes must be restricted to the cases
where the noise can be introduced in a natural uniquely specified
way.
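As a numerical companion to this example (an illustration with an assumed drift b, not from the text), one can compute β(ρ) = ∫ b'(x) q(x) dx from the explicit stationary density of dx = b(x)dt + a dw on the circle, and then read off β̂(ρ) = β(ρ) − a²/2:

```python
import numpy as np

a = 1.0
b = lambda x: 2.0 + np.sin(x)       # assumed periodic, non-vanishing drift
db = lambda x: np.cos(x)

N = 4000
dx = 2 * np.pi / N
x = np.arange(N) * dx
# Antiderivative B of b on a doubled grid [0, 4*pi) so the tail integral
# over [y, y + 2*pi) below is easy to form.
xx = np.arange(2 * N) * dx
B = np.concatenate(([0.0], np.cumsum(b(xx)) * dx))[:-1]
w = np.exp(-2.0 * B / a**2)
# Constant-flux stationary solution of the Fokker-Planck equation:
# q(y) is proportional to e^{2B(y)/a^2} * int_y^{y+2pi} e^{-2B(z)/a^2} dz.
T = np.array([w[k:k + N].sum() * dx for k in range(N)])
q = np.exp(2.0 * B[:N] / a**2) * T
q /= q.sum() * dx                    # normalize to a probability density

beta = (db(x) * q).sum() * dx        # beta(rho) = int b'(x) q(x) dx
beta_hat = beta - 0.5 * a**2         # the example's conclusion for the second flow
```

The exponent gap a²/2 depends only on how the noise enters, not on the generator — which is the point of the example.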
In conclusion, we must mention the results of Baxendale [4] and Carverhill and Elworthy [12], who studied characteristic exponents for stochastic flows generated by Brownian motions on hyperbolic spaces. These involve more extensively the differential geometric technique of stochastic analysis, which lies outside of the framework of this book. Among the results in other directions we shall mention Kunita's [30] study of supports of stochastic flows, in which he finds minimal groups of diffeomorphisms where all f^t are contained. This topic is connected with our Section 1.1.
Remark 2.2. If the vector fields v_i are of C^{m+3}-class then the stochastic flow f^t is of C^{m+1}-class and we can prove an inequality similar to (2.20) concerning the derivatives of f^t up to the (m+1)-th order. Then by Sobolev's embedding theorem (see Adams [2]) one can see that

    E sup_x ( ‖Df^t‖_x + ‖(Df^t)^{−1}‖_x ) < ∞.

In particular, this together with Theorem II.2.4 implies that the topological entropy of f^t is finite, and so by Theorem II.2.5 any metric entropy is finite, as well.
Remark 2.3. It is clear that (2.21) and (2.24) together with Theorem 1.1 imply the corresponding version of Oseledec's multiplicative ergodic theorem for stochastic flows generated by stochastic differential equations. Since all characteristic exponents are finite in this case, then by Theorem 1.4 all metric entropies will be finite, as well.
Appendix
A.1. Ergodic decompositions.
In this section we suppose that M is a Borel subset of a Polish space and P(x,·) is a family of transition probabilities of a Markov chain on M, i.e., P(x,·) is a Borel probability measure for each fixed x ∈ M and P(x,Γ) is a Borel function of x ∈ M for any given Γ from the Borel σ-field B(M). Next we define the transition operator P and its adjoint P* in the same way as in (I.2.8) and (I.2.9). Again, a measure η ∈ P(M) is called P*-invariant if P*η = η. Furthermore, we shall say that a Borel subset A ⊂ M is P-invariant if

    Pχ_A(x) = 1 for any x ∈ A,                    (1.1)

i.e., P(x,A) = 1 provided x ∈ A. To connect this definition with the notion of (P,ρ)-invariant sets introduced in Section 1.2 we shall prove
Lemma 1.1.

(i) If A is P-invariant then A is (P,ρ)-invariant for any P*-invariant measure ρ ∈ P(M);

(ii) Suppose that ρ ∈ P(M) is P*-invariant and B is a (P,ρ)-invariant subset of M. Then there exists a P-invariant subset B̃ ⊂ M such that ρ(B̃ Δ B) = 0, where Δ denotes the symmetric difference.

Proof. If ρ is P*-invariant then ∫ Pχ_A dρ = ∫ χ_A dρ = ρ(A), which together with (1.1) implies (i). Next, let B ⊂ M be (P,ρ)-
invariant, i.e.,

    Pχ_B = χ_B   ρ-a.s.                    (1.2)

Then there is a Borel set C ⊃ B such that ρ(C \ B) = 0. Since ρ is P*-invariant, then by (1.2),

    ρ(B) = ρ(C) = ∫_M P(x,C) dρ(x) ≥ ∫_M P(x,B) dρ(x) ≥ ∫_B P(x,B) dρ(x) = ρ(B).

Hence

    P(x,C) = P(x,B)   ρ-a.s.                    (1.3)

and so

    Pχ_C = χ_C   ρ-a.s.                    (1.4)

Now define inductively C₀ = C and C_{i+1} = {x ∈ C_i : P(x,C_i) = 1}, i = 0,1,2,…. Since the P(x,C_i) are Borel functions of x, all C_i are Borel sets. Besides, C₀ ⊃ C₁ ⊃ C₂ ⊃ ⋯ and by (1.4), ρ(C₀ \ C₁) = 0. But then

    ρ(C₁) = ∫_M P(x,C₁) dρ(x) = ∫_{C₁} P(x,C₁) dρ(x).

This together with (1.4) gives Pχ_{C₁} = χ_{C₁} ρ-a.s. and ρ(C₁ \ C₂) = 0. Repeating this argument we obtain that
    Pχ_{C_i} = χ_{C_i}   ρ-a.s.   and   ρ(C_i \ C_{i+1}) = 0   for all i = 0,1,2,….                    (1.5)

Finally, B̃ = ∩_{i≥0} C_i satisfies the conditions of (ii). Indeed, if x ∈ B̃ then P(x,C_i) = 1 for all i = 0,1,2,…. Since P(x,·) is a measure, then also P(x,B̃) = 1 and so Pχ_{B̃} ≥ χ_{B̃}. Besides, (1.5) implies ρ(C₀ \ B̃) = 0, which concludes the proof. ∎
Remark 1.1. The collection 𝒜 of all P-invariant sets does not form a σ-field since not every A ∈ 𝒜 has its complement belonging to 𝒜. Still, by Lemma 1.1, given any P*-invariant measure η, the completion 𝒜_η of 𝒜 coincides with the family of (P,η)-invariant sets and so it forms a σ-field. On the other hand, if we add to 𝒜 the complements of all its sets then the new collection 𝒜̃ will already be a σ-field.
Next, we shall call a P*-invariant measure ρ ergodic if ρ(A) = 0 or 1 for any A ∈ 𝒜. By Lemma 1.1 it is easy to see that this definition coincides with the definition given in Section 1.2. Let 𝔐 be the space of all P*-invariant probability measures. We shall introduce a measurable structure on 𝔐 by saying that a function G(η) ≡ ∫ g dη on 𝔐 is measurable provided g is a function on M measurable with respect to the completions of the Borel σ-field for any P*-invariant probability measure. The main result of this section is the following (cf. Rohlin [40] for deterministic dynamical systems).
Theorem 1.1. The set 𝔐_e of all ergodic measures is a measurable subset of 𝔐 and each measure η from 𝔐 can be uniquely represented as an integral

    η = ∫_{𝔐_e} ρ dν_η(ρ),                    (1.6)
i.e.,

    η(Γ) = ∫_{𝔐_e} ρ(Γ) dν_η(ρ)                    (1.7)

for any Borel Γ ⊂ M, where ν_η is a probability measure on 𝔐 concentrated on 𝔐_e.
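The decomposition can be seen concretely in the finite-state case (a toy illustration, not from the text): for a reducible chain with two closed classes, the ergodic P*-invariant measures are the stationary distributions of the classes, and every invariant measure is a unique convex combination of them.

```python
import numpy as np

# A 4-state chain with two closed (recurrent) classes {0,1} and {2,3}.
P = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.2, 0.8, 0.0, 0.0],
              [0.0, 0.0, 0.9, 0.1],
              [0.0, 0.0, 0.3, 0.7]])

def stationary(Q):
    # Left eigenvector of Q for eigenvalue 1, normalized to a probability vector.
    vals, vecs = np.linalg.eig(Q.T)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()

pi1 = np.concatenate([stationary(P[:2, :2]), [0.0, 0.0]])  # ergodic on {0,1}
pi2 = np.concatenate([[0.0, 0.0], stationary(P[2:, 2:])])  # ergodic on {2,3}

alpha = 0.3
eta = alpha * pi1 + (1 - alpha) * pi2   # a non-ergodic invariant measure
```

Here ν_η is the two-point measure assigning weight alpha to the first ergodic component and 1 − alpha to the second, and the weight can be recovered from η itself.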
The proof of this theorem proceeds in the same way as in Kifer and Pirogov [24] by constructing certain conditional probabilities. We shall start with
Lemma 1.2 (Dynkin [14]). There exists a function q on M such that the family of functions W_q = {1, q, q², …} has the following properties:

(i) Suppose that a set of functions 𝔇 contains W_q and satisfies the conditions:
a) if g₁, g₂ ∈ 𝔇 then for any numbers c₁ and c₂, c₁g₁ + c₂g₂ ∈ 𝔇;
b) if a sequence f_n ∈ 𝔇 is uniformly bounded and pointwise converges to g, then g ∈ 𝔇.
Then 𝔇 contains all bounded Borel functions.

(ii) W_q separates probability measures on M, i.e., for any two different measures η₁, η₂ ∈ P(M) there exists an integer k such that η₁(q^k) ≠ η₂(q^k).

(iii) If for η_n ∈ P(M) the sequence η_n(g) converges for each g ∈ W_q, then there exists a probability measure η such that η_n(g) → η(g) when g ∈ W_q.
Proof. According to §36-37 of Kuratowski [31] any Borel subset M of a Polish space is Borel measurably isomorphic to a closed subset of the interval I = {x : 0 ≤ x ≤ 1} considered with its Borel σ-field B(I). This isomorphism is given by a function
q : M → I. We shall introduce convergence of points by x_n →_q x if q(x_n) → q(x), and of measures by μ_n →_q μ if ∫ q^k dμ_n → ∫ q^k dμ for all k = 0,1,2,….

With respect to this q-topology the spaces M and P(M) are compact, since if we identify x with q(x) then M is transformed into a compact subset of I and P(M) is transformed into a compact subset of P(I) considered with the topology of weak convergence.
To prove that W_q = {1, q, q², …} satisfies (i), notice that if 𝔇 satisfies a) then 𝔇 must contain all polynomials in q, and so by b) 𝔇 contains all bounded Borel functions.

If η₁, η₂ ∈ P(M) and ∫ q^k dη₁ = ∫ q^k dη₂ for all k = 0,1,2,…, then the set 𝔇 of bounded Borel functions g such that ∫ g dη₁ = ∫ g dη₂ satisfies a) and b), and so by (i) it contains all bounded Borel functions, proving (ii).

To prove (iii) remark that any measure η which is a limit point of a sequence η_n in the q-topology satisfies

    ∫ q^k dη = lim_{n→∞} ∫ q^k dη_n

for all k = 0,1,2,…, and so by (ii) it follows that this sequence has a unique limit point, i.e., it converges, proving (iii). ∎
Proof of Theorem 1.1. By a partial case of the Chacon-Ornstein theorem due to E. Hopf (see Neveu [37], Proposition V.6.3, or Rosenblatt [41], Corollary 2 of Section 2 in Ch. IV), if ∫ |g| dη < ∞ then η-almost surely the limit

    ĝ(x) = lim_{n→∞} (1/n) Σ_{k=0}^{n−1} P^k g(x) = E_η(g | 𝒜_η)(x)                    (1.8)

exists, where η ∈ P(M) is P*-invariant and 𝒜_η was defined in Remark 1.1. Here E_η(g | 𝒜_η) denotes the conditional expectation on the probability space (M, B_η, η) with respect to the σ-field 𝒜_η, and B_η is the completion of the Borel σ-field with respect to η. This means that E_η(g | 𝒜_η) is an 𝒜_η-measurable function satisfying

    ∫_A E_η(g | 𝒜_η) dη = ∫_A g dη                    (1.9)

for any A ∈ 𝒜_η.
Let M̃ be the set of those x for which the limit (1.8) exists for all functions from the family W_q constructed in Lemma 1.2. Then it follows that M̃ ∈ 𝒜_η and η(M̃) = 1 for any P*-invariant η ∈ P(M). If δ_x denotes the unit mass at x then we can write

    ĝ(x) = lim_{n→∞} (1/n) Σ_{k=0}^{n−1} ∫ g d(P^{*k} δ_x)

provided g ∈ W_q and x ∈ M̃. Hence by (ii) and (iii) of Lemma 1.2 there exists a unique measure ρ^x ∈ P(M) such that if x ∈ M̃ then

    ĝ(x) = ∫ g dρ^x   for any g ∈ W_q.                    (1.10)

Therefore for any 𝒜_η-measurable function h (i.e., all sets {x : h(x) < a} belong to 𝒜_η) one has

    ∫ h(x) ( ∫ g dρ^x ) dη(x) = ∫ h g dη                    (1.11)

provided η ∈ P(M) is P*-invariant. From (i) of Lemma 1.2 it follows that (1.11) holds for any bounded Borel function g. Since
ĝ is 𝒜_η-measurable, then ∫ g dρ^x as a function of x is 𝒜_η-measurable for each g ∈ W_q, and so by (i) of Lemma 1.2 it is 𝒜_η-measurable for any bounded Borel function g. This together with (1.11) gives

    E_η(g | 𝒜_η)(x) = ∫ g dρ^x   η-a.s.                    (1.12)

It follows from (1.8) and (1.10) that for any g ∈ W_q and x ∈ M̃,

    ∫ g dρ^x = ∫ Pg dρ^x = ∫ g d(P*ρ^x)

and so by (ii) of Lemma 1.2,

    P*ρ^x = ρ^x   for all x ∈ M̃.                    (1.13)
Next, we shall proceed in the same way as in Dynkin [14], Theorem 2.1. Let η ∈ 𝔐_e; then for any bounded Borel function g,

    E_η(g | 𝒜_η) = ∫ g dη   η-a.s.

and so by (1.12),

    ∫ g dρ^x = ∫ g dη   η-a.s.                    (1.14)

This is true, in particular, for all g ∈ W_q, which together with (ii) of Lemma 1.2 implies that

    η{x : ρ^x = η} = 1,                    (1.15)

i.e., in this case η coincides with one of the measures ρ^x. In other words,
    𝔐_e ⊂ {ρ^x : x ∈ M̃}.                    (1.16)

On the other hand, if (1.15) is true then by Lemma 1.1 for any P-invariant set A,

    η(A) = ρ^x(A) = 0 or 1   η-a.s.                    (1.17)

and so η is ergodic. Thus (1.15) is equivalent to the ergodicity of η. But {x : ρ^x = η} = {x : ∫ g dρ^x = ∫ g dη for all g ∈ W_q}. Hence we can say that η is ergodic if and only if

    ∫ g dρ^x = ∫ g dη   for all g ∈ W_q,   η-a.s.                    (1.18)

This is equivalent to

    Φ_g(η) ≡ ∫ | ∫ g dρ^x − ∫ g dη | dη(x) = 0                    (1.19)

for all g ∈ W_q, which implies that 𝔐_e is a measurable subset of 𝔐.
Next we are going to show that the measures ρ^x are ergodic η-a.s. for any P*-invariant measure η. Indeed, by (1.12) applied to the P*-invariant measures ρ^x,

    Φ_g(ρ^x) = ∫ | ∫ g dρ^y − ∫ g dρ^x | dρ^x(y).                    (1.20)

Taking the integral in (1.20) with respect to a P*-invariant measure η we obtain, in view of (1.12), that

    ∫ Φ_g(ρ^x) dη(x) = 0.

Since Φ_g(ρ^x) ≥ 0 it follows that
    Φ_g(ρ^x) = 0   for η-almost all x,                    (1.21)

which implies, as we have seen above, the ergodicity of those ρ^x which satisfy (1.21).

Now by (1.12),

    ∫ g dη = ∫ ( ∫ g dρ^x ) dη(x)                    (1.22)

and putting ν_η(G) = η{x : ρ^x ∈ G} one obtains the desired representation (1.6). To get the uniqueness, notice that for any measurable subset G of 𝔐_e,

    η{x : ρ^x ∈ G} = ∫_{𝔐_e} ρ{x : ρ^x ∈ G} dν_η(ρ)

since ρ{x : ρ^x ∈ G} = χ_G(ρ) provided ρ ∈ 𝔐_e. This completes the proof of Theorem 1.1. ∎
Remark 1.1. The map φ : M̃ → 𝔐_e acting by φ(x) = ρ^x determines also a measurable partition (see Rohlin [40]) of M̃ into the pre-images φ^{−1}(ρ), which are called ergodic components.

Remark 1.2. In the circumstances of Section 1.2 we may need an ergodic decomposition of η × m. But if η has an ergodic decomposition η = ∫ ρ dν(ρ), then η × m has the ergodic decomposition η × m = ∫ ρ × m dν(ρ).
A.2. Subadditive ergodic theorem.
We shall prove in this section Kingman's subadditive ergodic theorem [29] under the circumstances when an ergodic decomposition exists, which is the case of main interest in this book. A proof for the general case the reader can find in Kingman's original paper [29]. A shorter proof was given by Derriennic [13] (see also Appendix A of Ruelle [43]).

We shall start with the ergodic case, where we shall follow Ledrappier [33]. Suppose that f is a measure-preserving transformation of a probability space (M, ℬ, μ), i.e., μ(f^{−1}E) = μ(E) for any E ∈ ℬ, where ℬ is a σ-field of measurable subsets of M and μ(M) = 1. An f-invariant measure μ is called ergodic if any f-invariant set A ∈ ℬ, i.e., f^{−1}A = A, satisfies μ(A) = 0 or 1.
Theorem 2.1. Let a sequence of functions g₁, g₂, … satisfy g₁⁺ ∈ L¹(M,μ) and g_{n+m} ≤ g_n + g_m ∘ f^n. If μ is ergodic then μ-a.s. there exists the limit

    lim_{n→∞} (1/n) g_n = c ≡ inf_n (1/n) ∫ g_n dμ.                    (2.1)
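The deterministic backbone of (2.1) is Fekete's lemma: for a subadditive sequence a_{n+m} ≤ a_n + a_m, the limit of a_n/n exists and equals inf_n a_n/n. A quick numerical illustration (with an assumed sequence, not from the text):

```python
import math

# a_n = c*n + sqrt(n) is subadditive because sqrt is subadditive,
# and a_n / n decreases to c.
c = 0.5
a = lambda n: c * n + math.sqrt(n)

# Spot-check subadditivity on a range of index pairs.
pairs = [(n, m) for n in range(1, 50) for m in range(1, 50)]
subadditive = all(a(n + m) <= a(n) + a(m) + 1e-12 for n, m in pairs)

ratio_small = a(100) / 100        # still visibly above the limit
ratio_large = a(10**6) / 10**6    # close to inf_n a_n / n = c
```

In the theorem the same mechanism is applied to the integrals ∫ g_n dμ, which form exactly such a subadditive sequence.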
Proof. Notice first that the sequence ∫ g_n dμ is subadditive, and so, by the well known argument which we have demonstrated already in the proof of Theorem II.1.1,

    lim_{n→∞} (1/n) ∫ g_n dμ = inf_n (1/n) ∫ g_n dμ = c.                    (2.2)

Introduce

    ḡ(x) = limsup_{n→∞} (1/n) g_n(x)   and   g̲(x) = liminf_{n→∞} (1/n) g_n(x).
Then ḡ and g̲ are f-invariant. Indeed, in view of the subadditivity, g_{n+1} ≤ g₁ + g_n ∘ f, whence

    ḡ ∘ f = limsup_{n→∞} (1/n)(g_n ∘ f) ≥ limsup_{n→∞} (1/n)(g_{n+1} − g₁) = ḡ   μ-a.s.,

and similarly, g̲ ∘ f ≥ g̲ μ-a.s. Since μ is f-invariant, these inequalities can hold only with equality μ-a.s., and the f-invariance of ḡ and g̲ follows. But μ is ergodic, and so ḡ and g̲ are μ-a.s. constants.
Next, we shall show that

    g̲ ≥ c.                    (2.3)

Indeed, assume first that g̲ > −∞ and take an arbitrary ε > 0. One can choose a measurable function n(x) such that μ-a.s.

    g_{n(x)}(x) ≤ n(x)(g̲ + ε).                    (2.4)

For N > 1 put A_N = {x : n(x) ≥ N} ∪ {x : (2.4) is not true} and define

    g̃(x) = g̲   if x ∈ M \ A_N;   g̃(x) = max(g̲, g₁(x))   if x ∈ A_N

and
    ñ(x) = n(x)   if x ∈ M \ A_N;   ñ(x) = 1   if x ∈ A_N.

Then by the subadditivity condition,

    g_{ñ(x)}(x) ≤ ñ(x)(g̃(x) + ε)   μ-a.s.                    (2.5)

Define by induction

    n_{j+1}(x) = n_j(x) + ñ(f^{n_j(x)} x),                    (2.6)

where n₀(x) = 0 and n₁(x) = ñ(x). For any integer P > N put

    j_P(x) = max{ j : n_j(x) ≤ P − N }.

By the subadditivity,

    g_P(x) ≤ Σ_{0≤j≤j_P(x)−1} g_{ñ(f^{n_j(x)} x)}(f^{n_j(x)} x) + Σ_{n_{j_P(x)}(x)≤j≤P−1} g₁(f^j x).                    (2.7)

Summing (2.5) taken at each f^{n_j(x)} x we have

    g_P(x) ≤ Σ_{0≤j≤n_{j_P(x)}(x)−1} (g̃ + ε)(f^j x) + Σ_{n_{j_P(x)}(x)≤j≤P−1} g₁(f^j x).

Since n_{j_P(x)}(x) ≥ P − N, this implies
    g_P(x) ≤ Σ_{0≤j≤P−N−1} (g̃ + ε)(f^j x) + Σ_{P−N≤j≤P−1} (g̃⁺ + g₁⁺ + ε)(f^j x)   μ-a.s.                    (2.8)

Therefore

    (1/P) ∫ g_P dμ ≤ ∫ g̃ dμ + ε + (N/P) ∫ (g̃⁺ + g₁⁺ + ε) dμ.                    (2.9)

Letting P → ∞, then N → ∞ and, finally, ε → 0, we derive (2.3) provided g̲ > −∞. If g̲ = −∞ then for each K one can find a measurable integer-valued k(x) > 0 such that

    g_{k(x)}(x) ≤ k(x) K.                    (2.10)

The same proof as above with (2.10) in place of (2.4) enables us to show that c ≤ K. This is true for any K, and so c = −∞ ≤ g̲, proving (2.3) completely.
We shall need also the following inequality:

    ḡ ≤ ∫ g₁ dμ.                    (2.11)

The proof of (2.11) is similar to the arguments above. Suppose that ḡ < ∞ and take an arbitrary ε > 0. One can find a measurable function n(x) such that μ-a.s.

    n(x) ḡ ≤ g_{n(x)}(x) + ε n(x).                    (2.12)

By the subadditivity,
    n(x) ḡ ≤ Σ_{i=0}^{n(x)−1} (g₁ + ε)(f^i x).                    (2.13)

Put again A_N = {x : n(x) ≥ N} ∪ {x : (2.13) is not true} and

    g̃(x) = g₁(x)   if x ∈ M \ A_N;   g̃(x) = max(g₁(x), ḡ)   if x ∈ A_N

and

    ñ(x) = n(x)   if x ∈ M \ A_N;   ñ(x) = 1   if x ∈ A_N.

Then

    ñ(x) ḡ ≤ Σ_{0≤i≤ñ(x)−1} (g̃ + ε)(f^i x).                    (2.14)
In the same way as in the proof of (2.8) we can derive from (2.14), for any integer P > N, that

    P ḡ ≤ Σ_{0≤j≤P−N−1} (g̃ + ε)(f^j x) + Σ_{P−N≤j≤P−1} (g̃⁺ + ḡ⁺ + ε)(f^j x).                    (2.15)

Letting P → ∞ one obtains ḡ ≤ ∫ g̃ dμ + ε. When N → ∞, ∫ g̃ dμ tends to ∫ g₁ dμ, which gives (2.11) after taking ε → 0, for the case ḡ < ∞. If ḡ = ∞ then

    k(x) K ≤ g_{k(x)}(x) + ε k(x)                    (2.16)

for some K > ∫ g₁ dμ and a measurable function k(x). The same proof as above gives K ≤ ∫ g₁ dμ for any K, and so ∫ g₁ dμ = ∞, which
is impossible.
Now we can assert that for any integer j > 0,

    ḡ ≤ (1/j) ∫ g_j dμ,                    (2.17)

which together with (2.3) proves (2.1). Indeed, if (2.17) is true then

    ḡ ≤ lim_{j→∞} (1/j) ∫ g_j dμ = c.                    (2.18)

Since ḡ ≥ g̲, then by (2.3) and (2.18),

    ḡ = g̲ = c,                    (2.19)

proving (2.1).
It remains to establish (2.17). Put ḡ_j = limsup_{n→∞} (1/n) g_{jn}. It is easy to see, in the same way as at the beginning of the proof, that ḡ_j is a constant μ-a.s. Moreover,

    ḡ_j = j ḡ.                    (2.20)

Indeed, ḡ_j ≤ j ḡ since in the definition of ḡ_j the limsup is taken along a subsequence. On the other hand, by the subadditivity,

    g_n ≤ g_{kj} + Σ_{0≤i≤j−1} g₁ ∘ f^{kj+i}                    (2.21)

where k = [n/j] is the integral part of n/j.
Notice that

    lim_{n→∞} (1/n) g₁⁺(f^n x) = 0   μ-a.s.                    (2.22)

Indeed, for any δ > 0,

    Σ_{n≥1} μ{ g₁⁺ ∘ f^n ≥ δn } = Σ_{n≥1} μ{ g₁⁺ ≥ δn } ≤ δ^{−1} ∫ g₁⁺ dμ < ∞                    (2.23)

since μ is f-invariant. Thus by the Borel-Cantelli lemma (see, for instance, Neveu [37]) the limsup in (2.22) is less than or equal to δ μ-a.s. Since δ is arbitrary, we obtain (2.22).

Now (2.22) applied to (2.21) gives ḡ ≤ (1/j) ḡ_j, which together with the inequality in the opposite direction proved earlier gives (2.20). Next, we can use (2.11) with the sequence g_{jn} and the transformation f^j in place of g_n and f, respectively, to obtain ḡ_j ≤ ∫ g_j dμ, which together with (2.20) gives (2.17). As we have explained above, this completes the proof of Theorem 2.1. ∎
Next, we shall consider the non-ergodic case.

Corollary 2.1. Let, in the conditions of Theorem 2.1, the measure μ be not necessarily ergodic, but representable as an integral

    μ = ∫ ρ dν(ρ)                    (2.24)

over the space of f-invariant ergodic measures. Then
the limit

    g̃ = lim_{n→∞} (1/n) g_n   μ-a.s.                    (2.25)

exists and g̃ ∘ f = g̃ μ-a.s.

Proof. Let M̂ = {x : the limit (2.25) does not exist}. Then by Theorem 2.1, ρ(M̂) = 0 for any ergodic ρ, and so by (2.24), μ(M̂) = 0. This means that the limit g̃ exists μ-a.s. Consider h = g̃ − g̃ ∘ f. As we have seen at the beginning of the proof of Theorem 2.1, g̃ is f-invariant ρ-a.s. with respect to any f-invariant ergodic ρ, and so h = 0 ρ-a.s. Let M̃ = {x : h ≠ 0}; then by (2.24), μ(M̃) = ∫ ρ(M̃) dν(ρ) = 0, and so h = 0 μ-a.s. Hence g̃ ∘ f = g̃ μ-a.s., concluding the proof. ∎
Remark 2.1. To be sure that an ergodic decomposition exists, we can employ Theorem 1.1 of the Appendix for the case when P(x,·) = δ_{fx}, where δ_y is the Dirac measure at y. The conditions of Theorem 1.1 will be satisfied if the space under consideration is a Borel subset of a Polish space.

Remark 2.2. According to Remark 1.2, in order to apply Corollary 2.1 to the transformation T from Section 1.2, we only have to be sure that a certain P*-invariant measure η on M has an ergodic decomposition. If M is a Borel subset of a Polish space then by Theorem 1.1 this will be the case.
References
[1] L. M. Abramov and V. A. Rohlin, The entropy of a skew product of measure preserving transformations, A.M.S. Transl. Ser. 2, 48 (1966), 255-265.
[2] R. Adams, Sobolev spaces, Academic Press, New York, 1975.
[3] D. V. Anosov and Ya. G. Sinai, Some smooth ergodic systems, Russian Math. Surv., 22 N5 (1967), 103-167.
[4] P. H. Baxendale, Asymptotic behaviour of stochastic flows of diffeomorphisms: two case studies, University of Aberdeen, Preprint.
[5] P. H. Baxendale, The Lyapunov spectrum of a stochastic flow of diffeomorphisms, University of Aberdeen, Preprint.
[6] R. M. Blumenthal and H. K. Corson, On continuous collections of measures, Proc. 6th Berkeley Symp. on Math. Stat. and Prob., v. 2, 1972, 33-40.
[7] R. M. Blumenthal and H. K. Corson, On continuous collections of measures, Ann. Inst. Fourier, Grenoble 20 (1970), 193-199.
[8] N. Bourbaki, Éléments de mathématique, Livre VI, Intégration, ch. 6, Intégration vectorielle, Hermann, Paris, 1959.
[9] A. Carverhill, Flows of stochastic dynamical systems: ergodic theory, Stochastics, 14 (1985), 273-317.
[10] A. Carverhill, A "Markovian" approach to the multiplicative ergodic theorem for nonlinear stochastic dynamical systems, Stochastics, to appear.
[11] A. P. Carverhill, M. J. Chappel and K. D. Elworthy, Characteristic exponents for stochastic flows, University of Warwick, Preprint.
[12] A. P. Carverhill and K. D. Elworthy, Lyapunov exponents for a stochastic analogue of the geodesic flow, University of Warwick, Preprint.
[13] Y. Derriennic, Sur le théorème ergodique sous-additif, C. R. Acad. Sc. Paris, 281 (1975), A985-A988.
[14] E. B. Dynkin, Initial and final behaviour of trajectories of Markov processes, Russian Math. Surveys, 26 N4 (1971), 165-185.
[15] A. Friedman, Stochastic differential equations and applications, vols. 1, 2, Academic Press, New York, 1975.
[16] H. Furstenberg, Noncommuting random products, Trans. Amer. Math. Soc., 108 (1963), 377-428.
[17] H. Furstenberg and Y. Kifer, Random matrix products and measures on projective spaces, Israel J. Math., 46 (1983), 12-32.
[18] Y. Guivarc'h, Marches aléatoires à pas markovien, C. R. Acad. Sc. Paris, 289 (1979), 211-213.
[19] E. Hewitt and K. Stromberg, Real and abstract analysis, Springer-Verlag, New York, 1965.
[20] M. Hirsch, Differential topology, Springer-Verlag, New York, 1976.
[21] K. Ichihara and H. Kunita, A classification of the second order degenerate elliptic operators and its probabilistic characterization, Z. Wahrscheinlichkeitstheorie Verw. Gebiete, 30 (1974), 235-254.
[22] N. Ikeda and S. Watanabe, Stochastic differential equations and diffusion processes, North-Holland/Kodansha, Amsterdam, 1981.
[23] S. Kakutani, Random ergodic theorems and Markoff processes with a stable distribution, Proc. 2nd Berkeley Symp., 1951, 247-261.
[24] Yu. I. Kifer and S. A. Pirogov, The decomposition of quasi-invariant meas
ures into ergodic measures, Russian Math. Surveys, 27 N 1 (1972), 79-83.
[25] Yu. Kifer, Perturbations of random matrix products, Z. Wahrscheinli
chkeitstheorie. Verw. Geb. 61 (1982), 83-95.
[26] Yu. Kifer and E. Slud, Perturbations of random matrix products in a reduci
ble case, Ergod. Th. & Dynam. Sys. (1982), 2, 367-382.
[27] Yu. Kifer, Characteristic exponents of dynamical systems in metric spaces,
Ergod. Th. & Dynam. Sys. 3, (1982), 119-127.
[28] Yu. Kifer, A Multiplicative ergodic theorem for random transformations,
Journal D'Analyse Mathemdtique (1985).
[29] J. F. C. Kingman, The ergodic theory of subadditive stochastic processes, J.
Royal Statist. Soc. 830 (1968), 499-510.
[301 H. Kunita, Stocbastic differential equations and stochastic flows of
ditIeomorphisms, Ecole d'Ete de Probabilitee de Saint-Flour XlI - 1982, Lec
ture Notes in Math. 1097, pp. 143-303. Springer-Verlag, 1984.
[31] K. Kuratowski, Topology, Vol. 1, Academic Press, New York, 1966.
[32] F. Ledrappier and P. Walters, A relativised variational principle for continu
ous transformations, J. London Math. Soc. (2), 16 (1977), 568-576.
[33] F. Ledrappier, Quelques proprietes des exposants caracteristiques, Ecole
d'Ete de Probabilitee de Saint-Flour XlI - 1982, Lecture Notes in Math. 1097,
-210-
pp. 305-396. Springer-Verlag. 1984.
[34] F. Ledrappier. Positivity of the exponent for stationary sequences of
matrices. Universite Paris VI. Preprint.
[35] R. Mane. A proof of Pesin's formula. Ergod. Th. & Dynam. Sys. (1981).1. 95-
102.
[36] N. Martin and J. England. Mathematical theory of entropy. Addison-Wesley.
Reading. Mass. 1981.
[37] J. Neveu. Mathematical foundations of the calculus of probability. Holden
Day. London. 1965.
[38] T. Ohno. Asymptotic behaviours of dynamical systems with random param
eters. Pub!. R.I.M.S. Kyoto Univ .. 19 (1983). 83-98.
[39J K. Petersen. Ergodic theory. Cambridge Univ. Press. Cambridge. 1983.
[40] V. A. Rohlin. Selected topics from the metric theory of dynamical systems.
Amer. Math. Soc. Trans!.. Ser. 2. 49 (1966). 171-240.
[41] M. Rosenblatt. Markov processes. Structure and asymptotic behavior.
Springer-Verlag. Berlin. 1971.
[42] G. Royer. Croissance exponentieUe de produits markoviens de matrices
aleatoires. Ann. I.H.P. 16 (1980). 49-62.
[43] D. Ruelle. Ergodic theory of differentiable dynamical systems. Publ. IHES
50 (1979). 275-306.
[44] S. M. Ulam and J. von Neumann. Random ergodic theorems. Bull. Amer.
Math. Soc. 51. N9. (1947). 660.
[45] A. D. Virtser. On product of random malrir..:es and operators. Theor. Prob.
App!. 24 (1979). 367-377.
[461 P. Walters. An introduction to ergodic theory. Springer-Verlag. New York.
1982.