
RANDOM VARIABLES AND

PROBABILITY DISTRIBUTIONS

BY

HARALD CRAMÉR
Former Chancellor of the Swedish Universities
and Professor in the University of Stockholm

CAMBRIDGE

AT THE UNIVERSITY PRESS

1962

Page 2: Random Variables and Probability H Cramer (CUP 1962 125s)

PUBLISHED BY

THE SYNDICS OF THE CAMBRIDGE UNIVERSITY PRESS

Bentley House, 200 Euston Road, London, N.W.1
American Branch: 32 East 57th Street, New York 22, N.Y.

West African Office: P.O. Box 33, Ibadan, Nigeria

First printed 1937

Second edition 1962

First printed at the University Press, Cambridge

Reprinted by offset-lithography by Bradford & Dickens, London, W.C.1

Page 3: Random Variables and Probability H Cramer (CUP 1962 125s)

CONTENTS

Preface to the First Edition

Preface to the Second Edition

Abbreviations

FIRST PART. PRINCIPLES

Chap. I. Introductory remarks

II. Axioms and preliminary theorems

SECOND PART. DISTRIBUTIONS IN R1

III. General properties. Mean values

IV. Characteristic functions

V. Addition of independent variables. Convergence "in probability". Special distributions

VI. The normal distribution and the central limit theorem

VII. Liapounoff's theorem. Asymptotic expansions

VIII. A class of stochastic processes

THIRD PART. DISTRIBUTIONS IN Rk

IX. General properties. Characteristic functions

X. The normal distribution and the central limit theorem

Bibliography

Some Recent Works on Mathematical Probability


PREFACE

The Mathematical Theory of Probability has lately become of growing importance owing to the great variety of its applications, and also to its purely mathematical interest. The subject of this tract is the development of the purely mathematical side of the theory, without any reference to the applications. The axiomatic foundations of the theory have been chosen in agreement with the theory given by A. Kolmogoroff in his work Grundbegriffe der Wahrscheinlichkeitsrechnung, to which I am greatly indebted. In accordance with this theory, the subject has been treated as a branch of the theory of completely additive set functions. The method principally used has been that of characteristic functions (or Fourier-Stieltjes transforms).

The limitation of space has made it necessary to restrict the programme somewhat severely. Thus in the first place it has proved necessary to consider exclusively probability distributions in spaces of a finite number of dimensions. With respect to the advanced part of the theory, I have found it convenient to confine myself almost entirely to problems connected with the so-called Central Limit Theorem for sums of independent variables, and with some of its generalizations and modifications in various directions. This limitation permits a certain uniformity of method, but obviously a great number of important and interesting problems will remain unmentioned.

My most sincere thanks are due to my friends W. Feller, O. Lundberg and H. Wold for valuable help with the preparation of this work. In particular the constant assistance and criticism of Dr Feller has been very helpful to me.

H. C.

Department of Mathematical Statistics
University of Stockholm
December 1936


PREFACE TO THE SECOND EDITION

This Tract has now been out of print for a number of years. Since there still seems to be some demand for it, the Syndics of the Cambridge University Press have judged it desirable to publish a new edition.

However, owing to the vigorous development of Mathematical Probability Theory since 1937, any attempt to bring the book up to date would have meant rewriting it completely, a task that would have been utterly beyond my possibilities under present conditions. Thus I have had to restrict myself in the main to a number of minor corrections, otherwise leaving the work, including the Bibliography, where it was in 1937.

Besides the minor corrections, most of which are concerned with questions of terminology, there are, in fact, only two major alterations. In the first place, a serious error in the statement and proof of Theorem 11 has been put right. Further, the contents of Chapter IV, § 4, which are fundamental for the theory of asymptotic expansions, etc., developed in Chapter VII, have been revised and simplified. This permits a new formulation of the important Lemma 4, on which the proofs of Theorems 24-26 are based. Finally a brief list of recent works on the subject in the English language has been added.

H. C.

University Chancellor's Office
Stockholm
March 1960


ABBREVIATIONS AND NOTATIONS

Symbol                      Signification
d.f.                        Distribution function
pr.f.                       Probability function
s.d.                        Standard deviation
E(X)                        Mean value (or mathematical expectation) of X
D(X)                        Standard deviation of X
c.f.                        Characteristic function
F(x) = F1(x) * F2(x)        F(x) = ∫_{−∞}^{∞} F1(x − t) dF2(t)
Convergence i.pr.           Convergence in probability
(F(x))^{n*}                 F(x) * F(x) * ... (n times)

The union or sum of any finite or enumerable sequence of sets S1, S2, ... is denoted by

S = S1 + S2 + ....

The intersection or product of the sets S1, S2, ... is denoted by

S = S1 S2 ....

The inclusion sign ⊂ is used in relations of the type S1 ⊂ S, indicating that S1 is a subset of S, and also in relations of the type x ⊂ S to express the fact that x is an element of the set S.


FIRST PART

PRINCIPLES

CHAPTER I

INTRODUCTORY REMARKS

1" In the most varied fields of practical and scientific experi­ene-e, cases ocour where certain observations or trials may berepeated a large number of times under similar circumstances.Our attention is then directed to a certain quantity, which mayassume different numerical values at successive observations.In many cases each observation yields not only one, but a certainnumber of quantities, say k, so that generally we may say thatthe result of each observation is a definite point X in a space ofIe dimensions (k ~ 1), while the result of the whole series of obser­vations is 8t sequence of points: Xl' XI' ....

Thus if we make a series of throws with a given number of dice, we may observe the sum of the points obtained at each throw. We are then concerned with a variable quantity, which may assume every integral value between m and 6m (both limits inclusive), where m is the number of dice. On the other hand, in a series of measurements of the state of some physical system, or of the size of certain organs in a number of individuals belonging to the same biological species, each observation furnishes a certain number of numerical values, i.e. a definite point in a space of a fixed number of dimensions.

In certain cases, the observed characteristic is only indirectly expressed as a number. Thus if, in a mortality investigation, we observe during one year a large number of persons, we may at each observation (i.e. for each person) note the number of deaths which take place during the year,


so that in this case the observed quantity assumes the value 0 or 1 according as the corresponding person is alive at the end of the year or not.

In a given class of observations, let R denote the set of points which are a priori possible positions of our variable point X, and let S be a sub-set of R. Further, let a series of n observations be made, and count the number ν of those observations where the following event takes place: the point X determined by the observation belongs to S. Then the ratio ν/n is called the frequency of that event or, as we may shortly put it, the frequency of the relation (or event) X ⊂ S. Obviously any such frequency always lies between 0 and 1, both limits inclusive. If S = S1 + S2, where S1 and S2 have no common point, and if ν1/n and ν2/n are the frequencies corresponding to S1 and S2, we obviously have ν = ν1 + ν2 and thus

(1)  ν/n = ν1/n + ν2/n.

When we are dealing with such frequencies, a certain peculiar kind of regularity very often presents itself. This regularity may be roughly described by saying that, for any given sub-set S, the frequency of the relation (or event) X ⊂ S tends to become more or less constant as n increases. In certain cases, such as e.g. cases of biological measurements, our observations may be regarded as samples from a very large or even infinite population, so that for indefinitely increasing n the frequency would ultimately reach an ideal value, characteristic of the total population.
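A minimal simulation sketch of this stabilization; the two-dice example and the set S below are illustrative assumptions, not taken from the text:

```python
# Frequencies nu/n of the event "the sum of m dice belongs to S" settle down as n grows.
import random

def dice_sum(m):
    return sum(random.randint(1, 6) for _ in range(m))

def frequency(n, m=2, S=frozenset({7, 11})):
    """Return nu/n, the frequency of the event 'dice_sum(m) belongs to S'."""
    nu = sum(1 for _ in range(n) if dice_sum(m) in S)
    return nu / n

random.seed(0)
for n in (100, 10_000, 1_000_000):
    print(n, frequency(n))   # tends towards P(S) = 8/36 for two dice
```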

It is thus suggested that in cases where the above-mentioned type of regularity appears, we should try to introduce a number P(S) to represent such an ideal value of the frequency ν/n, corresponding to the sub-set S. The number P(S) is then called the probability of the sub-set S, or of the event X ⊂ S. It follows from (1) that we should obviously choose P(S) such that

(2)  P(S1 + S2) = P(S1) + P(S2)

for any two sub-sets S1 and S2 of R which have no common point. Further, it is obvious that we should always have P(S) ≥ 0 and that for the particular set S = R we should have P(R) = 1.


The investigation of set functions of the type P(S) and their mutual relations is the object of the Mathematical Theory of Probability. This theory should be considered as a branch of Pure Mathematics, founded on an axiomatic basis, in the same sense as Geometry or Theoretical Mechanics.¹ Once the fundamental conceptions have been introduced and the axioms have been laid down (and in this procedure we are, of course, guided by empirical considerations), the whole body of the theory should be constructed by purely mathematical deductions from the axioms. The practical value of the theory will then have to be tested by experience, just in the same way as a theorem in euclidean geometry, which is intrinsically a purely mathematical proposition, obtains a practical value because experience shows that euclidean geometry really conforms with sufficient accuracy to a large group of empirical facts.

2. The axiomatic basis of a theory may, of course, always be constructed in many different ways, and it is well known that, with respect to the foundations of the Theory of Probability, there has been a great diversity of opinions.

The type of statistical regularity indicated above was first observed in connection with ordinary games of chance with cards, dice, etc., and this gave occasion to the origin and early development of the theory.² In every game of this character, all the results that are a priori possible may be arranged in a finite number of cases which are supposed to be perfectly symmetrical. This led to the famous principle of equally possible cases which, after having been more or less tacitly assumed by earlier writers, was explicitly framed by Laplace [1] as the fundamental principle of the whole theory. Throughout the whole century following the publication of Laplace's classical treatise, a large amount of work has been spent on the discussion of this principle.

¹ This view seems to have been first explicitly expressed by v. Mises [2].
² Cf. Todhunter [1].

During the course of this discussion, it has been maintained by various authors that the validity of the principle of equally possible cases is necessarily restricted to the field of games of chance, so that it is wholly incapable of serving as the basic principle of the theory. Attempts have been made¹ to establish the theory on an essentially different basis, the probabilities being directly defined as ideal values of statistical frequencies. The most successful attempt on this line is due to v. Mises [2, 3], who endeavours to reach in this way an axiomatic foundation of the theory in the modern sense.

The fundamental conception of the v. Mises theory is that of a "Kollektiv", by which is meant an unlimited sequence K of similar observations, each furnishing a definite point belonging to an a priori given space R of a finite number of dimensions. The first axiom of v. Mises then postulates the existence of the limit

(3)  lim_{n→∞} ν/n = P(S)

for every simple sub-set S ⊂ R, while the second axiom requires that the analogous limit should still exist and have the same value P(S) for every sub-sequence K′ that can be formed from K according to a rule such that it can always be decided whether the nth observation of K should belong to K′ or not, without knowing the result of this particular observation.² It does, however, seem difficult to give a precise mathematical meaning to the condition printed in italics, and the attempts to express the second axiom in a more rigorous way do not, so far, seem to have reached satisfactory and easily applicable results. Though fully recognizing the value of a system of axioms based on the properties of statistical frequencies, I think that these difficulties must be considered sufficiently grave to justify, at least for the time being, the choice of a fundamentally different system.

The underlying idea of the system that will be adopted here may be roughly described in the following simple way: The probability of an event is a definite number associated with that event; and our axioms have to express the fundamental rules for operations with such numbers.

¹ For the history of these attempts, cf. Keynes [1], chaps. VII-VIII.
² The second axiom as given by v. Mises [3], p. 18, is somewhat more complicated. It can, however, be shown that this is equivalent to the simpler statement given above.

Following Kolmogoroff [4], we take as our starting-point the observation made above (cf. (2)) that the probability P(S) may be regarded as an additive function of the set S. We shall, in fact, content ourselves by postulating mainly the existence of a function of this type, defined for a certain family of sets S in the k-dimensional space Rk to which our variable point X is restricted, and such that P(S) denotes the probability of the relation X ⊂ S. Thus the question of the validity of the relation (3) will not at all enter into the mathematical theory. For the empirical verification of the theory it will, on the other hand, become a matter of fundamental importance to know if, in a given case, (3) is satisfied with a practically sufficient approximation.¹ Questions of verification and application fall, however, outside the scope of the present work, which will be exclusively concerned with the development of the purely mathematical part of the subject.

3. Before giving the explicit statement of our axioms, it will be convenient to discuss here a few preliminary questions related to the theory of point sets and (generalized) Stieltjes integrals in spaces of a finite number of dimensions.²

In the first place, we must define the family F of sets S for which we shall want our additive set function P(S) to be given. If X = (ξ1, ..., ξk) belongs to the k-dimensional euclidean space Rk, the family F should obviously contain every k-dimensional interval J defined by inequalities of the form

a_i < ξ_i ≤ b_i    (i = 1, 2, ..., k),

as we may always want to know the probability of the relation X ⊂ J.

¹ Cf. Cantelli [2], Tornier [1]. The foundations of the theory as laid down by these authors present certain analogies with the principles here used.

² Reference may be made to the treatises by Hobson [1], Lebesgue [1] and de la Vallée Poussin [1].


It is also obvious that F should contain every set S constructed by performing on intervals J a finite number of additions, subtractions and multiplications. It is even natural to require that it should be possible to perform these operations an infinite number of times without ever arriving at a set S such that the value of P(S) is not defined. Accordingly, we shall assume that P(S) is defined for all Borel sets¹ S of Rk.

The family of Borel sets consists precisely of all sets that can be constructed from intervals J by applying a finite or infinite number of times the three elementary operations. If S1, S2, ... are Borel sets in Rk, this also holds true for the two sets

lim sup S_n = lim (S_n + S_{n+1} + ...),
lim inf S_n = lim (S_n S_{n+1} ...).

If lim sup S_n and lim inf S_n are identical, we put

lim S_n = lim sup S_n = lim inf S_n,

and thus lim S_n is also a Borel set. In particular, the sum and product of an infinite sequence of Borel sets are always Borel sets.

If no two of the sets S_i have a common point, it follows from the additive property (2) that

P(S1 + ... + S_n) = P(S1) + ... + P(S_n)

for every finite n. Since the limit S1 + S2 + ... always exists and is a Borel set, it is natural to require that this relation should hold even as n → ∞, so that we should have

P(S1 + S2 + ...) = P(S1) + P(S2) + ....

A set function with this property will be called completely additive, and it will be assumed that the function P(S) is of this type.

Consider now a real-valued point function g(X), defined for all points X = (ξ1, ..., ξk) in Rk. g(X) is said to be measurable B² if, for all real a and b, the set of points X such that a < g(X) ≤ b is a Borel set.

¹ Cf. Hobson [1], I, p. 179; Lebesgue [1], p. 117; de la Vallée Poussin [1], p. 33.
² Cf. Hobson [1], I, p. 563; de la Vallée Poussin [1], p. 34.


Similarly, a vector function Y = f(X), where Y = (η1, ..., η_f) belongs to a certain f-dimensional space ℜ_f, is measurable B if every component η_i, regarded as a function of X, is measurable B. If 𝔖 denotes any Borel set in ℜ_f, and if S is the set of all points X in Rk such that f(X) ⊂ 𝔖, then S is also a Borel set. (If f(X) never assumes a value belonging to 𝔖, S is of course the empty set.) If f1, f2, ... are measurable B, so are f1 ± f2, f1 f2, f1/f2, lim sup f_n, lim inf f_n, and, in the case of convergence, lim f_n.

All sets of points with which we shall have to deal in the sequel are Borel sets, while all point functions are measurable B. Generally this will not be explicitly mentioned, and should then always be tacitly understood.

A Lebesgue-Stieltjes integral with respect to the completely additive set function P(S) is, for every bounded and non-negative g(X) and for every set S, uniquely defined by the postulates

(A)  ∫_{S1+S2} g dP = ∫_{S1} g dP + ∫_{S2} g dP,

S1 and S2 having no common point, and

(B)  ∫_S (g1 + g2) dP = ∫_S g1 dP + ∫_S g2 dP,

(C)  ∫_S g dP ≥ 0,

(D)  ∫_S 1 · dP = P(S).

If g is not bounded, we put g_M = min (g, M) and define ∫_S g dP as the limit of ∫_S g_M dP as M → ∞. If the limit is finite, g is said to be integrable over S with respect to P(S). The extension to functions g which are not of constant sign is performed by putting

2 ∫_S g dP = ∫_S (|g| + g) dP − ∫_S (|g| − g) dP.


For any g such that |g| < G throughout the set S, we then have the mean value theorem

|∫_S g dP| < G · P(S).

Let g1, g2, ... be a sequence of functions such that for all points of S we have |g_n| < g, where g is integrable. Then if lim g_n exists for every point of S, except possibly for a certain set of points S1 ⊂ S such that P(S1) = 0, we have

lim ∫_S g_n dP = ∫_S lim g_n dP.

It follows that the theorems on continuity, differentiation and integration with respect to a parameter, etc., which are known from elementary integration theory extend themselves immediately to integrals of the type ∫_S g(X, t) dP, where t is a parameter.

The ordinary theorems on repeated integrals¹ are also easily extended to integrals of the type here considered. In particular, we have the following result which will be used in Chapter III.

Let P(S) be defined in a two-dimensional space R2 and such that for every two-dimensional interval J (a1 < ξ1 ≤ b1, a2 < ξ2 ≤ b2) we have

P(J) = P1(J1) P2(J2),

where P1(S) and P2(S) are completely additive set functions in R1, while J_i denotes the one-dimensional interval a_i < ξ_i ≤ b_i. Then if the function g1(ξ1) g2(ξ2) is integrable over R2 with respect to P(S), we have

∫_{R2} g1(ξ1) g2(ξ2) dP = ∫_{R1} g1(ξ1) dP1 · ∫_{R1} g2(ξ2) dP2.

¹ Cf. Hobson [1], I, p. 626; de la Vallée Poussin [1], p. 50.


CHAPTER II

AXIOMS AND PRELIMINARY THEOREMS

1. We now proceed to the explicit statement of our axioms.¹

In accordance with the preceding chapter, we denote by Rk a k-dimensional euclidean space with the variable point X = (ξ1, ξ2, ..., ξk), and we consider the family of all Borel sets S in Rk.

Axiom 1. To every S corresponds a non-negative number P(S), which is called the probability of the relation (or event) X ⊂ S.

Axiom 2. We have P(Rk) = 1.

Axiom 3. P(S) is a completely additive set function, i.e. we have

P(S1 + S2 + ...) = P(S1) + P(S2) + ...,

where S1, S2, ... are Borel sets, no two of which have a common point.

The variable point X is then called a random variable (or random point, random vector). The set function P(S) is called the probability function of X, and is said to define the probability distribution in Rk which is attached to the variable X. It is often convenient to use a concrete interpretation of a probability distribution as a distribution of mass of the total amount 1 over Rk, the quantity of mass allotted to any Borel set S being equal to P(S).

It follows immediately from the axioms that we always have

0 ≤ P(S) ≤ 1,

and

P(S) + P(S*) = 1,

where S and S* are complementary sets. Further, if S1 and S2 are two sets such that S1 ⊃ S2, we have S1 = S2 + (S1 − S2) and thus

(4)  P(S1) ≥ P(S2).

1 The:f&ct that we restrict ourselves here to Borel seta in BIt: permits some formalaimplifieation or the system of moms given by Kolmogoroff [4:If and or the im-mediate conclusions drawn from the axioms. .


Theorem 1. For any sequence of Borel sets S1, S2, ... in Rk, we have

P(lim sup S_n) ≥ lim sup P(S_n),
P(lim inf S_n) ≤ lim inf P(S_n).

Hence, if lim S_n exists, so does lim P(S_n), and we have

(5)  P(lim S_n) = lim P(S_n).

In order to prove this theorem, we shall first show that (5) holds for any monotone sequence {S_n}. If {S_n} is an increasing sequence, we may in fact write

lim S_n = S1 + (S2 − S1) + (S3 − S2) + ...,

and thus obtain from Axiom 3

P(lim S_n) = P(S1) + P(S2 − S1) + P(S3 − S2) + ...
           = P(S1) + (P(S2) − P(S1)) + (P(S3) − P(S2)) + ...
           = lim P(S_n).

For a decreasing sequence {S_n}, the same thing is shown by considering the increasing sequence formed by the complementary sets S_n*.

For any sequence {S_n}, whether monotone or not, we have (cf. I, § 3) lim sup S_n = lim (S_n + S_{n+1} + ...). Now, S_n + S_{n+1} + ... is obviously the general element of a decreasing sequence, so that

(6)  P(lim sup S_n) = lim P(S_n + S_{n+1} + ...).

For every r = 0, 1, ..., we have S_n + S_{n+1} + ... ⊃ S_{n+r}, and thus by (4)

P(S_n + S_{n+1} + ...) ≥ P(S_{n+r}),
P(S_n + S_{n+1} + ...) ≥ lim sup P(S_n).

We thus obtain from (6)

P(lim sup S_n) ≥ lim sup P(S_n).

Hence the inequality for P(lim inf S_n) is obtained by considering the sequence {S_n*} of complementary sets and using the identity lim inf S_n = (lim sup S_n*)*. Thus Theorem 1 is proved.

In the particular case when every point X of Rk belongs at most to a finite number of the sets S_n, lim S_n is the empty set, and it follows that we have lim P(S_n) = 0.


2. Consider now the particular set S_{x1, x2, ..., xk} defined by the inequalities

(7)  ξ_i ≤ x_i    (i = 1, 2, ..., k).

For all real values of the x_i, we define a point function F(x1, ..., xk) by putting

F(x1, ..., xk) = P(S_{x1, ..., xk}),

so that according to Axiom 1 F(x1, ..., xk) represents the probability of the joint existence of the relations (7). Then F(x1, ..., xk) is called the distribution function¹ of the probability distribution defined by P(S).

In the sequel, the terms probability function and distribution function will usually be abbreviated to pr.f. and d.f. respectively.

Let J denote the half-open k-dimensional interval defined by the inequalities a_i < ξ_i ≤ b_i for i = 1, 2, ..., k. The corresponding probability P(J) is then easily seen to be given by the k-th order difference of the d.f. F(x1, ..., xk) associated with the interval J. We thus have, writing only the first and last terms of the expression for this difference,

P(J) = Δ_k F(x1, ..., xk) = F(b1, ..., bk) − ... + (−1)^k F(a1, ..., ak).
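A small numerical sketch of this difference for k = 2; the two independent exponential components are an assumed example, not taken from the text:

```python
# For k = 2 the second-order difference over a1 < x <= b1, a2 < y <= b2 is
#     F(b1, b2) - F(a1, b2) - F(b1, a2) + F(a1, a2).
import math

def F(x, y, lam1=1.0, lam2=2.0):
    """Joint d.f. of two independent exponential variables (illustrative choice)."""
    Fx = 1.0 - math.exp(-lam1 * x) if x > 0 else 0.0
    Fy = 1.0 - math.exp(-lam2 * y) if y > 0 else 0.0
    return Fx * Fy

def rectangle_probability(a1, b1, a2, b2):
    return F(b1, b2) - F(a1, b2) - F(b1, a2) + F(a1, a2)

print(rectangle_probability(0.5, 1.5, 0.2, 0.8))  # P(0.5 < X <= 1.5, 0.2 < Y <= 0.8)
```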

Theorem 2. Every d.f. F(x1, ..., xk) possesses the following properties:

(a) In each variable x_i, F is a never decreasing function, which is everywhere continuous to the right and tends to the limit 0 as x_i → −∞.

(b) As all the variables x_i tend (independently or not) to +∞, F tends to the limit 1.

(c) For any half-open k-dimensional interval J, the associated k-th order difference of F is non-negative, i.e. Δ_k F ≥ 0.

Further, every function F(x1, ..., xk) which possesses the properties (a), (b) and (c) determines uniquely a probability distribution in Rk, such that F represents the probability of the relations (7).

¹ The use here made of the terms probability function and distribution function corresponds to the terminology of Kolmogoroff [4]. The latter term was used, with the same significance, already by v. Mises [1, 2].


That F is a never decreasing function of each x_i follows immediately from (4), since the set S_{x1, ..., xk} increases steadily with each x_i. Further, we have for every h > 0

F(x1 + h, x2, ..., xk) − F(x1, x2, ..., xk) = P(S_{x1+h, x2, ..., xk} − S_{x1, x2, ..., xk}).

If h runs through a sequence of values tending to zero, the sequence of point sets appearing in the second member obviously tends to a definite limit, viz. the empty set. Thus by Theorem 1 the first member tends to zero, and F is continuous to the right in x1. The same argument evidently applies to every x_i. In the same way it is seen that F tends to zero as any given x_i → −∞, since the set S_{x1, ..., xk} tends then to the empty set.

As, on the other hand, all the variables x_i tend simultaneously to +∞, the set S_{x1, ..., xk} tends to the whole space Rk, and consequently F tends to the limit 1.

Further, it is obvious that any d.f. will satisfy the property (c), as we must have P(S) ≥ 0 for any Borel set S.

The last part of Theorem 2, which asserts that every d.f. uniquely determines a non-negative set function P(S) satisfying our axioms, is equivalent to a well-known proposition in the theory of Lebesgue integration.¹ We have already seen that the d.f. immediately determines the value of P(S) for every half-open k-dimensional interval a_i < ξ_i ≤ b_i (i = 1, 2, ..., k). Now every Borel set can be constructed from such intervals by means of repeated passages to the limit, and the corresponding value of the set function has then to be determined according to (5). That this procedure leads to a uniquely determined result for every Borel set is precisely asserted by the proposition referred to above.

According to Theorem 2, we are at liberty to define a probability distribution either by its pr.f. (which is a set function) or by its d.f. (which is a point function). Though of course the distinction between the two methods is only formal, it will sometimes be found convenient to prefer one of them to the other.

¹ Lebesgue [1], pp. 168-169 (one-dimensional case); de la Vallée Poussin [1], chap. VI.

It is particularly in the case of distributions in a one-dimensional space (k = 1) that we shall use the d.f., while for general values of k the pr.f. will be used.

In the one-dimensional case (k = 1), the property (c) is implied by (a), and thus it follows from Theorem 2 that every non-decreasing function F(x) which is always continuous to the right and is such that F(x) → 0 as x → −∞, and F(x) → 1 as x → +∞, defines a probability distribution. As soon as k > 1, however, (c) is no longer implied by (a), and already for k = 2 it is in fact easy to construct examples of functions F satisfying (a) and (b), but not (c). Accordingly, these functions are not distribution functions.²

² A simple example is the function F(x, y) defined by F = 0 for x < 1, y < 1, for x ≥ 1, y < 0, and for x < 0, y ≥ 1, and by F = 1 elsewhere. For this function, the difference Δ_2 F associated with a sufficiently small interval J containing the point x = y = 1 in its interior is seen to be negative, so that (c) is not satisfied.
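A sketch checking the counterexample of the footnote numerically; the rectangle around (1, 1) is an assumed choice:

```python
# The function satisfies (a) and (b), but its second-order difference near (1, 1) is
# negative, so condition (c) fails and it is not a distribution function.
def F(x, y):
    if x < 1 and y < 1:
        return 0.0
    if x >= 1 and y < 0:
        return 0.0
    if x < 0 and y >= 1:
        return 0.0
    return 1.0

a1, b1 = 0.9, 1.1    # small interval containing x = 1
a2, b2 = 0.9, 1.1    # small interval containing y = 1
delta2 = F(b1, b2) - F(a1, b2) - F(b1, a2) + F(a1, a2)
print(delta2)        # -1.0, i.e. condition (c) is violated
```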

3. Let X = (ξ1, ..., ξk) be a random variable in Rk with the pr.f. P(S), and let Y = f(X) = (η1, ..., η_f) be a B-measurable function which is finite and uniquely defined for all points X of Rk, and such that its values belong to a certain space ℜ_f. Then, if 𝔖 is a Borel set in ℜ_f, the set S of all points X in Rk such that Y = f(X) ⊂ 𝔖 is (cf. I, § 3) also a Borel set. If, now, we define a set function 𝔓(𝔖) in ℜ_f by the relation

𝔓(𝔖) = P(S),

it is readily seen that our Axioms 1-3 are satisfied by 𝔓(𝔖), so that 𝔓(𝔖) determines a probability distribution in ℜ_f. This, by definition, is the probability distribution of the random variable Y = f(X). The condition that f should be finite and uniquely defined for all points of Rk may obviously be replaced by the more general condition that the points X, where f is not finite or not uniquely defined, should form a set Z such that P(Z) = 0.

For a set 𝔖 such that the corresponding set S contains no point X we obtain, of course, 𝔓(𝔖) = 0.

Take, e.g., Y = (ξ1, ..., ξ_f), where f < k, so that Y is simply the projection of the point X on a certain f-dimensional sub-space Rf. The pr.f. of Y is then 𝔓(𝔖) = P(S), where S is the cylinder set in Rk defined by the relation (ξ1, ..., ξ_f) ⊂ 𝔖. This may be concretely interpreted by saying that the distribution of Y is formed by projecting the mass in the original distribution on the sub-space Rf.

In particular (f = 1), every component ξ_i of X is itself a random variable, and the corresponding distribution is found by projecting the original distribution on the axis of ξ_i.

4. Two random variables X1 = (ξ1, ..., ξ_{k1}) in R_{k1} and X2 = (η1, ..., η_{k2}) in R_{k2} being given, it often occurs that we have to consider also the "combined" variable 𝔛 = (X1, X2) as a random variable. The "values" of 𝔛 are all pairs of "values" of X1 and X2, so that 𝔛 = (X1, X2) = (ξ1, ..., ξ_{k1}, η1, ..., η_{k2}) is defined in the product space ℜ_f = R_{k1} · R_{k2}, where f = k1 + k2. Obviously the probability distribution of 𝔛 in ℜ_f must be such that its projections on R_{k1} and R_{k2} coincide with the distributions of X1 and X2 respectively.¹ Similar remarks apply to the "combined" variable formed with any number of random variables.

Let the probability functions of X1, X2 and 𝔛 be P1, P2 and 𝔓, while the corresponding distribution functions are F1, F2 and 𝔉. Then F1(x1, ..., x_{k1}) and F2(y1, ..., y_{k2}) denote the probabilities of the relations

ξ_i ≤ x_i    (i = 1, 2, ..., k1)

and

η_j ≤ y_j    (j = 1, 2, ..., k2)

respectively, while 𝔉(x1, ..., x_{k1}, y1, ..., y_{k2}) denotes the probability of the joint existence of all these f = k1 + k2 relations.

We now introduce the following important definition: The variables X1 and X2 are called mutually independent if, for all values of the x_i and y_j,

(8)  𝔉(x1, ..., x_{k1}, y1, ..., y_{k2}) = F1(x1, ..., x_{k1}) F2(y1, ..., y_{k2}).

If S1 and S2 are given sets in R_{k1} and R_{k2} respectively, and if we consider the set 𝔖 in ℜ_f formed by all pairs 𝔛 = (X1, X2) such that

(9)  X1 ⊂ S1 and X2 ⊂ S2,

¹ Any distribution satisfying this condition is, of course, logically possible.


then it follows from (8) by the basic property of Borel sets that

𝔓(𝔖) = P1(S1) P2(S2).

Thus for two independent variables the probability of the joint existence of the relations (9) is equal to the product of the probabilities for each relation separately. The validity of this multiplicative rule for the particular sets connected with the distribution functions is thus equivalent to the validity of the same rule for all Borel sets.
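A minimal sketch of the multiplicative rule; the two uniform components and the sets S1, S2 are assumed for illustration:

```python
# For independently generated X1, X2 the joint frequency of (9) is close to the product
# of the separate frequencies, in line with relation (8).
import random

random.seed(1)
N = 200_000
pairs = [(random.random(), random.random()) for _ in range(N)]   # independent by construction

def freq(pred):
    return sum(1 for p in pairs if pred(p)) / N

S1 = lambda x: 0.2 < x <= 0.7        # a set in the space of X1
S2 = lambda y: 0.1 < y <= 0.4        # a set in the space of X2

joint = freq(lambda p: S1(p[0]) and S2(p[1]))
product = freq(lambda p: S1(p[0])) * freq(lambda p: S2(p[1]))
print(joint, product)                # nearly equal
```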

5. Let X1, X2, ..., X_n be random variables with the pr.f.'s P1, ..., P_n and the d.f.'s F1, ..., F_n, defined in the spaces R′, ..., R^{(n)} of any number of dimensions. Consider the combined variable 𝔛_n = (X1, ..., X_n) with the pr.f. 𝔓_n and the d.f. 𝔉_n, defined in the product space ℜ^{(n)}.

X1, ..., X_n are then called mutually independent if

𝔉_n = F1 F2 ... F_n,

which is the straightforward generalization of (8). As in the case of two variables, this is equivalent to the relation

(10)  𝔓_n(𝔖_n) = P1(S1) ... P_n(S_n),

where S1, ..., S_n are given sets in R′, ..., R^{(n)} respectively, while 𝔖_n denotes the set in ℜ^{(n)} which consists of all points 𝔛_n = (X1, ..., X_n) such that X_i ⊂ S_i for i = 1, 2, ..., n.

If, in (10), we put S_n = R^{(n)}, we obtain

𝔓_{n−1}(𝔖_{n−1}) = P1(S1) ... P_{n−1}(S_{n−1}),

where 𝔓_{n−1} is the pr.f. of 𝔛_{n−1} = (X1, ..., X_{n−1}) and 𝔖_{n−1} is the set of all points 𝔛_{n−1} in ℜ^{(n−1)} such that X_i ⊂ S_i for i = 1, 2, ..., n − 1. Thus we infer that the variables X1, ..., X_{n−1} are independent, and in the same way we obviously find that any group of m ≤ n among the variables X_i are mutually independent.

Further, it is easily found that, if the variables X1, ..., X_m, Y1, ..., Y_n are all mutually independent, then the combined variables 𝔛_m = (X1, ..., X_m) and 𝔜_n = (Y1, ..., Y_n) are also independent.

6. Any B-measur&ble vector function I(X1, ••• , X m ) o£ mrandom variables may be considered as a B-measura.ble functionof the combined variable !m. Thus according to II, §3, the pro-

Page 22: Random Variables and Probability H Cramer (CUP 1962 125s)

16 A.XIOl\lS A:s"n PRELIMIKARY THEOREMS-

bability distributIon of f is uniquely determined by the diS­tribution of xm•

Theorem 3. Let X1, ..., X_m, Y1, ..., Y_n be independent random variables, and f(X1, ..., X_m) and g(Y1, ..., Y_n) be B-measurable vector functions of the assigned arguments. Then f and g are mutually independent random variables.

We have f = f(𝔛_m) and g = g(𝔜_n), where, according to the preceding paragraph, 𝔛_m and 𝔜_n are independent. The probability that f belongs to a given set 𝔖 is then, by definition, equal to the probability that 𝔛_m belongs to the set of all points which, by the relation f = f(𝔛_m), correspond to values of f belonging to 𝔖. For g, and for the combined variable (f, g), the analogous relations hold. The independence of f and g thus follows from the independence of 𝔛_m and 𝔜_n.

7. If X and Y are random variables with the pr.f.'s P(S) and Q(T), where S and T are variable sets in the spaces of X and Y respectively, and if the pr.f. of the combined variable (X, Y) is known, we can form the probability of the joint existence of the relations X ⊂ S, Y ⊂ T. This is a function of two variable sets, say G(S, T). Now let us consider the expression

(11)  P_T(S) = G(S, T) / Q(T)

for a fixed set T such that Q(T) > 0. Then P_T(S) becomes a function of the variable set S, and it is immediately seen that this function satisfies Axioms 1-3. For every fixed T in the space of Y such that Q(T) > 0, P_T(S) thus defines a probability distribution in the space of X. This distribution is called the probability distribution of X relative to the hypothesis Y ⊂ T, and the quantity P_T(S) is known as the probability of the relation (event) X ⊂ S relative to the hypothesis Y ⊂ T. Similarly, we define a distribution in the space of Y relative to the hypothesis X ⊂ S:

(12)  Q_S(T) = G(S, T) / P(S).

If S1, ..., S_n are such that S_i and S_k have no common point for i ≠ k, while S1 + ... + S_n coincides with the whole space of X, we obviously have by (12)

Q(T) = G(S1, T) + ... + G(S_n, T)
     = P(S1) Q_{S1}(T) + ... + P(S_n) Q_{S_n}(T),

and so obtain from (11)

P_T(S_i) = P(S_i) Q_{S_i}(T) / Σ_{i=1}^{n} P(S_i) Q_{S_i}(T).

This relation is known under the name of Bayes' theorem, and is considered as giving the probability a posteriori, i.e. calculated after the "event" Y ⊂ T has been observed, of the particular "hypothesis" X ⊂ S_i, when P(S1), ..., P(S_n) are the a priori probabilities of the various hypotheses X ⊂ S_i.
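A small numerical sketch of Bayes' theorem with hypothetical numbers (three exclusive and exhaustive hypotheses):

```python
# A priori probabilities P(Si) and the relative probabilities Q_Si(T) of the observed
# event T under each hypothesis; the posterior follows from the displayed formula.
P = [0.5, 0.3, 0.2]          # P(S1), P(S2), P(S3), summing to 1
Q = [0.10, 0.40, 0.80]       # Q_S1(T), Q_S2(T), Q_S3(T)

total = sum(p * q for p, q in zip(P, Q))           # = Q(T)
posterior = [p * q / total for p, q in zip(P, Q)]  # P_T(Si) by Bayes' theorem
print(posterior)             # a posteriori probabilities, summing to 1
```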

If X and Y are independent variables, we have

G(S, T) = P(S) Q(T),

and thus by (11) and (12)

P_T(S) = P(S),    Q_S(T) = Q(T),

so that in this case the relative probabilities coincide with the "total" probabilities P(S) and Q(T).


SECOND PART

DISTRIBUTIONS IN R1

All random variables and distributions considered in this part are, unless explicitly stated otherwise, defined in a one-dimensional space R1.

CHAPTER III

GENERAL PROPERTIES. MEAN VALUES

1. According to Theorem 2, the d.f. F(x) of a probability distribution in R1 is always a non-decreasing function of x, which is everywhere continuous to the right and tends to 0 as x → −∞, and to 1 as x → +∞. Conversely, any F(x) with these properties determines a probability distribution.

Any d.f. F(x) being a monotone function, we can at once state a number of general properties of F(x), for the proofs of which we refer to standard treatises on the Theory of Functions of a Real Variable.¹

Theorem 4. A d.f. F(x) has at most a finite number of points at which the saltus is ≥ k > 0, and consequently at most an enumerable set of points of discontinuity. The derivative F′(x) exists for "almost all" values of x (i.e. the points of exception form a set of measure zero).

F(x) can always be represented as a sum of three components

(13)  F(x) = a_I F_I(x) + a_II F_II(x) + a_III F_III(x),

where a_I, a_II, a_III are non-negative numbers with the sum 1, while F_I, F_II, F_III are distribution functions such that:

F_I(x) is absolutely continuous: F_I(x) = ∫_{−∞}^{x} F_I′(t) dt for all values of x;

¹ Hobson [1], I, p. 338 and p. 603.


F_II(x) is a "step-function": F_II(x) = the sum of the saltuses of F(x) at all discontinuities ≤ x.

F_III(x), the "singular" component, is a continuous function having almost everywhere a derivative = 0.

The three components a_I F_I, a_II F_II, a_III F_III are uniquely determined by F(x).

Let us consider in particular the cases when a_I or a_II is equal to 1, so that F(x) coincides with F_I(x) or F_II(x). We shall say in these cases, which are those usually occurring in the applications, that F(x) is of type I or II respectively.

I. If F(x) = F_I(x), we have for all values of x

F(x) = ∫_{−∞}^{x} F′(t) dt,

and thus the probability that the random variable X with the d.f. F(x) assumes a value belonging to the given set S is

∫_S F′(t) dt.

The derivative F′(x) is then called the frequency function or probability density of X.

II. If F(x) = F_II(x), there is a finite or enumerable set of points x_i such that every x_i is a point of discontinuity of F(x), while F(x) is constant on every closed interval which contains no x_i. If p_i is the saltus of F(x) at the point x_i, we have Σ p_i = 1. The probability that X belongs to the given set S is zero if S does not contain any x_i, and is otherwise equal to the sum of all those p_i which correspond to points x_i belonging to S. Thus in this case the distribution is completely described by saying that we have the probability p_i that X assumes the value x_i (i = 1, 2, ...) and the probability 0 that X differs from all the x_i.
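A sketch of the two usual cases with assumed numbers (the point masses, the density and the sets are illustrative, not from the text):

```python
# Type II: P(X in S) is a sum of saltuses; type I: it is an integral of the density over S.
import math

# Type II: point masses p_i at points x_i (toy values summing to 1)
points = [0, 1, 2, 3, 4]
masses = [0.37, 0.37, 0.18, 0.06, 0.02]
S = {1, 3}
print(sum(p for x, p in zip(points, masses) if x in S))

# Type I: density F'(t) = exp(-t) for t > 0; P(1 < X <= 2) by crude numerical integration
def density(t):
    return math.exp(-t) if t > 0 else 0.0

a, b, n = 1.0, 2.0, 10_000
h = (b - a) / n
print(sum(density(a + (i + 0.5) * h) for i in range(n)) * h)   # ~ e^-1 - e^-2
```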

2. A Lebesgue-Stieltjes integral ∫ g dP with respect to the pr.f. P(S) has been defined in I, § 3. We now define the corresponding integral with respect to the d.f. F(x) simply by putting


∫_S g dF = ∫_S g dP.

If X is a random variable with the d.f. F(x), the integral ∫_a^b x dF(x) has a uniquely determined value for all finite a and b. If this integral tends to a finite limit as a → −∞ and b → +∞ independently (i.e. if the integral is absolutely convergent), we denote this limit by¹

(14)  E(X) = ∫_{−∞}^{∞} x dF(x)

and call it the mean value or mathematical expectation of the random variable X.

A B-measurable function g(X) of X may, according to II, § 3, be considered as a random variable. If the d.f. of this variable is denoted by F*(x), we have by II, § 3,

F*(x) = ∫_{S_x} dF(t),

where S_x denotes the set of all points t such that g(t) ≤ x. Thus we obtain

∫_a^b x dF*(x) = ∫ g(x) dF(x),

where the integral in the second member is extended to the set S_b − S_a. Now if the integral

∫_{−∞}^{∞} |g(x)| dF(x)

is convergent, we may allow a and b to tend to −∞ and +∞, and so obtain according to (14) for the mean value of g(X)

(15)  E{g(X)} = ∫_{−∞}^{∞} g(x) dF(x).

In the same way we obtain, if g(𝔛) is a real-valued function of a random variable 𝔛 which is defined in a space ℜ_f of any number of dimensions,

(15a)  E{g(𝔛)} = ∫_{ℜ_f} g(𝔛) d𝔓,

where 𝔓(𝔖) is the pr.f. of 𝔛, and the integral is assumed to be absolutely convergent. If, in particular, g(𝔛) depends only on a certain number k < f of the co-ordinates of 𝔛, the integral is, by II, § 3, directly reduced to an integral over the corresponding sub-space Rk.

¹ In the particular cases when F(x) is of type I or type II, we have E(X) = ∫_{−∞}^{∞} x F′(x) dx and E(X) = Σ_i p_i x_i respectively.

The mean value of the particular function (X − E(X))² is called the variance of X. The non-negative square root of this mean value is called the standard deviation (abbreviated s.d.) of X and is denoted by D(X), so that we have, assuming the convergence of the integral,

(16)  D²(X) = ∫_{−∞}^{∞} (x − E(X))² dF(x) = E(X²) − E²(X).

The square root D(X) is always to be given a non-negative value.

We have D(X) = 0 if and only if F(x) is constant on every closed interval which does not contain the point x = E(X). In this extreme case, we have the probability 1 that the variable X assumes the value E(X), and we have F(x) = ε(x − E(X)), where ε(x) denotes the particular d.f. given by

(17)  ε(x) = 0 for x < 0,    ε(x) = 1 for x ≥ 0.

In all other cases, the standard deviation D(X) is positive.

If X is a random variable with a finite mean value, we obviously have by (15)

(18)  E(aX + b) = a E(X) + b

for any constants a and b. Further, if the s.d. is also finite, we have

(19)  D(aX + b) = |a| D(X).

In particular, the normalized variable (X − E(X))/D(X) has the mean value 0 and the s.d. 1.


The moments α_ν and the absolute moments β_ν of the variable X are the mean values of X^ν and |X|^ν for ν = 0, 1, 2, ...:

α_ν = ∫_{−∞}^{∞} x^ν dF(x),

β_ν = ∫_{−∞}^{∞} |x|^ν dF(x).

(β_ν is, of course, hereby defined also for non-integral ν > 0.) It is immediately seen that, if β_k is finite, both α_ν and β_ν are finite for ν ≤ k. Further, we have α_{2ν} = β_{2ν} and |α_{2ν+1}| ≤ β_{2ν+1}. From (14) and (16) we obtain

E(X) = α1,    D²(X) = α2 − α1².

If β_k is finite, it follows from well-known inequalities¹ that we have

(20)  β1 ≤ β2^{1/2} ≤ β3^{1/3} ≤ ... ≤ β_k^{1/k}.

In the sequel it will always be tacitly understood that the mean values occurring in our considerations are assumed to be finite even in the rigorous sense that the corresponding integrals are absolutely convergent.

3. Theorem 5.² Let ψ(x) denote a non-negative function such that ψ(x) ≥ M > 0 for all x belonging to a certain set S. Then if X is a random variable, the probability that X assumes a value belonging to S is ≤ E{ψ(X)}/M.

This follows directly from the relation

E{ψ(X)} = ∫_{−∞}^{∞} ψ(x) dF(x) ≥ M ∫_S dF(x) = M P(S).

Taking here in particular ψ(x) = (x − E(X))², M = k², we obtain for every k > 0 the Bienaymé-Tchebycheff inequality:

The probability of the relation |X − E(X)| ≥ k is ≤ D²(X)/k².

Taking further ψ(x) = |x|^ν, M = k^ν β_ν, it follows that the probability of |X| ≥ k β_ν^{1/ν} is ≤ 1/k^ν.

¹ Cf. Hardy-Littlewood-Pólya [1], p. 157.
² This is an obvious generalization of theorems due to Tchebycheff and Markoff. Cf. Kolmogoroff [4], p. 37.


Choosing finally ψ(x) = e^{cx}, M = e^{ca}, where c > 0, we conclude that the probability of X ≥ a is ≤ E(e^{cX})/e^{ca}.
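A quick simulation sketch of these three bounds; the exponential distribution and the constants k, a, c are assumed for illustration:

```python
# Checks the Bienayme-Tchebycheff inequality, and the exponential bound
# P(X >= a) <= E(e^{cX}) / e^{ca}, for a simulated Exp(1) sample (E(X) = D(X) = 1).
import random, math

random.seed(2)
sample = [random.expovariate(1.0) for _ in range(200_000)]
n = len(sample)
mean = sum(sample) / n

k = 2.0
left = sum(1 for x in sample if abs(x - mean) >= k) / n
print(left, "<=", 1.0 / k**2)                       # D^2(X)/k^2 = 1/4 for Exp(1)

a, c = 3.0, 0.5
mgf = sum(math.exp(c * x) for x in sample) / n      # estimates E(e^{cX}) = 2 here
print(sum(1 for x in sample if x >= a) / n, "<=", mgf / math.exp(c * a))
```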

4. Let X and Y be random variables in R1, such that the combined variable Z = (X, Y) has a certain pr.f. P(S) in R2. Then X + Y is a one-dimensional vector function of Z, which according to II, § 6 has a distribution uniquely determined by P(S). By (15a) we then have E(X + Y) = ∫_{R2} (X + Y) dP. The integrals ∫_{R2} X dP and ∫_{R2} Y dP reduce, however, according to the remark made in connection with (15a), to the one-dimensional integrals representing E(X) and E(Y). As soon as these two mean values exist, we thus have the important formula

(21)  E(X + Y) = E(X) + E(Y),

which is evidently hereby proved without any assumption concerning the nature of the dependence between X and Y. Obviously (21) is immediately generalized to any finite number of terms.

Treating in the same way the product XY, we obtain

E(XY) = ∫_{R2} XY dP.

If, in particular, X and Y are mutually independent, we have by II, § 4, P = P1 P2, P1 and P2 being the pr.f.'s of X and Y. It then follows from I, § 3, that, if X and Y are independent, we have¹

(22)  E(XY) = E(X) E(Y).

From (16), (21) and (22) we obtain further, if X and Y are independent,

(23)  D²(X + Y) = D²(X) + D²(Y),

which is immediately generalized to any finite number of mutually independent terms.

¹ If we restrict ourselves to variables with finite variances, the necessary and sufficient condition for the validity of (22) and (23) is that the correlation coefficient of X and Y vanishes.
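A simulation sketch of (21)-(23); the normal and uniform distributions below are assumed examples, with X and Y generated independently:

```python
# (21) holds regardless of dependence; (22) and (23) are checked here for independent X, Y.
import random

random.seed(3)
N = 200_000
X = [random.gauss(1.0, 2.0) for _ in range(N)]
Y = [random.uniform(0.0, 4.0) for _ in range(N)]      # independent of X by construction

mean = lambda v: sum(v) / len(v)
var = lambda v: mean([x * x for x in v]) - mean(v) ** 2

print(mean([x + y for x, y in zip(X, Y)]), mean(X) + mean(Y))        # (21)
print(mean([x * y for x, y in zip(X, Y)]), mean(X) * mean(Y))        # (22)
print(var([x + y for x, y in zip(X, Y)]), var(X) + var(Y))           # (23)
```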


CHAPTER IV

CHARACTERISTIC FUNCTIONS

1. The mean value of a real function g(X) of a random variable X has been defined in III, § 2. For a complex function g(X) + ih(X), we put

E(g + ih) = E(g) + iE(h) = ∫_{−∞}^{∞} (g + ih) dF(x).

With this definition, the rules for operations with mean values given in the preceding Chapter hold true even for mean values of complex functions. If, in particular, X and Y are complex functions of mutually independent variables, we obtain from (22) and Theorem 3

E(XY) = E(X) E(Y).

The mean value of the particular function e^{itX}, where t is an auxiliary variable, will be called the characteristic function (abbreviated c.f.) of the corresponding distribution.¹ Denoting this function by f(t), we have

(24)  f(t) = E(e^{itX}) = ∫_{−∞}^{∞} e^{itx} dF(x).

Unless explicitly stated otherwise, f(t) will be considered for real values of t only.

The integral in (24) is absolutely and uniformly convergent for all real t, so that f(t) is uniformly continuous. Obviously f(0) = 1, and

|f(t)| ≤ ∫_{−∞}^{∞} dF(x) = 1.

The variable aX + b has the d.f. F((x − b)/a) and the c.f. e^{ibt} f(at).

¹ The first use of an analytical instrument substantially equivalent to the characteristic function seems to be due to Lagrange [1]. (Cf. Todhunter [1], pp. 309-313.) Similar functions were then systematically employed by Laplace in his great work [1].


Thus in particular, putting E(X) = m and D(X) = σ, the normalized variable (cf. III, § 2) (X − m)/σ has the c.f. e^{−imt/σ} f(t/σ). Further, the variable −X has the c.f. f(−t), which is the complex conjugate of f(t).

Theorem 6.¹ For every real ξ, the limit

lim_{T→∞} (1/2T) ∫_{−T}^{T} f(t) e^{−itξ} dt

exists and is equal to the saltus of F(x) at x = ξ. Thus if F(x) is continuous at x = ξ, the limit is zero.

We have

(1/2T) ∫_{−T}^{T} f(t) e^{−itξ} dt = (1/2T) ∫_{−T}^{T} dt ∫_{−∞}^{∞} e^{it(x−ξ)} dF(x)
                                  = ∫_{−∞}^{∞} (sin Tx)/(Tx) dF(x + ξ).

The contribution to the last integral which is due to the domain |x| ≥ h > 0 tends to zero as T → ∞, whatever the value of h. Let h be so chosen that the variation of F(x) in (ξ − h, ξ + h) exceeds the saltus at x = ξ by less than ε. Then, since (sin Tx)/(Tx) = 1 for x = 0, and |(sin Tx)/(Tx)| ≤ 1 always, it is seen that, for all sufficiently large T, the last integral differs from the saltus of F at the point ξ by less than 2ε. Thus the theorem is proved.

Representing F(x) as the sum of three components according to (13), we have

f(t) = a_I f_I(t) + a_II f_II(t) + a_III f_III(t),

each term containing the c.f. of the corresponding component of F(x). We shall consider the behaviour of these three terms separately.

I. Since F_I is absolutely continuous, f_I(t) = ∫_{−∞}^{∞} e^{itx} F_I′(x) dx,

¹ Bochner [1], p. 79.


and hence f_I(t) → 0 as |t| → ∞, by the Riemann-Lebesgue theorem.¹ It follows that (1/2T) ∫_{−T}^{T} |f_I(t)|² dt → 0 as T → ∞. If the nth derivative F_I^{(n)}(x) exists for all x and is absolutely integrable, a partial integration shows that f_I(t) = O(|t|^{1−n}) as |t| → ∞.

II. a_II f_II(t) = Σ_ν p_ν e^{itx_ν} (using the notations of III, § 1) is the sum of an absolutely convergent trigonometric series, and is thus an almost periodic function,² which comes as close to a_II as we please for arbitrarily large values of t, so that lim sup_{|t|→∞} |f_II(t)| = 1. We have³

(1/2T) ∫_{−T}^{T} |f_II(t)|² dt → Σ_ν p_ν²/a_II²    as T → ∞.

III. f_III(t) is the c.f. of a d.f. F_III(x) which is continuous and has almost everywhere a derivative equal to zero. It is possible to show by examples⁴ that f_III(t) does not necessarily tend to 0 as |t| → ∞. We have, however, always (1/2T) ∫_{−T}^{T} |f_III(t)|² dt → 0 as T → ∞. It will, in fact, be shown in V, § 1, that if f(t) is the c.f. of a continuous d.f., the same holds true for |f(t)|². Thus the desired result follows by applying Theorem 6 to |f_III(t)|² and putting ξ = 0.

We are thus able to state the following theorems.

Theorem 7. If, in the representation of the d.f. F(x) according to (13), we have a_I > 0, then lim sup_{|t|→∞} |f(t)| < 1.

If a_I = 1, then lim_{|t|→∞} f(t) = 0.

If a_II = 1, then lim sup_{|t|→∞} |f(t)| = 1.

Theorem 8.⁵ For every c.f. f(t) we have

lim_{T→∞} (1/2T) ∫_{−T}^{T} |f(t)|² dt = Σ_ν p_ν²,

¹ Hobson [1], II, p. 514.  ² Besicovitch [1], p. 6.  ³ Besicovitch [1], p. 19.
⁴ Cf. e.g. Jessen-Wintner [1].  ⁵ Lévy [1], p. 171.


the p_ν being the saltuses of the corresponding d.f. F(x) at all its discontinuities.

Remark. It is easily seen that we cannot have |f(t0)| = 1 for any t0 ≠ 0 unless F(x) is of type II, and any two discontinuities x_ν differ by a multiple of 2π/t0. Hence it follows that, if lim sup |f(t)| < 1, then |f(t)| < k < 1 for |t| ≥ ε > 0, however small ε is chosen.

For a later purpose (cf. Theorem 26), we shall in this connection prove the following lemma.

Lemma 1. If f(t) is a c.f. such that |f(t)| ≤ k < 1 as soon as |t| ≥ b, then we have for |t| < b

|f(t)| ≤ 1 − (1 − k²) t²/(8b²).

From the elementary inequality cos t ≤ ¾ + ¼ cos 2t we obtain

|f(t)|² = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{it(x−y)} dF(x) dF(y)
        = ∫_{−∞}^{∞} ∫_{−∞}^{∞} cos t(x − y) dF(x) dF(y)
        ≤ ¾ + ¼ |f(2t)|².

For b/2 ≤ |t| < b we thus have by hypothesis

|f(t)|² ≤ 1 − ¼(1 − k²).

Repeating the same argument, we conclude that for

b/2^n ≤ |t| < b/2^{n−1},

where n is an arbitrary integer, we have

|f(t)|² ≤ 1 − (¼)^n (1 − k²) < 1 − (1 − k²) t²/(4b²),

and thus

|f(t)| < 1 − (1 − k²) t²/(8b²).

As n is arbitrary, this proves our assertion for any t such that 0 < |t| < b. For t = 0, we have f(0) = 1, and thus the lemma is proved.

2. If the absolute moment β_k is finite for some positive integer k, (24) may be differentiated k times, and it follows that f^{(ν)}(t) exists as a bounded and uniformly continuous function for ν = 1, 2, ..., k. Obviously f^{(ν)}(0) = i^ν α_ν, and so we obtain by MacLaurin's theorem¹ for small values of |t|

(25)  f(t) = 1 + Σ_{ν=1}^{k} (α_ν/ν!) (it)^ν + o(t^k).

For sufficiently small values of |t|, the branch of log f(t) which tends to zero with t may be developed in MacLaurin's series up to the term of order k, and thus we have, introducing a new sequence of parameters,

(26)  log f(t) = Σ_{ν=1}^{k} (γ_ν/ν!) (it)^ν + o(t^k).

A comparison of (25) and (26) shows that γ_ν is a polynomial in α1, α2, ..., α_ν, and that γ1 = α1, γ2 = α2 − α1². In the particularly important case α1 = 0, we have

γ1 = 0,    γ2 = α2,    γ3 = α3,    γ4 = α4 − 3α2², ....

The γ_ν are called the semi-invariants of the distribution.²

For any n ≤ k, it follows from (25) and (26) that γ_n/n! is equal to the coefficient of z^n in the development of log(1 + Σ_{ν=1}^{k} (α_ν/ν!) z^ν) as a power series in z.

According to (20), this series is majorated by the series

−log(1 − Σ_{ν=1}^{∞} (β_k^{1/k} z)^ν/ν!) = Σ_{ν=1}^{∞} (e^{β_k^{1/k} z} − 1)^ν/ν,

and so we have

|γ_n|/n! ≤ (coeff. of z^n in Σ_{ν=1}^{n} (1/ν) e^{ν β_k^{1/k} z}) ≤ n^n β_k^{n/k}/n!,

or

(27)  |γ_n| ≤ n^n β_k^{n/k}.

3. According to (24), the c.f. f(t) is uniquely determined by the d.f. F(x). We now proceed to prove a group of theorems which show inter alia that, conversely, F(x) is uniquely determined by f(t).

¹ A form of the remainder in MacLaurin's series which yields (25) does not often occur in text-books. It is, however, easily deduced from the ordinary Lagrange form.
² Thiele [1].


Theorem 9.¹ If F(x) is continuous for x = ξ and for x = ξ + h, we have

F(ξ + h) − F(ξ) = lim_{T→∞} (1/2π) ∫_{−T}^{T} ((1 − e^{−ith})/(it)) e^{−itξ} f(t) dt.

Before proving the theorem, we shall use it to prove the identity of any two d.f.'s F1(x) and F2(x) having the same c.f. f(t). As a matter of fact, Theorem 9 shows that the differences F1(x) − F1(y) and F2(x) − F2(y) coincide for almost all values of x and y. If y → −∞, it follows that F1(x) = F2(x) for almost all x. Since every d.f. is continuous to the right, the equality must hold generally.

In order to prove the theorem, we may clearly suppose h > 0. Then

(1/2π) ∫_{−T}^{T} ((1 − e^{−ith})/(it)) e^{−itξ} f(t) dt = ∫_{−∞}^{∞} ψ dF(x),

where

ψ = ψ(x; ξ, h, T) = (1/π) ∫_0^T (sin t(x − ξ))/t dt − (1/π) ∫_0^T (sin t(x − ξ − h))/t dt.

Given any ε > 0, we now choose δ such that the sum of the variations of F(x) over the intervals |x − ξ| ≤ δ and |x − ξ − h| ≤ δ is less than ε. This is possible, since by hypothesis F(x) is continuous for x = ξ and for x = ξ + h.

If T → ∞, while ξ and h remain fixed, ψ tends uniformly to 0 in the intervals x < ξ − δ and x > ξ + h + δ, and to 1 in the interval ξ + δ < x < ξ + h − δ. In the remaining intervals |x − ξ| ≤ δ and |x − ξ − h| ≤ δ, we have |ψ| < 2.

It thus follows that, for all sufficiently large values of T, the integral ∫_{−∞}^{∞} ψ dF(x) differs from ∫_{ξ+δ}^{ξ+h−δ} dF(x) by a quantity of modulus less than 3ε. If δ is sufficiently small, the last integral comes, however, as close as we please to F(ξ + h) − F(ξ). Thus the theorem is proved.²

¹ Lévy [1], p. 166.
² It is easy to show that, if the definition of a d.f. is modified, so that in a point of discontinuity we put F(x) = ½[F(x + 0) + F(x − 0)], then Theorem 9 holds for all values of ξ and h.
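A numerical sketch of the inversion formula in Theorem 9 with a finite cut-off T; the standard normal c.f. e^{−t²/2}, the cut-off and the grid are assumed choices for the example:

```python
# The truncated integral approaches F(xi + h) - F(xi); here F is the standard normal d.f.
import cmath, math

def f(t):                        # c.f. of the standard normal distribution
    return math.exp(-0.5 * t * t)

def inversion(xi, h, T=40.0, N=80_000):
    dt = 2.0 * T / N
    total = 0.0 + 0.0j
    for i in range(N):           # midpoint rule on (-T, T)
        t = -T + (i + 0.5) * dt
        kernel = (1 - cmath.exp(-1j * t * h)) / (1j * t) if abs(t) > 1e-12 else h
        total += kernel * cmath.exp(-1j * t * xi) * f(t) * dt
    return (total / (2 * math.pi)).real

Phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))
xi, h = -1.0, 2.0
print(inversion(xi, h), Phi(xi + h) - Phi(xi))   # both close to ~0.6827
```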


The integral appearing in Theorem 9 is, in the general case, only conditionally convergent as T → ∞. We shall now prove a similar theorem which contains an absolutely convergent integral. For any given d.f. F(x) and for any h > 0, the function

(1/h) ∫_x^{x+h} F(u) du

is obviously a continuous d.f. The corresponding c.f. is found by (24) to be ((1 − e^{−ith})/(ith)) f(t). Replacing in Theorem 9 F(x) by this new d.f., we thus obtain for all values of ξ and h

(28)  (1/h) ∫_{ξ+h}^{ξ+2h} F(u) du − (1/h) ∫_{ξ}^{ξ+h} F(u) du
        = (1/2πh) ∫_{−∞}^{∞} ((1 − e^{−ith})/(it))² e^{−itξ} f(t) dt.

Substituting here, for, +h, we obtain after an easy transforma­tion of the integral on the right-hand side the following theorem.

Theorem 10. For all real, a'TUl for all k > 0, we have

~J:+h F(U)du-~J:_h F(U)du=~f~a> (m:tr e-2:eJ(~)dt.This can, of course, also be proved directly, without the use

of Theorem 9. \Ve are now in & position to prove the followingimportant theorem.

Tkeorem 11.1 Let {Fn(:c)} be a sequence 01 d.J.'a, arul {fn(t)}the corresponding seq:uence of c.f.'a. A nece88ary and 8ufficientcondition for .the convergence of {~l (x}} to a d.J. F (x), in everycontinuity point of the latter, ill that th.e aequence {In (t)} of C./.'S

1 In a slightly less precise form, this theorem was first proved by Levy [1], pp.195-197. Cf. also Bochner [I), p. 72. It should be observed that the theorem be·comes fa.lse if we omit the assumption that the limit I <t> is continuous at t =0.Choosing, in fact. In (t) =e-att, we h&ve J (t) =: 1 for t -=0, and / (t) =-0 for t=l=O. sothat! (t) is discontinuous at t=cO. Accordingly, the corresponding sequence of d.f.'s{Fn (x)} tends for every x to the limit P (x) Dli, which is not a d.f.

Page 37: Random Variables and Probability H Cramer (CUP 1962 125s)

CHARACTERISTIC FUNOTION~ 31

converge8 jO'f every t to a limit J(t), which is continU0'U8 for the,c:pecial value t =o.

When this condition is 8atisfied, the limitf (t) is identical u'itk thec.f. of the lirniting d·l· F (x), and!» (t) converges to f(t) uniformly inevery fi'ltite t-interval. Tkia implies, in particular, that the limitJ(t)is then continuo'U8 for all t.

That the condition is n(Jce,,'1.~ary follows almost immediatelyfrom the definition (24) of a c.f. In fact, if~" (x) -+ F (x) in everycontinuity point of F (x), \vhere F (x) is a d.f., "'"e can choose

M =M (t:) such that if eil.zdFn (x) 1< E' for all n, E> 0 beingt.l'}>.tlI

given. In particular M can be so chosen that F (x) is continuousfor z = ± M. According to the theory of Stieltjes integrals, wethen have

f lJI fJleil.l: dFn, (x) -+ eil~dF (x),

-l}l -jJ[

uniformly in every finite t-intervaL The last integral differs,ho,vever, from the c.f.f(t) of F (x) by a quantity of modulus lessthan f, if M is sufficiently large. Thus It). (t)-:;..f(t) as n-;-.co,uniformly in every finite t-interval.

The main difficulty lies ill the proof that the condition is81.£jfieient. We then assume that In (t) tends for every t to a limitJ(t) which is continuous for t = 0, and we shall prove that underthis hypothesis Fn {a:)~F (x) in every continuity point of F (a:),where F (a:) is a d.f.. If this is proved it follows from the firstpart of the proof that the limit J" (t) is identical with the c.f. ofF (x), and that In (t) converges to f (t) uniformly in every finitet-interval.

In order to prove this, we choose fronl the sequence {~, (x}}a sub-sequence ~ll (x), ~t2 (x), .... , such. that ~tv (x) converges toa never decreasing funotion F(x),\ in every continuity point ofF (x). It is well known that this call always be done, and ob­viously we may suppose that F (x) is everywhere continuous tothe right. We shall now prove tllat F (x) is a d.f. As we alreadyknow that F (z) is never decreasing, and we obviously have

Page 38: Random Variables and Probability H Cramer (CUP 1962 125s)

32 CHARACTERISTIC FUNC"rIONS

o~ F (X) ~ 1 for all x, it is sufficient to prove that

F (+co)-F (-00)= 1.

From Theorem 10 we obtain, putting ,=0,

(29) ~f:Fn.(U)dU-~f~1iFn.(U)du=~S:ao (si:tYfn. (i)dt.On both sides of this relation, we may allow v to tend to infinityunder the integral signs. In fact, it is easily seen that tIle con­vergence conditions for Lebesgue-Stieltjes integrals given inI, §3, are satisfied by the integrals occur.ring here, and we t·husobtain

(30) ~J:F (U)dU-~f~ll F(u)du

=~f:~(Si:trJ(~)dt.Let now h~ 00. As F (x) is a never decreasing function, the

first member of (30) tends to F (+00) -F (-00). By assumption

J(t) is continuous for t =0, so thatf(~) tends for every fixed t to

the limit 1(0). We have, however~ 1(0)=limfn (0), and In (0) = 1n-+o:>

for all 7L, BinceJ~ (t) is a c.f. Hencef(O) = 1. Applying once morethe convergence properties of integrals (I, §3), we thus obtain

F( +(0) -F( -OO)=!JC:O (Bil1t)2 dt= 1.1'( -co t

(The value of the last integral may be obtained, e.g. by lettingh~ co in (29).)

We have thus proved tha.t the sub-sequence {F1l.V(X)} tends toa d"f. F (x), in every continuity point of F (x). By the first partofthe proofit then follows that the limitf (t) of the correspolldingc.f.. ~s must be identical with the a.f. of F (x) ..

Consider now another convergent sub..sequence of{Fn (x)}, anddenote the limit of the new sub-sequence by F* (x), alwaysassuming this function to be determined so as to be everywherecontinuous to the right. III the same way as before, it is then

Page 39: Random Variables and Probability H Cramer (CUP 1962 125s)

CHARACTERISTIC FUNCTIONS 33

shown that F* (x) is a d.f. By hypothesis tIle C.f.'8 of the nawsub-sequence have, however, for all values of t the same limit!(t) as before, so thatj(t) is the e.f. of both F (x) and F* (x). Butthen it follows from the remarks made in oonnectiol1 withTheorem 9 that we have F (x) =F* (x) for all z. Thus every con­vergent sub-sequence of {F11 (x)} has the same limit F (x). Thisis, however, equivalent to the statelnent that the sequence{F~ (x)} converges to F (:t), and sil1ce we h.ave sllown th&t F (~)

is a d.f., our theorem is proved.

4. Let us 110W oonsider a funotion R {x) whicll is of boundedvariation in (- co, +co), but not necessaril~~ monotone.. Theintegral

(31) r(t)= f~ooe'iCd.R(x)

is then bounded and uniformly continuous for all real t. InChapter VII, ,ve shall require the following theorem.

Theorem 12. Let R (:c) be of bounileil variation. in (-00, +00),and 8'uppose thAt R (x)~O atJ X~± 00, 80 that

(32) 1'(0)=f:"., d.R(x):=O.

Suppose further that the integral

f:1X> Ix 1·1 dR(x) I

is c011.vergem. For 0 < CI) < 1,for all real x and all h > 0 we tken have

J%+1&

(33) :r (y - X)<rJ-l R (y) ily

= _ -.!. fIX> : ~t) e-il3lat fA. uw- 1e-itu au ..2m., -00 t Jo

If, moreotu!,r, th,e integral

(34)

i-9 convergent, we have

f~ Ir(t) J

t ---- idt-co! t i

Page 40: Random Variables and Probability H Cramer (CUP 1962 125s)

34 CHARACTERISTIC FUNCTIONS

(35) R(z)= -~fco r(t) e-4lzdt.2m -co t

We observe t.hat the conditions of the first part of the theoremaresatisfied, inparticular, whenever R (x) is thedifference betweentwo d.f.'s with finite mean values. In this case, ,.(t) is thedifferenoe between the oorresponding 0.f.. '8.

In order to prove the theorem, \ve shall first show that bothmembers of (33) are continuous functions of x and h, when (JJ isfixed. between 0 and 1.. In respeot of the first mamber, this isreadily seen by writing this member in the form

hi»f: yt»-lR(z+ky)dIg.

In respect of the second member, we have already remarkedthat r (t) is bounded for all t, and by the argument used in IV, §2,we have 'r(t)=O(t) as t-.+-O. Moreover, we have for t'#:.O

(36) I fA u/.tJ-Ie-il:tldul=l! [ht u tU- 1e-iJ,d,fJ, II' <~,Jo twJo wftJwwhere G is an absolute constant. It follows that the integralwith respect to t in the second member of (33) is absolutely anduniformly oonvergent for all x and It, and accordingly representsa continuous function.

Without restricting the generality, we may thus assume forthe proof of (33) that x and :t+k are continuity points of R (x).

The second member of (3S) is the limit, as M ~OO, of theexpression

1 JAtI r (t) in,- -. - e-it3:at uw- 1 e-w d'U2,,-1. -,,'V t 0

1 [A 1.1.11 e-d(z+u) fco= - - m ut.U-1au -.-at eif.lI dB (y)1'( .-0 0 tt -co

= _!Jco dR(y) fAU"'-lilUJMsint(Y-~-U) dt.~ -~ Jo 0 t

According to the oonvergence properties of integrals (1, §3), wemay here allow ..r.11 to tend to infinity under the integral. Using

Page 41: Random Variables and Probability H Cramer (CUP 1962 125s)

CHA.RACTERISTIC FUNCTIONS 35

the well-known properties of trigonometric integralst and ob­serving that by assumption we have R ( ±00) =0, we then nndthat the second member of (33) is equal to

IJ:t+ll. AttJ-- (lI-X}OItlR(y)+- B (w+h).

W :.c OJ

An integration by parts now yields (33)..On the other hand, replacing the first member of (33) by the

expression just obtained, we obtain the relation(z+kJ,. (y-x)WdB(y}-h<»R(x+h)

(I) fCf) r(t) iA= --. -e-itJ:dt utu-le-it:udu

211t -r:t:! t 0

= ].foo r(t} e-il:cdt(ktl)e-I.th +it fA 'U/.lJS-itUdtJ,).21Tt -0;) t Jo

If we assume the convergence of (34), this gives (35) as CU~O..Thus Theorem 12 is proved.

Page 42: Random Variables and Probability H Cramer (CUP 1962 125s)

CHAPTER V

ADDITION OF INDEPENDENT VARIABLES.CONVERGENCE ,elK PROBABILITY"

SPECIAL DISTRIBUTIONS

1. If X and Yare mutually independent randoln variables'With given d~f.'s }~ (:t) and FA (y), then by 11:, §4, the d.f$ of thecombined variable (X, Y) is F1 (~:) F2 (y). Thus the pr.f. ~ (S) of(X, }'") is,. accorfling to Theorem 2, uniquely determined by F1and F2 for all two-dimensional Borel sets 6.

The sum X + Y is a one-dimensional vector function of thevariable (X, Y), so that according to II, §6, its d.f. F (z) is uniquelydetermined by ~ (S), i.e. by F1 and F2•

Let C;z denote the set of points (",Y:, Y) such that X + Y ~ z.Then by definition F (z) = ~ (CSz).

Further, let Xn, be a sequence of real numbers steadily in...creasing with n· from -00 to +00 alld suell that xn+1 -xn <h forall n, where h is a given number > o. Denote by ffi1it the illfiniterectangle defined by the inequalities xn < X ~ xn~l' y ~ Z - Xn,

and by t n the rectangle defined by Xn < X ~ Xn+l' y ~ Z - xn+1­

Obviously ~tnC ezcL8l,l' while I; (tRn - t n ) C<5e+k - ez- h , thesums being extended from n = - ~ to I'll = +co.

Since ~ (6) is a pr.f., this gives 'll.b ac~ording to (4)

Z~ (tn)~ ~ (e~) ~ L~ (ffi;n)

&11.11 ~~ (»1n ) - L~ (tn) ~ ~ (6z+ii ) - ~ (<S~_h)C

Th~ former inequality is equivalent to

~P~ (z -Xn~l) (Pl (x4 +1) -Fl (xn )) ~ F (z)

~ :EF2 (z-xn ) (F1 (xn~l)-Fl (xnJ),

while the latter shows that the difference between the limits thusobljained for F (z) does not exceed F (z+k)- F (z-llt). In every

Page 43: Random Variables and Probability H Cramer (CUP 1962 125s)

ADDITION Olt" INDEPENDENT VARIABLES 37

continuity point of F (z), both sums thus tend to F (z) as h -70,al1d by the ordinary definitioIl of a "'Riemaml-Stieltjes~" in-

tegral,lthis limit is equal to J:<X:l~(z-x)dFl(X). According to the

definition of a Lebesgue-Stieltjes integral given abo'\?e (I, § 3, andIII, §2), the last integral exists~ hov/ever, for all values of z alld

is everywllere continuous to the right, so that it always representsF (z) .. Obviously F1 and F2 may be interchanged ,vithout alteringthe value of the integraL

..By 1'heoreln 3, any two functions gl (X) and g2 (Y) are mutuallyindependent:- 80 that we :have b:y' (22) E (glgf/) = E (gl) E (g2).As pointed out in 1\7" § l~ this :holds also if 91 and g2 are complex.Thus in particular

E (e1l(X+l"») = E (eitX ) .. E (eUY ),

so that we have proved the following theorem ..

Theorem 13Cl2 If X and Yare mutually i1uiepe1tdent randomvariable..s with tke d.f,,'8 F1 and F2 <: and tlte c·l.'8/1 and 12' then the8'ltrn.. X + Y has the d.f..

(~7) F (X) =f:<Xl F1 (x-v)dF2(v)= f:", F2 (x-v)dF1(t'),

and the c.f.

(38) j(t) =/1 (t)f2 (t).

When three d.f.'s satisfy (37), we shall sayS that F is CO'mposedof the components F1 and F2, and we shall use the abbreviation

(37a) F=F1 *F2=F2 *F1 *

According to (38) this symbolical multiplication of the d ..f..'scorresponds to a genuine multiplication of the C.f.'8"

If the three variables Xl:> X 2 and Xs are mutually independe:nt,then by Theorem,3 any X r is independent of the sunl of the other

1 Of.. Hobson [llr X, p. 538.t A rigorous proof of this long used theorem, which expresses the fundamental

property of the characteristic funotions, has nOli been given until compa.rativelyrecently. Cf.. Levy [1], Bochner [1, 2], Wintner [1], Ha.vlland [2]..

3 Some 'Writers use the expression: F is the convolution (German: Faltung) orF1 and Fa.

Page 44: Random Variables and Probability H Cramer (CUP 1962 125s)

38 ADDITION OF INDEPENDENT VARIABLES

two variables. A repeated application of Theorem 13 then showsthat the sum Xl +Xs+X3 has the d.f. (PI*F,)*Fa=F; *(Ft*Fa),and the c~f.flf2f3.

Obviously this may be generalized to any number of com·ponents, and it is thus seen that the operation of composition iscommutative and associative. For the sum Xl +Xt +... +Xn ofn mutually independent variables we have the d.f.

(39) F=F1 *.F2*... *Fn

and the c.f.

(40) f=fJa .../n.If at least one of the oomponents Fv is continuous, it follows

from (37) that the composite F is also continuous.1 Similarly, ifat least one ofthe Fv is ab80lutely continuous, this holds also for F.If, on the other hand, all the F" have discontinuities" then F hasalso discontinuities, and the set of discontinuity points of Fconsists of all points x representable in the form

x =:Jf.1) + 2f.t> +.... + :r!-1l),

where :tf.v) is a disoontinuity point of Fv.2. Suppose that the absolute moments of order k are :finite

for all the mutually independent variables XI' Xaf ... , Xn.. Theinequality IXl+... +Xn Ik~nk-l(J X 1 1k + u. +IX n Ik ), whichholds in every point of the space afthe variables Xl' ... , Xn" thenshows that the lcth absolute moment of the sum Xl +.... + X n isalso finite.

Further, ifCl..1), «~), .•• are the moments ofFv' while «I' (ta, ••• arethose of the d.f. F composed aocording to (39), it follows from(25) and (40) that the coefficients oft, fl., .•• , tk in the l)l>lynomials

k IX, n ( k «(v) )1+ ~ ftr and n 1+ ~ --!ftr

"=1 r. v==1 r==l r.are identical. Using a symbolical notation, we may write

(41) «,.= (<«l) +«(1)+ •.• + rx!-ft»,.,1 Hen<'8 follows the truth of the statement made in IV, § 1: if / is the 0.£. of a

continuous w.tribution. then the ume holds for I1II =/.j.

Page 45: Random Variables and Probability H Cramer (CUP 1962 125s)

ADDITION OF INDEPENDENT VARIABLES 39

where after the expansion of the rth power, every (Cf.,(v»P- shouldbe replaced by (J.~). In particular, we have (11 = L«~) andtX2-ctf=:E (~)- (<<y»2), in accordance with the relations alreadyobtained in m, §4.

The semi-invariants introduced in IV, §2, behave in a verysimple way when the corresponding probability distributions arecomposed. Let "r>, ...,'Y~) denote the k first semi-invariants ofFv' which are by hypothesis all finite. Then if 1'1' ••• , l'k are thecorresponding semi-invariants of F, it follows from (26) and(40) that

(42)

3. Let Xl' X 2, .... be a sequence of random variables. We shallsay that X n converges in probabilityl (briefly: "converges i.pr.")to a constant A if, for every e- > 0, the probability of the relationjX.",-A I>E tends to zero as n~oo.. We shall also say that X nconverges i.pr. to a ra.ndom variable X, if:the variable Xn-Xconverges i.pr. to zero.

A proper treatment of the questions connected with this modeof convergence cannot be given without introducing probabilitydistributions in spaces ofan infinite number ofdimell.si.ons. A fewsimple theorems will, however, be given here.

A necessary and sufficient condition that X-n, converges i.pr.to a constant A is obviously that the d.f. of X n -.A. tends, forany x¥: 0, to the particular d.f. E (x) defined by (17). By Theorem11, an equivalent condition is that the corresponding e.f. tendsto 1 for all t,.

If, for a sequence of variables Zl' Z2' "'I'" the mean value E (Zn)and the s.d. D(Z,,> are finite for aU ft., and ifD (Zn)~O as 11,-+00,

itfollows immediatelyfrom the Bienayme-TchebycheffinequaJity(m, §S) that ZfI.-E (Zn,) converges i.pr. to zero. From this

1 Cantelli [1], Slutsky [1], Freehet [1], Kobnogoroff [4-]. A full discussion of thevarious modes of convergence of sequences of random variables is contained in therecently published treatise by Frechet [2].

Page 46: Random Variables and Probability H Cramer (CUP 1962 125s)

40 ADDITION OF INDEPENDENT VARIABLES

remark, we deduce at once the following theorem.

Theorem 14. Let Xl' X 2, .... be independent variable8 eruchthat E (Xll )=mn and D(X1l ) =(1'1" and put

Zn=~(Xl+... +Xn)' Mn=!(ml+···+ 11ln).n n[J crf+ ... +O'~=o (n2), then Zn -Mn converges i.pl. tozero.

We have, in fact, E(Z,J=M,t and D2(Zn):=~2(af+.•. +O;),

sothatbyhypothesisD(Z,n.}~O. In theparticular casewhen all theX n llave the same probability dlstribution~we have Mn =mu = m,say 'I and ai+ ..... +a;' = na2=0 (n2), so that Zn converges i.pr.. to 1?l.

If, for the independent variables X n. considered in Theorem 14,the existence of finite mean values and variances is not assumed,we Dlay still.ask if it is possible to find constants ..,lJ1:n such that

!(X1 + ... +Xn)-M'l=Zn-Mrr converges i.pr. to zero. When allnthe X", hav'e the same distribution, the following theorem holds.

Theorem 15.1 Let Xl' Xs, ... be independent variable.s all

having the sartte d.f. F (x), and put ZlI =~ (Xl+ ... +Xll ) • .A 'lUCe8­n

8ary and 8ufficient condition for the exi8tence of a se1J.ueme ojconstant8 M1, H s, .. _8Uch that Zn -Nn converges i.pr_tozero, i8tMn

f dF (x)=o(l/z)Ixj .... z

as z~ co. This condition being satiafled, we can al1.vay8 take!

Mn = f:nxdF(X}.

1. The condition i8 necessary. Denoting as usual by f(t) thec..f. cOITesponding to F (;2;), tIle c.f. of Zn - M'A is

e-,Vni{J(;)]'l= 1+~ (t).

1 Kolmogoroff [1 Jt and [4-], p. 57. Of. a.lso Khintchine [4J.

J If, inadditlon, tbegeneralued mean vaJue Jl = lim J.~ zilF(x) e:\iats, it follows# ......00 -z

tha.t Zn convergets i.pr. to JI .. If the ordina.ry luean value as defined in III §2 exists"it is easily seen that the condition of the theoreln is ahvaya satisfied.. '

Page 47: Random Variables and Probability H Cramer (CUP 1962 125s)

ADDITIOK vF INDEPENDENT VARIABLES 41

If Z1t -Mn converges i.pr. to zero, then according to theremark made above the corresponding d ..f. tends to E (x) andthus by Theorem 11 An (t) tends to zero, uniformly in every finitet..interval. Taking the n,..th root we have therefore as soon asJAn(t) I< 1

I _M",U (t) I I I J i~(t)1je "!;;:, -1 =1~l+AiI(t)-l,:ai;.l_I~(t)I'

while the left side is boul1ded by 2 for all nand t. Thus

J{~)=eM;ii + ~(n.,t)\11, n '

where 18(n,t) I~ 2n for alilt and t, and tends to zero as n~oo,

uniformly in every finite t-intervaL SinceJ(tjn)-.+l as n~co, itfollows that we have M1l ==o(n}. From Theorem 10 we thenobtain,. putting k = n,

IJ"l+1J, Ifg(43) - F(v)dv-- F(v)dvn e n f-n

If(O (Sint)' _tife( 2M

flit 8(n,2t»)

= - - e n e 1ft +-- dt.w _~ t n

Now) gMnil is the c.f.. of the d.f" E (x - Mn,), where «! (x) 1l;j detlnedby (17). The contribution to the second member of (43) arising

IM.it

from the term e~ is thus by Tlleorem 10 equal to the valueassumed by the first member if F(v) is replaced by E(V-Jfn )"

This value is, however, equal ttl zero if J g-M". i >'1t. For all ,satisfying this condition, we tllUS obtain, since 8 (n, t) tendsuniformly to zero,

~fl+$F (1j)dv-!J~ F (v)dtJ < '1J (n),ft ~ n e-~ n

where 1} (n)~ 0 as n -? 00. On the other hand we h~ve

!f€-t-nJ! (V)dv-.!Je F (V)dv=f,+n(l- Jv-f f)dF (v)

n t n l-n t-fi n~ t {F (e+ in) -F (~-tft,»,

Page 48: Random Variables and Probability H Cramer (CUP 1962 125s)

42 ADDITION OF INDEPENDENT VARIABLES

and thus F ('+!n)-1'(,·-t-) < 27)(n)n

for all, such that Ie-Mn I> ft.. Since M",,=o(n), we may for allsufficiently large it put f = ±in, 80 that we obtain

F(2n)_F(n}<21J (n) a.nd F(....11.)-F(-2n} < 27) (nl.n n

Replacing here successively 1J, by 2"" 2In, ... and adding, weobtain the desired result, as the restriction ofn to integral valuesis obviously not essential.

2. Theconditioni88'UJfici~. Taking.M;.:::f:~~d!(~), wehave

by &, partial integr&tion

JM"nI~J" 1~ldF(:J:)==-nf dF{1J>+J"tkf dF(v)-# 11'1>"'· 0 1t'1>3:

=O(I)+o(f:~~1)=0(10811.),

and in the same way

fa ~dF(z)== -nJf dF(1J)+2fft~!kj dF(v):=o(ft.).-no 11J1>1It 0 tlll>:e

The o.f. of Z", - M. may be written

(") e-M.U~(~)J· =[1+f~O) (e"<Z;M,,) -1)dF(:J:)]".

Now we have by hypothesis

f03 it(:e-M,,) In U(3:-M.) (J ( t)

(e----- -l)d.F(x)= (e-·--l)tlF(:e)+~,-co -ft, 11;

where 8 (ft., t) tends to zero as n~00, uniformly in every finitet-interval. Accord.iI\g to the definition of M,., we may t~U8 write

f00 -it (1:- M',.)

(45) n -4*D(t-·- -l)tlF (:c)

f'"' (U(:A:-M.> II (z M))=" _$ e-·--l- : " tJF(3:) +8(., t).

Page 49: Random Variables and Probability H Cramer (CUP 1962 125s)

ADDITION OF INDEPENDENT VA.1tIABLES 43

The :first i.ierm in the second member of (45) is, however, ofmodulus less than

::f~#(~-M,,)2dl!' (~),

and according to the above inequalities this tends to zero as.n-+a), uniformly in every finite t-intervaI. Since (1 +cx",)"'-)o.1if Mn-+O, it thus follows from (44) and (40) that the c.f. ofZ" - _~ tends to 1 as n~ co, uniformly in every finite ' ...interval.Then by Theorem 11 Zrt - M.", converges i.pre to zero.

4. Let Xl' X 2 , ...... be independent variablest and put

Yfl,=Xl+XI+~.. ·+Xn"

If Fy(z) is the d.f. a,nd!." (t) the c.f. of the variable Xv; the d.!. ofYn. is~*Fa*.... *F,v and the corresponding o.f. is flit ·../n- ByTheorem 11, a necessa1'Y and sufficient condition for the con-vergence of F1 *.Ft * .....Fn to a d.f.. F(~) is the convergence ofthe infinite product QC)

I(t) = TIJv (t),vaal

for an t, where j(t) is oontinuous for t=6.1 If this condition is8&tisfted, it follows from Theorem 11 that the infinite productconverges even uniformly in every finite ' ..interval, and tb.&t/(t)is the c..f. of &. d.f. F (z) such that

F (z)= lim 1; .Fs* .... *F",.,..-+00

n+tI"in every continuity point of F. For any n', the product IT Iv (t)-

",+1

tends uniformly to 1 &S n~ co, and consequently the differenceY.+nt - Yn. converges i.pr.. to zero. It would be natural to concludethat there is a variable Y with the d.f. F (z), such that Y" con­verges i.pr. to Y. Then Y would be the sum of an infinite seriesofrandom variables: Y == Xl+XI+..... In order to give a precise­meaningto a statementofthis oharacterit is necessary to consider

1 .A. IfI,IJi,cient condition for this convergence is the convergence of the two eeries~. (X,) and ED2 (X,,).

Page 50: Random Variables and Probability H Cramer (CUP 1962 125s)

44 ADDI'l'10N OF INDEPENDENT VARIABLES

a probability distribution in the space of the combined '\~a'1abJe

(Xl) X 2, ••• ), whioh has an infinite number of dimensions. ThISfalls, however, outside the scope of the present work.

5. We shall now consider some particlllar examples of prob"ability distributIons. In the mst place, we consider a variableX which can assume only the values 1 and 0, the correspondingprobabilities being p and q=1- p. The d ..f. of this variable is 8,

U step-function" with steps in the points 1 and 0, of the heightp and q respectively, while the oorresponding o.f. is equal topett+q. We have further E(X)=p and D(X)=v'Pi If Xl'X a, .... , X n are independent variables all having the same distri·bl1tion as X, the sum v=X1+X2+ .... +Xn is equal to thenumber of those X, which assume the value 1. The c.f. of thevariable v is (peit+q)n, and v may assume the values 0, 1, .... , n,

the probability of a. given value v being (:)P" qn-v. This distribu­

tion is usually called a binmnial or Bernaulli diBtTibution, and 11

m&y be Concretely interpreted as the number of white ballsobtained in a set of n drawings from an urn, the probability ofdrawing a white ball being each time equal to p. By m, §4, wehave .E (J1) = np and .D (v) = vinpq. According to Theorem 14,the "frequency" vln converges i.pr., to p. This result coincideswith the classical Be'frWUlli'8 theorem as originally proved byBernoulli.

If we allow the quantity p to vary from one X,. to another, theo.f. of the sum v beoomes

B tao

(46) n (Preit+lJr) = II (1 +Pr(e'd-l».1 1

In this ease, we have E(v)=~p". and D(v)=JiPrq." and by1 1n

Theorem 14 the variable (11- ~P1')/n converges i"pr. to zero" If1

QO

the series ~Pr is convergent, it is seen that the c..f. (46) tends to a1

Page 51: Random Variables and Probability H Cramer (CUP 1962 125s)

ADDITION OF INDEPENDENT VA.RIABLES 41

limit as n.-.?co, uniforml~'" for all real t, so that the case con­sIdered at the end of the preceding paragraph presents itself.

Another case of convergence is obtained if, in (46), we allowthe 'Pr to depend on n in suoh a way that, when n~ 00, each 'P,.

ntends to zero, while L Pr tellds to it constant A> O. (We may e.g.

1

take Pr ='Ajn for If =- 1, 2, _.. , n.) Then the c.f. (46) tends to thelimit

(47)

This is the c.f. of a variable which may assume the values

0, 1, 2, ... , the probability of any given v being ~:e-.\. The meanv.

value of this variable is '"\, and the s.d. is VX. The semi-nlvariantsYp. defined by (26) are aU equal to A. This distribution is usuallyoalled a Pois8on distribution. If Xl and XI are independentva.riables both having Poisson distributions with the pa.rametervalues Al and A2, the expression (47) of the c..f. shows that thesum Xl +X 2 has a distribution of the same kind with the para­meter A1 +1\2- If we denote by F (z, A) the d.f. corresponding tothe POiSSOll distribution, we thus have the relation

(4S) F (X, AI) *F (xt Aa) = F (x, i\1 +~).

6. The probebility distribution defined by the d.f. F (x/a),

where a> 0 and

F(x)=!+!arctanx, F'(X)=!'-l1 2'7/' 11' +x

is sometimes called ()auchy's dist14ibution ..1 This distribution has

not a finite mean value. sillce the integr8.11 f~ fIX I~ does not-GO +x

wnverge, althougll the "gerleralized mean value" lim ftJ :x;dF (x)Z~QO -s

does exist and is equal tiC zero. By an easy applicationofCauchy's

1 Cf. Levy (1],. p. 179.

Page 52: Random Variables and Probability H Cramer (CUP 1962 125s)

46 ADDITION OF INDEPENDENT VA.RIABLES

theorem we find the c£.

IfCX> i:tzf(t)=- ~dx=e-llj.'11' -<:ol+~

The c.f. corresponding to the d.f. F (zla,) is then obviouslyf(at)=e-aUl. We thus have

j(a1t)j(a1t)=/(a1+Qs)t),

or F (x/a,.> *F (x/at) == F (xJ(a1 +(1.»)so that the Cauchy distribution reproduces itself at the additionof independent variables. If Xl' XI' ... , X"" are independentvariables all having the Cauchy d.f. F (z/a), the arithmeticmean Zit. ={Xl +... +Xn)/n thus has the same d.f. F (zja).Hence we cannot in this case find constants Mn such that Zn - M"converges i.pr. to zero. It is easily seen that, accordingly, thecondition of Theorem 15 is not satisfied.

As our next example we take a d.f. F (x; «, A) which is equal tozero for x ~ 0, and for x> 0 is defined by

(1.'>. f~(49) F (x; at• .\) == r (A) 0";'-1 e-W 00, ((It> O. A> 0).

This is a distribution of " type III" according to the classificationintroduced by K. Pearson.1 All moments of the distribution arefinite; the mean value is ).;1«., and the s.d. is VA/«. The o.f. is

~ f~ 1f(t; at, A) = r (A) 0 xA-1

e-(O:-'U)$ck= (l-~)·

This shows that we have the expression i'po = (JL - 1)J«-tJ. A forthe semi-invariant 'Yp, of the distribution, and that, for a, fixedvalue of «, the d.f. satisfies vrith respect to the parameter Athesame relation (48) as the Poisson distribution.

7. In many applications, it is required to find the distributionof the quotient of two random variables. In certain cases, the

1. Ct. e.g. Elderton [1]..

Page 53: Random Variables and Probability H Cramer (CUP 1962 125s)

_4..DDITION OF INDEPENDENT VARIABLES 47

follo\ring theorem enables us to ex!>ress this distribution in termsof the c.f.'s of tIle t\\"O vBJriables.

Theorem 16. Let Xl and "'¥'2 be independent variablelJ withfinite m,ean values, the correspondirl1} d.!. '8 being F1 (x) and F2 (x),u'ith the c./.'811 (t) and Is (t). If F 2 (O)=O, and ijthe integral

f:o li2 :t) jdt

con~'erge8, then the d.!... G (x) of tll.e quotient X I/X2 i8 given by tke

relation G(x) ==~f co /2 (t) -/1 (t)fl ( - tx) dt.21110 -GO t

If the integral obtained by formal differentiation of thill relationwith reS1Ject to x i8 uniforrnly conv8'rgent in a certain interval, wetlbUS have in tki8 interval tor the frequency function G' (x)

0' (x) =-21 -Itt) /1 (t)f~ (-tx)dt.m _'0';)

By definition G (x) is equal to the probability of the relationX1/X2 ~ x. Since F2 (0) == 0, we need only consider positive valuesof X 2, so that the last inequality is equivalent to Xl-XXI ~ 0,and if H (g) denotes the d.f. of the variable X I -XX2 , we thusobtain G(x)=H (O)=H (O)-F2 (O). By hypotllesis the differenceH (,) - F2 (,) satisfies the conditions ofthe Remark to Theorem 12)and so we obtain, since the c.f. of H (,) isll (t)fs ( - tx),

1 Jco e-'if~H(g)-F2 (')=-2· -t-(!2(l)-fl(t)!a(-t:e)dt.

7T1, _ QO

Putting here g= 0, ",·e obtain the theorem.We shall give two examples for the 8Jpplicatioll of this theorem.

In the first l>lace, ,ve consider two variables Xl and X a, bothdistributed according to (49) witll the parameters IXl' A1 and0:2, A2 respectively. In this case the theorem gives

G(X)=_1 JO::> ( 1 1 ~ )~~21T1: (it )AJ

( it )/\ ( itx)A~ t"1--- 1-- 1+--- C(J (X2 0:1 IX2

Page 54: Random Variables and Probability H Cramer (CUP 1962 125s)

48 ADDITION OF INDEPENDENT VARIABLES

If;\2 is an integer, the integral may be calculated, and we find byan easy application of Cauchy's theorem. G (x) =0 for x ~ 0 and

for x> o. In the particular case "2 =1, the last expression reduces

to G (x) =( (t,1 X )A1•

(X,lx+cxaFor our second example, we shall anticipate the discussion of

the normal distribution that will be given in the followingChapter. We shall consider a quotient of the form X1!v'X,.,where Xl is normally distributed with the mean value 0 and thes.d. cr, while XI is distributed according to (49). We then have(cf. (51))

andIX>' f«) 2«A fCi:)It (t) =r (,\) 0 xA-1 e-Gt.l:+itv':lIax == r (A) 0 v2A-1 e-<Xr'+itlJdv,

2ia:.A fco1" (t) :=-- vIA e- lXv2+Uv dvJ2 r (A) 0 •

In this case we may apply the last forlnula of Theorem 16, and80 obtain for the frequency function G' (~) of the variableX1!1/XS

G' (z) =~fco e-lutl:ae] QOvll e-o:.",1.-U11;z dv.,.,r (A) -<X) 0

==~fcovIA e-(¥~dvf ~ e-laIP-Uv;c titnr(A) 0 -co

= 2tr' f 0()vi"e-(0<+:;.)",tWO'V271 r (1) 0

= 1 P("+!){l+~)-v\-!.V 21TtX02 r (I\.) \ 2cxu2

This is a distribution of type VII according to the classification

Page 55: Random Variables and Probability H Cramer (CUP 1962 125s)

ADDITION OF INDEPENDENT VARIABLES 49

ofK. Pearson.. In the particular case wIlen 2tXu2 =n, A=nJ2, weobtain a distributioll defined by

r(n+ 1) _,.+1

G' (x)= v~; ·r(i) (1+~) 2

"rhich is kno\vn under the nanle of "'Student'8" distribution.,l

1 '" Student ,) [1]~ Cf. also e.g. Rider [1].

Page 56: Random Variables and Probability H Cramer (CUP 1962 125s)

CHAPTER VI

l'HE NORMAL DISTRIBUTION ANDTHE CENTRAL LIMIT THEOREl\I

1. rfhe nQrllw,l distribution Junctionl <I> (x) is defined by the

relation 1 J':r -~<I> (a-) = -=_ e 2dt..

V21T - ao

The corresponding nortnal frequency jU1wtion isa..~

m/( )__1_ -2\V x - ... ;-_€ ..

"V 21T

The mean value ofthis distribution is 0, and the s..d. is 1, as shownby the relations

(50) J:tO xd$(x)=O, J:tO x2dct> (x) =1.

The lnoments of odd order 0:.2v+1 all vanish, while

Ct2v = f (t) x2v d<I> (x) = 1 .. 3 .. u.. .. (2II - 1)"-co

The c..f .. is, by a well-known integral formula,

(51) f<:O eUxd<I> (x) = 1 ftO eitJ:-~dx=e -j.- co v'21T - co

Hence "\"'6 obtaill, for v = 1, 2, ...... , by partial illtegration

(52) f:tO e,txdct>(") (x) = (-it)" e-~,

and by differentiation1 ftO itz-~

~(V)(X)=9 (it)V-1e 'dt"..lTf _ 00

A random v"ariable X is said to be normally dist1"ibuted, if its

1 The normal distribution was discussed already m 1733 by De :Moivre in thesecond supplement to his J.lli~9Cellanea Analytwa.. Cf. K. Pea.rson [IJ. It was afu.r.... ards treated by Gauss and Laplace, and is often referred to as the Gauss orGausb-Lrtplace distribution..

Page 57: Random Variables and Probability H Cramer (CUP 1962 125s)

NORMAL DISTRIBUTION 61

'x-m)d.f. is 4> (-0- , where (J~ 0 and m are constants. (The case

q=O is, of course,. a degeBerated limiting case which might be

caJled an improper normal distribution. cJ)e~m) should always

be interpreted as £(x-m), where E(X) is defined by (17).) The

normtJ,lized varia.ble X - m has then the d.f. cJ) (~), and we obtain(J

from (50) E (X) =m, D(X)=a,

while (51) shows (cf. also IV, § 1) that the c.f. of the varia.bleX isE (eUX ) =etn1.t-iaJlt.

The semi-invariants of X, as defined in IV, §2, are

"1 =m, i'1=a2, 'Ys="4.= •.. =0.

2. We now proceed to prove a number of theorems whichshow that the normal distribution plays a fundamental part in& great nl1mber· of questions connected with the a.ddition ofmutually independent random variables.

Let Xl and X J be independent and normally distributedvariables t the parameter values being ml' at and m., as respec­tively. Then the sum Xl+X2 has the composed d.f. (cf. v, § 1)

~e~~)*~e~~2),while the corresponding c.f. is

eml it-iaitt •e1ntU-io;" = e(ml+mt>U-l (oi+oi>tI.

This is, however, obviously the c.f.. of a normal distribution, andso we have the following theorem.

Theorem 17.1 The BUm oj two inilepe:ndem and fWrf1U.I8,y

distributed tJariaJJle8 is itBelfnormally di8trib1detl ~. fJllU8

~ (~~~)*~ (~~~2) =cJ) (~~m),

where m = m1 +ml' at = af+ai-l ThiJ! theorem is som.etimes attributed to d'Ocagne, but it seems to b.a.ve been

known aJready to Poisson and Cauchy, and possibly &Iso to Gauss.

Page 58: Random Variables and Probability H Cramer (CUP 1962 125s)

52 NORMAL DISTRIBUTION

Obviously this theorem is immediately generalized to thecomposition of any finite number of normal distributions.

We shall now prove three theorems which attach themselvesin a natural way to Theorem 17 and reveal fllrther remarkableproperties of the normal distribution..

According to Theorem 17, the d.f.'s of the type c;I) f~: tn) fol'Ill

a closedjamity (the "normal family") with respect to the opera..tiOll of compoBition~ Now, any d~f. with a finite mean 'lalue m

and a finite s.d. a may be written in the form F (~) , whertla I

F (x) is a d..f. with the mean value 0 and the s.d. 1. For any given

F(x) with these properties, all functioIlB F (x:m) may be con·

sidered a8 a family generated by F (x). Our next three theoremsthen assert (1) that no F (2:) different from ~ (zo) generates in thisway a closed family; (2) that the composition of any two d.f.'swhich do not both belong to the normal family never produces amember of tha.t family; and (3) that every d.f. with a finite s.d.gives, by n-fold composition with itself, a d.f. which for all suffi·ciently large fa, comes (uniformly for all real x) as clo~as we pleaseto a member of the normal family. We shall first give the formalstatements of the three theorems and then proceed to theproofs.

Theorem 18.1 Let F (x) be a d~f. with the mean value 0 and thes.d. 1. If, to any con8ta1ll8 ?nt, tnt (real) ani/, at, 0'1 (poaitiV6), 106

can find, m aM (1 suck that

(53) F(~:~)*F(X~~2)=Fe~tn),

then F (x) =$ (x).

1 P61ya [1). The example of Caurhy's distribution {v1 § 6} shows that, in thiatheo1"etu, it is essentia.l tha.t we consider only d.l. 's with finite dispersions. Furtherexamples of non-normal d.f.'s satisfying (53) have- been discussed by polya. andLevy (1].

Page 59: Random Variables and Probability H Cramer (CUP 1962 125s)

NORMAL DISTRIBUTION tiS

Theorem 19.1 If the BUm of two independent raruiA»n variablesia 'fl,()'ff1W,lly distributed, then each variable i8 itself 1Wrmally dis­tributed. Thus if~ (z) and F2 (x) are d.J.:'8 8uck that

(54) Fl(X)*F9(X)=W(x~m),

then Fl(X)=(,i)(X:~l), F2(X)=ep(X:~2), where m1 +m2=m,

01+01=0'1.Before stating the third theorem, some preliminary remarku

are necessary. Denoting the composition F *F * .... *F of 11,

equal components by Fn$, we obtain from Theorem 17

(cD (~))n* =(J) (x-m..!!.)er \ ayn " '

and in particular for 1ft = 0, (J = l/,\/n,

(55) (<I> (xyn})n* =fIl (x).

The last relation expresses that if Xl' ..... , X. are independentvariables, all with the same d.f.. ep (x), then the variable

(Xl + ..... +Xn)!v'nhas the d.f. tb (x).

Theorem 20.2 Let F (x) be a d.f. u'ith the mean value 0 arui the8.d. 1. If Xl' XI) .... are independent 1JariaCl~ all having the d.,l..F (:t») then the a.f. of the variable (Xl + ... +X'nJ/vn tends to CI> (x)aa n-;.co~ uniformly for all real z. Th'U8

(55a) {F (xvn»n* -+4> (x)

uniformly in x. Hence it follows also that

(56) ( (X- m))n* (X - mn)F -- -fIl -- ~Oa ay'n

uniformly in x, for all fixed m and a.

1 Cramer [5J. The theorem had been conjeotured by Levy [2J, [3J. It will beobserved tha.t in thi'i theorem it is not assumed that the moments of a.ny order arefinite..

S! Lindeberg [1], Levy (1], p. 233.

Page 60: Random Variables and Probability H Cramer (CUP 1962 125s)

54 NORMAL DISTRIBUTION

Theorem 20 is a particular case of the famous "Central Linl1tTheorem" in the theory of probability, which will be more fullytreated in the following paragraph. We shall now first proveTheorem 20) which will then be used for the proofof Theorem 18,Finally, we shall prove Theorenl 19.

Proof oj Tn.ehfem 20. If f(t) is the c.f. of a d.f. F (x) with<Xl=O and «2=1, it follows from formula (25) of IV, §2) thatj(t)= I-lt2 +o(t2) for srnall values of f t I. Thus we have uni­formly in every finite t-interval

I(_t)=l-~+O(.!.)vn 2ft, n

as n~OO4 The e.f. of the variable (Xl + .. ,+Xit)/yn is...

As n ..00, this tends uniformly in every finite t-interval to thet'

limit e-', which is the c.f. ofW(z). ThuB by Theorem 11 the d,f.of (Xl+... +Xn)/V'" tends to ep (~). The uniformity of the con­vergence follows easily from the fact that ~ (~) is continuous..Thus (554) is proved, and (56) follows immediately from the

remark tha.t (F e~m)f* is the d.t. ofthel&riable

... / Xt+···+Xn.mn+uv n v'n ..

Proof of Theorem 18. Both members of the relation (53) ared.f. '8) and the first order moments are m1+ms and m respectivelyt

while the variances are at+crI and a2) so that we obtainm=ml+fnt, aI=at+crI· Putting ""l=fnt= ... =0, we obtain byiteration ..

Page 61: Random Variables and Probability H Cramer (CUP 1962 125s)

56NORMAL DISTRIBUTION

and thus in particular

(F (xy'n)"'* =F (x).

From (55a) it then follows that F (x) =~ (x) for all x.

Proof of Theorem 19. Let Xl and X 2 be liWO independentvariables vvith the d f.'s 11and F t , and the c..f.'s 11 and !'Z) and

suppose tha.t Xl -l- X 2 ha.s the d.f. cI> e:m). Since the qua.dra.nt

Xl~X) X2~Y is a sub-set of the half-plane Xl+X2~Z+Y.. wehave for all values of x and y

.F;. (x)F2 (y) ~cI> e~+~-m).

Here we choose for 11 any fixed value such that F I (g) > 0, and usethe mequality

1 -~4> (x) < vi e 2.

217' t x Iwhich holds for all x < 0 and is easily proved by partial integra...tion" It then follows that we can determine .A and B independentof x, such that for all x < 0

ttfAF

1(x) < Ae-2ut+BI :r

,•

Similarly we can determine A' and B' such that for all ~>O

-~+B'%1-.F;. (x) < A'e Itr •

From the two last inequalities it follows that the integral

(57) J = f~ec et:·dJ;, (x)

is convergent.. If, now, we Qonsider the c.f~

A(t)- f~"" eiLrdJ;, (x)

for complex 1Jalues oj the variable t, it follows from the c.onvergenceof (57) that the integral which represents ,.ll (t) is absolutely anduniformly convergellt in every finite domain in the t-plane. Thus

Page 62: Random Variables and Probability H Cramer (CUP 1962 125s)

56 NOlt.MAL DISTRIBUTION

11 (t) is an in:tegral funrJio1ll of the complex variable tv For themodulus of this function we obtain by lneans of the elementaryinequality x2

J tx I~ 0'2 f t f2 +4(12'

I11 (t) I~S:""e0'1 till- ia.d.F" (x) = J ei7lltlt,

so that the order1 of tke integral junction /1 (t) does not exceed 2.In the same way it is proved thatfa (t) is an integral function oft,the order of which does not exceed 2. ACCOlding to (54) we have,

however, 11 (t)/a (t) =emit-laIl2,

which shows that 11 and 12 are integral functiona w~th{,"1(.,t zefOS.

By the classical factorization theorem2 of Hadamard it thenfollows thsJt

(58) 11 (t) = e!11(t), 12 (t) =eqa(t),

where 21 (t) and q" (t) are polynomials ofdegree not greate.r than 2.The convergence of (57) implies that all moments and semi...

invarianta of Xl are finite. Denoting the mean value by m1 andthe s.d. by at, we then obtain from (58) according to IV, § 2,

IV, §2, 11 (t) =emlit-ia1,tt,

and similarly

This is, however, equivalent to

(x-m1) (x-m)Pl (x) =$ --u;- , PI (x) =$~ ·

Then obviously m1+n~2=m and af+oi=a2, and the theorem isproved.3

1 (,1'. e.g. Tltchma.rsh (1), p .. 248. • Cf. e.g. Titehmarah [IJ, p. 250.3 0'1 or 0t may be equal to zero If, e.g., at =0" we have by § 1 to interpretIX - ,n,,)

tP (- as ((x -tn.1), and so obtatn the trlvialsolution of (54); F1 (*)=c- (~-nl-t)., °1

Fa (=)=\1) (~ -n:,+~).

Page 63: Random Variables and Probability H Cramer (CUP 1962 125s)

NORMAL DISTRIBUTION ~37

3. The Central Limit Theorem1 in the theory of probabilityasserts that, under certain general conditions, the sum oj a largenumber of independent variables i8 approximately norrnally di8tri­buled. In Theorem 20, we have already met with a particularcase of the general theorem, viz" the composition of n equalcomponents with a, finite s..d. We shall now consider the casewhen the components are not necessarily equal. Throughout tkiaparagraph and tM immediately foUowing one, we shall supposethat every component 1148 a finite 8.d. and a mean, valtU eq'Ual tozero. The assumption that the mean value is zero may obviouslybe made without 108s of generality, since it is equivalent to thesimple addition of a constant to each variable.

We thus consider a sequence of independent random variablesXl' Xi:; ... , such that Xv has the mean value 0 and the s.d. O'v"

The d.f. of Xv will be denoted by 1:, (x) and ~he o..f. byIv (t).Ifthed.f. ofthe sum Xl+ ... +Xn is denoted by~ (x), we have

(59) ~(X)=Fl(Z)*F!(x)* .•. *Fn(x),

and F", (x) has the mean value zero and the variance 8: given by

(60) 8;=a}+oi+ ... +0';·The variable (Xl + ..... +X nJ/8n then has the <i.f..

(61) iYn (x) == F'n, (8n X)

with the mean value 0 and the s.d. 1. It is possible to 81ww thalunder fairly general oonditions iJn (z) teM8 to the normal ill.<lJ (x) a8 n tends to infinity. The Inost important case is that inwhich the following two conditions are satisfied:

(62)

1 This theorem was first stated by L&pl&ce, a.nd was further trea.ted by severalm&thematicians during the nineteenth century, notably Tchebychetf a.nd Markoff..A complete and rigorous proof under f&irly general condItions was first given in 1901by Liapounoff [1], [2]. Of. Vlt § 4, and va. § 4. A eomprehensive a.ccount of themodern development of the subject is given by Khintehine [2].. The central positionwhich the Limit Theorem occupies in the Theory of Prob&bility is well brought outin this beautiful treatise.

Page 64: Random Variables and Probability H Cramer (CUP 1962 125s)

6S NORMAL DISTRIBUTION

(64)

(66)

~

This means that the total sed. of ~ Xv tends to infinity) while1

eaoh component contributes only a small fraction of the totals.d.! In this case, we have the following theorem.

"fheorem 21.2 Let Xl' XI' .... be OJ sequenoe of independentra'Nlom variablu UJi,th vQ/nl£8kirtg mean val"1U8 and :finite B.d.'a8ati8fying (62), and denote by ijn (x) the a.J. of the mriable(Xl +... +X,,)/Sn as ilefi'PAYi by (59) Olnd (61).. Then a nect!Jl8ary and.sufficient corttditiOfl, for the 'Validity oj the relation

(68) lim iYlt (z) =<1> (x)110--). co

for all ~ is that, fot any given E > 0,

Inslim "2 L x2dFv{~)=O ..n.--+- co8on 1 tz t>~'JI

It is readily seen tha,t Theorem 20 is contained as a particularcase in Theorem 21. The condition (64) is known as the Lindebergcondition. It is here given in 8- slightly simpler form than thatoriginally given by Lindeberg.

In order to prove the theorem, we denote by fn. (t) the c.f.which corresponds to the d.f. ~n (t), and then obtain from (59)and (61)

(65) f,.. (t) ==/1 (t/8n,)Ja ('IBn) ••• f", (liB",).Now, for any integer k > 0 and for any real a we have

~ k-l (ia)V a"e'"= L - +.&-,o vI leI

using .& as a general notation for a real or complex quantity ofmodulus not exceeding unity. We shall first prove that the oon­dition is 8ujJicie:nt, &Ild thus ,asaume that (64) is aatisfied for anygiven€>O. Taking in (66) 1c= 2 for I~ J > E8",andk= 3 for I:l: f :iE'_,

1 Excepting the trivial cue when 8" =0 for all n,. it ia eaaily Ieen that (Ii) :isequivalent to 1'" C-.a:) -+ c (.) for every fixed ~ '&0, uniformly for J' = It t. ..... 1&, asn ...... GOt where • (z) is defined by (17).

: Lindeberg [1], Levy [1], Feller [1]. It can be shown without difJioult1 tha,t •condition (64) implies (62). TIlus as a. 'UJfdewl condition (84) is independent of (a).

Page 65: Random Variables and Probability H Cramer (CUP 1962 125s)

59NORMAL DISTBIBUTION

we obtain for I t I< T, where T> 1,

f" (t/8,,) =f: tI) e'::tlF"

-1- tSsI 3:2tJ,F1/+& PSI Izi'dF"2Bn 1:z:1~t"'" ~ Izl:it-.

+& T2f zldF..28; I~I>"." .",.

a~ 2 &P3( --~ f 9..1F.)=-1- " ...~ t +~ Eo;+ x UJ, 11 ,~ ~ l~f>~.ft

bearing in mind that the mean value of X., is equal to zero. From(62) weobtaineasilyuv/an -+-0 as n~C(),uniformlyfor v== 1,2, ... , n.Thusfv(t/8nJ-+-l, uniformly for Jtl<P and v=1,2, ... ,n. Itfollows that we have

log/v (tIs",) =(1 +17) (Iv (t/8'J~)-1),

where I '1J I< € for aJl sufficiently large n. As we may obviouslysuppose 0 < € < i, we thus obtain

log!" (t/sn ) = - 2~ t!+ 2&'[3 (E(1~+f :C1d,F.).8n. 8n Ix 1>..,-

Summing over v= 1, 2, .... , n, we obtain according to (65) for0< t" < t and It I< T

log ff& (t) =-~22+2&P'(E+ -~I:f x2dFv).~ 1 l:tI>E'S"

€ being arbitrary, it then follows from (64) that we have as n-...oon t2

(67) log fn (t) = f logj" (tI8,,) -... -"2

uniformly for ttl < T, and by Theorem 11 this is equivalent to(63). Thus the condition (64) is sufficient.

In order to prove that (64) is also nece.88ary, we assume that(63) and thus also (67) is satisfied.. From (62) we obtain as above(Jp!8.~o. Using (66) with k =2, we have

Gat!J~ (tI8?!) =1-&~2 ,

11,

Page 66: Random Variables and Probability H Cramer (CUP 1962 125s)

66 NORMAL DISTRIBUTION

so thatJv(t/8n )'-+ 1, uniformly in the same sense as above, whilen:I: 1Iv (t/snJ - 11 is bounded for every fixed t. Since (z - 1)/logt,~ 11

as z~ 1, it then follows from (67) thatn t2

(68) ~(/v(tI8n)-1)~-21

for every real t. According to the Bienayme..Tchebycheff in­equality (nI, §3) we have,. however, paying regard to (60),

iif dFJI(x)~~,1 , x j :> eSn E'

and so obtain from (68), taking the real part,

(69) lim supl~22 - iif (1- cos tx)dFyl ~;.,,-. co 1 Ia: 1~f'8,. 8n. E

On the other hand, we have

iif (l-costX)dJi:,~ tile iif x2dFJI~~'1 Ixl~~B1t 8n 28ft 1 i X'1~t"B,a 2

Introducing this in (69), we obtain

t2• Inf 2

O~-2lim sup! I: x!dFv~I"n.~ co 8n 1 fzl>E"8n If!

Since t may be taken arbitrarily large, it follows that the con·dition (64) must be satisfied, and thus Theorem 21 is proved.

If the conditions (62) are not satisfied, one of the following twocases must occur:

(A) lim 8n =8 exists; or,,-.,.0)

(B) 8n -+-00, GnJsn> tX> 0 for an infinity of values of fl...

IntheC&~ (A), itfollows from (61 )that the relation tin (z)~<Il (x)is equivalent to ~t (x)~4> (x/s). Thus by (59) the infinite com­position F1 (x)*F2 (x)*.u converges (v, §4) to tJ>(x/s}. PuttingG (x) =F2 (x)*Fs{x)* ... , we then have F1 (x) *G (x) ='1> (x/a).From Theorem 19 we then obtain Ft (x) =<1> (x/a!), and itfollows thatf in case (A) the neces8ary and aujficient condition for~/. {x)-+<I> (x) ia that each variable Xv is normally distributed.

Page 67: Random Variables and Probability H Cramer (CUP 1962 125s)

NORMAL DISTRIBUTION 61

In case (B), on tIle other hand.. for values of 11, such thatfln,/S,., > tX > 0,. the s.d. of Xn is not small compared to the tot.a}

ns.d.. of ~ Xv. It is then easily understood that (6S) O&llDOt be

1

satisfied unless the d.f.. 's of these "large" X:t, tend to the nonn&}type. We shall, however, not enter upon a detailed discussion ofthis case.

4. A 8ufficie1tt condition for the validitjf of (63), which is oftenuseful, has been giv~n by LiaIlounotl'.l Let Pky= FJ (f Xv tk )

denote the absolute moment oforder k of the d.f.. Fv (x), SO that inpartioular P2v=(T~. Suppose that for some k> 2 (not necessarilyintegral) f3kv is furite for all v and is such that

and thus the Lindeberg condition (64) is satisfied, so that byTh~orem 21 (cf. also p. 57, footnote 2) we have ~n (z)~$(z).

If) in particular, there are two positive constants M and m,such that for all v we have fJltv < M and Ps" =:~> m, it is obviot18that the Liapounoff oondition (70) is satisfied, and thus fJ.", (~)tends to ~ (a:).

5. We shall now apply the results of t.he two preceding para­graphs to some particular examples"

As a first example, we take the variables Xr ==~ - Pr!'t where~ has the simple di~continuousdist,ributio:n. oonsidered in v,. §5:1';.= 1 with the probability 'Pr' and l~= 0 with the probability

1 Li&pounoff (2). By meaDS of the condition (iO), Liaponnoff obtained a.n upperlimit for the modulus of the difference ~n (x) - <b !z). This result will be proved inthe foBowing OMpter (cf. Theorem 24).

Page 68: Random Variables and Probability H Cramer (CUP 1962 125s)

NORMAL DISTIUBUTION

f,_I-p,.. We then ha.ve .E(X,.)=O~ Da(X,,)=~==.Pt'f,. and~a.--.E (j X,.I8)=p,q,. (p~+~) ~Prq,., 80 that

~ ~ PSr~ (:E.P7Qr)-t.lY1L 1 1

Putting, as in V, §5, v = Yi +... +Y1P so that v represents thenumber'ofthoea ~ which assume the value 1, we have

nv-!:pu.. _X1 +,··+Xn _ 1 r

,.- 8,. - (fP,q.)*'If the series ~p,.q,. is divergent, the Liapounofi condition (70) issatisfied, and thus the d.f. of the variable Un tends to ~ (%) asn-+C(). If, on the other hand, 1:,prQ,. is convergent, it fonows fromthe a.bove discussion (case (A), p. 59) that the d.f. of U. does nottend to (,() (x), since the variables X,. are not normally distributed.

In the particular case when allPI' areequal to j), where 0 < 11 < 1,the series ~Prqr is obviously divergent, 80 tliat the d.f. of thevariable Un=(v-np)/Vnpq tends to ~(x). It follows that forany fixed A1 and ~ the probability of the relation

Al < (v-np)!vnpq < 'At

tends to the limit .} f"* e-~dt. This is the extended form ofv 217' .\,

Bemoulli's theorem proved by De Moine and Laplace.As a second example we consider the variables X,. with the

distribution

,- ret with the probability 2:StJ

'

X r == 01

" 1--rlil'

rill 1u 2r2«'

Obviously E (X,,) =0 and DS (Xr)=a:= 1, so that

~=of+... +~=1t.

Page 69: Random Variables and Probability H Cramer (CUP 1962 125s)

NORl\IAL DISTRIBUTION 63

Thus (62) is satisfied<t and by Theorem 21 a necessary and suffi...cient condition for ty,,, (x) -+w (x) is

lim.!:. L 1=0.on-+- co n l~v~n.

VO&>€vn

It is readily seen that this condition is satisfied for C( < i, but notfor ~ ~ i. For at> I, it is indeed obvious that the distributioncannot tend to the normal type, as in this case we have a prob..

IX>

ability greater than n (1 - r-2Cl) > 0 tllat any sum Xa+... +X.,.,2

aSSUlnes the value zero. The Liapounoff condition (70) is satisfiedfor (1.. <!, but not for tX ~ 1.

6. If, for the illde})elldellt variables X n considered in Theorenl21, the existence of finite mean values and variances is notassumed, we may still ask if it is Ilossible to find constants atl. andbn, such that the d.f. of (Xl + .... +XnJ/a1l. -bn tends to ~ (x) asn 400. The same question may, of course, be asked in a casewhen finite m~an values and variances do exist, but the Linde­berg condition (64) is not satisfied. We shall not enter Upoll adetailed discussion of the problems belonging to this order ofideas, but shall content ourselves with proving the following twotheorems.

Theorem 22.1 Let Xl' Xs, .... be a sequence of independentrandom variables, and denote by Fv(z) the d.f.. of Xv. If, for asequence aI' aI' ... of p08itive 'number8, tke conditions

(71) lim :E I dFv (x) = 0,n-+-co v==l Izl>ean.

(72) lim 12 i I x 2dFv (x) = 1,21--+ CD an v===l 1a:1::iE'a.

(73) lim ~ I; II ~dFv(X)I=o,tt.~(lOan v==l fxl:i£~

1 Feller [11" It is there further 8hown that (7lH73) are neceasary for the con...vergence to tIJ (;l:) of the d.f. of any variable (B"X1 + ... +8n Xn)fan, where 3, == ±l.

Page 70: Random Variables and Probability H Cramer (CUP 1962 125s)

64 NORMAL DISTRIBUTION

4re sati8fied for every E > 0, then the a.f. of the variable

(Xl +... +Xn)!antends to <1> (x) as n -+00.

If, in the particular case when every Xv has a, finite s.d. anda mean value equal to zero, we take an =8n, as in Theorem 21,it is easily found that the conditions (71)-(73) reduce to theLindeberg condition (64).

In order to prove the theorem, we denote by f, (t) the c.f. ofXv. According to (71)we may, to any gi'\""en £ > 0, chooseno=no (E)such that. for all n > no

tJ dF.,,<Ea1 Izl>"CI,t

Page 71: Random Variables and Probability H Cramer (CUP 1962 125s)

NOBl\lfAL DISTRIBUTION

for all sufficiently large n. Thus we have, as n~ 00,

Iv (t/afl')-l ~O,n t2

I: (Iv (tlan ) - 1) -7 - -2'v-=l

ft.

and lini Slip ~ (f., (tla,,) -1 J~ ti ,p-l

uniformly for v= 1,2, ... ,ft... It follows that for every tn t2~ log/v (tjan )~ - "9 '

va;::;} "'"

n _!.111or nf,,(t/an.)-+e 2,

p=l

The first member of the last relation is, however, the c.f. of thevariable (Xl +... +Xn)/a"" and thus by Theorem 11 the theoremis proved.

We shall no\v consider the case when all the variables Xv havethe same probability distribution.

Theorem 23.1 Let Xl' XI' ... be a 8equence oj independenttJariables a·11 having the 8ame d.l. F (x) .. If 'lee ~ve

(74) f dF(x)=o(~f X 2 dF(x))J:rl>1: z I~J~z

tU %400, then.-

(1) The ab80lutemoment f;Jr= f~.) ;:cl"etF(:eJi" fittitefor O;;i r < 2,

J~ J

80 that in particular a finite mean val?:e m. = _ f$JQ;tl:F (x) exists.

(II) It is p08sible to find a 8equ.ence "1' tZ.z, .... ojpositive numbers.taM that the d.J. oj the variable

('15) U. X 1 -+- .•• +Xn -ninn an.

te?uU to fb ($) fJ8 n~ tX>.

1 FeDer [lj. [2], Kbintchine [3], Levy (3]. It is shown by these authors tha~ (?4)malso & ?1Cfe88M'y condition for the existence of" two sequences {aft} and {on} sUbhthAt the d.t. of (Xl + ..... +X•.>/o,ft - b.,. teilds toO ~ (X). On the other h.a.nd. ('4) is nota necessary condition for the eort"texpnoe of p,. for 0 ~r<2.

Page 72: Random Variables and Probability H Cramer (CUP 1962 125s)

66 NORMA.L DISTRIBUTION

For the proof of this theorem, we may obviously assume that/32 is not finite, as otherwise (1) is trivial and (II) is an immediatecorollary of Theorem 20.

'Ve shall first prove that Pr is finite for 0 ~ If < 2. The fun<.'tion

t/J(Z)=f x2dF (x) =-z'I dF(Z)+2j"vdvi dF(;t)Iz I-'~ f~l>e 0 Ia: 1>1.1

is never decreasing for z> 0 and tends to infinity with z. By (74)we have (z-?co)

,p (z) ~ 2J·~VdvJ dF (x) == 0 (III t/J (v) dV) .o l:e!>'t' 1 V

E> 0 being given, we denote by M (z) the upper bound of v-eljJ(v)in the intervall ~ v ~ z. and then obtain

J: ~;f) dv ~M (z) J:1)~-l dv <z~~ (z) •

Thus we have z-~+ (z) == 0 (M (z), which shows that'" (z) =0 (Zf)for every E > o. It follows that, for any fixed r such that 0 ~ r< 2and for all suffioiently large z,

J I~ I' tJ,F (x) < 7/-tJ/J (2z) < %,,/2-1,.<I:z:J~2. '"

alld this obviously implies that fJ, is finite, Hence in particularthe mean value m is .finite.

We now proceed to prove the assertion (II). As by hypothesis/32 is not finite, the first member of (74) is positive for all z> 0)and the function

(76)

Z (tt) == lower bound of all z> 0 such thatI dF (x) ~ 'U,la:I>.

is a positive and never increasing function ofu" uniquely definedfor 0 < u < 1 and tending to infinity as u tends to zero. Further)according to (74) we can find a, steadily deoreasing function

Page 73: Random Variables and Probability H Cramer (CUP 1962 125s)

67NORMA.L DISTRIBUTION

1] (Z), tending to zero as Z-+co, such that for all z> 0

(77) f tiP (x) < 'Y/ ~)f :.r;tdF (x).1a:1>* z 1:l:1~.e

Let {An} denote 8, decreasing sequence of numbers sucb that0<"1&<1 and

(78)

We put

;\,.,,40,

(79) z'" =Z (AnIn) , a~=nf x2dF (x),l:t f~Zn

and are now going to show that, with this definition of an' thed.f.. of the variable Un defined by (75) tends to ~ (x). PuttingX.,=X.,,-m, we have U,.,,=(X1 + .... +XnJ/an, the d.f. of each Xvbeing F (x+m). We now apply Theorem 22 to the sequenceXl' .It, ... , and then only have to show that the conditions(71)-(73) are satisfied if we put Fv(x)=F(z+m) and define anaccording to (79).

By means of (76) and (79) we obtain

(80) f tiP (x) ~ ~IJ,f tiP (x) > An ,t~I>... n 12:1>1.. fl,

and further according to (77)-(79)

(J~~.!!.J zldF (x) > n f rJ,F (x)~ ~ I~f~iz" 4TJ (lz,,) l:cf>P'a

> A.=- 4fJ(lZ(~)) 400,

so that Z1f, == 0 (a,.,). E" > 0 being given, we now choose no such thatfor all n > ito we have Zn < !fran and Im J < tc:an , and then obtainby (80)

nI dF(x+m)<ft,f dF(x)-+-O,lad>... Ixf>Ztt

Page 74: Random Variables and Probability H Cramer (CUP 1962 125s)

68 NORMAL DISTRIBUTioN

so that (71) is satisfied. We have further for 11,:> no

l a~I x2dF(x+m)-11n 1:t'1~('a. I

=~ II (x-m)2 dF (x>-j dcSdF(x) Ian IX-1nl~Eall 'zl~zJe I

< ~ IlfJ (m2 -2mx)dF (x) 1+ ~ €2a:f dF{~)an f;rJ~s. I an f;J:I>~

< (m2+ 2,s1 im I) ~ +e2n f elF (x).an .. l.itl>%ft

According to (79) and (80) the last expression tends, however, tozero as n-..+oo, so that (72) is also satisfied.

Thus it only remains to show that (73) is satisfied. By (74) wehave for every fixed S> 0 and for all sufficiently large z

zJ Ix IelF (X)=zsf dF (:t)+zjtDdvf dF (x)Izi '>z ixl>$ 11 Ixl>'"

< 8+ (z) +8zJ'".o+~)ail~ v

=2&P(z)+8z f IxldF(x),eJ l:.cl>~

a.nd consequently, putting z= lEan'

for every fixed E>O, aa n-+co. By (79) and (80) we have, how­ever, for all n > no~

Page 75: Random Variables and Probability H Cramer (CUP 1962 125s)

and thus by (81)

NORMAL DISTltIBUTION 89

~J' Ix IdF (x)-+O.aft Ia: I>icGtt

Finally we have, the mean va.lue of each Xv being equal to zero;

~; r XdF(x+m)I=~llf xdF(x+1n) Iani .. 14:1:S;~aB an l:ct>4Ed,w I

2nf 2nf<- Ix+nl,ldF(:t+m)~- Izld~(~}-+O.an 1:I>fa,. an I:t:J>i~

Thus (73) is satisfied, and the proof of Theorem 23 is completed.

Page 76: Random Variables and Probability H Cramer (CUP 1962 125s)

CHAPTER VII

LIAPOUNOFF'S THEOREM.ASYMPTOTIC EXPANSIONS

1. In VI, §3, we have considered a sequence of independentvariables {Xn} such that X.,.. has the d.f.. Fn, (x) with the meanvalue zero and the s.d.. CT",. As in VI, §3, we ptlt

and 8~=oi+·.. ·+O-;(82) iYn (:c) ==F1(8n X) *.... *Fn (8n X),

so that tin (x) is the d..f. of the variable (Xl +... +Xn)/sn. Thecorresponding c.f. is then

(83) f-n (t) ==11 (tlan ) .. ·ftt (tis",).If the Lindeberg condition (64) is satisfied, it follows from

Theorem 21 that tyn (x) tends to the normal function 4) (x) asn-+-oo. It is then natural to try to investigate the asymptoticbehaviour of the difference ~ft, (a:) -~ (a:). In this respeot, itmight be desired: (1) to find an upper limit for the modulus ofthe difference ~n (x) -tI» (x), and (II) to obtain some kind ofasymptotio expansion of this difference for large values of fl...

In the present Chapter, both these questions will be treated.In the first place it will be shown (Theorem 24) that, under fairlygeneral conditions, we have 1trn (x) -fIJ (x) I < K logn/von, whereK is independent of n and x.. It will then be shown (Theorems25, 26) that, subject to conditions ofa somewhat more restrictive i

character, an asymptotic expansion of fYn (x) -4l (:I:) in powers of11,-1 can be obtained. From this expansion follows, in particular,the relation I~'A (~) -4l (x) I< KJvn, which is ~n improvementofthe preceding inequality. In the last paragraph ofthe Chapter,we shall make some remarks concerning the relations betweenour asymptotic expansions and the expansions in series ofHermite polynomials which have been widely used in applicationsto mathematIcal statistics.

Page 77: Random Variables and Probability H Cramer (CUP 1962 125s)

(84)

ASYMPTOTIC EXPANSIONS 71

2. PnlfO'lJ,fJ1wut the whole Okapte:r, we sW Ct.Wl.8itkr a aeq'Uefl,Ct,X1J XI' ... of independent ra,nd&m, 'lXlriable8 suck that X", has tkemean vaJ,m zero aM the 8.d. aft. The trivial case when aU the an areeq:uJil to zero will al1.lJ4118 be ezel'UiJ,e,d. The vth order morn,t,nt, absolme.m,om.,ent and 8emi,-in'OtJriam (ef. IV, §2) 01 tke variable X n wiU beden,oted by ~n' fJYA and Ym f"upectivelll_ Tk'U8 in particular

(Xlft. =Yl" == 0, Ott", ==13,,, = rift. = a:.ThrougluYut the whole Ohapter it will be (J,IJ8'U/ffteil that twe exi8t8

aninteger k ~ 38'UCkthat flkn isfinitefrn all n =1,2, .... ItthenJollow8that (XVI" f3vn and i'vn arejinitefor v= 1, 2, .... , k. In the particularease when all moments are finite, k fIUl,y be chosen as laf'fe Q,8 weplease.

We aooll1l8e the letters & and ere to denote 'u/nJ1peci:fied quantitiearuck that I& I~ 1, while 10k I is l.e.,s thn,n a number dependingonJ,y on k.

All the results of this Chapter take a particularly simple formin the case when all the variables X ,1, have the same d.f. We shallrefer to this case as the case of equal components, and the commond.f. aftha variables X n will be denoted by F (a:). If, in this case,a denotes the s..d. of X n , we have 8",=a.yn, and the relations(82) and (83) become

tTn (x) = (F (axV n»1l.*, fn (t) = (!(t/(aVn»)'fI,.

3. In this paragraph we shall deduce some lemmas that arerequired for the proofs of the results indicated in §1. We put forv=2, 3, ... ,. Ie

1 1B",,=-(Pvl + ... +~m)' rm=-(rvl+... +y...,.),n f1,

B m \ r lin

pvn, =Br:./2' ''vn == r,,/I ·In ~

Thus for v =2 we have B21t =r In == B~/n, P2-n =As" = 1. B"", is thevth absolute moment of the d.f.. (F1 (x) + ... +Fn (z»/n, and thusby (20) B~ never decreases as v increases from 2 to 1c, 80 that wehave for 11= 2, 3, .. _, lc

(85)

Page 78: Random Variables and Probability H Cramer (CUP 1962 125s)

72 ASYMPTOTIC EXPANSIONS

(90)

It follows from (42) that n-(JI-8)l2Av1I is the 11th order semi.in.variant of~n (x). Further, it follows from. (27) that trvn I~ v"'BlYI.)and hence

(86) IAmI ~ t/l1pvn. ~ (kkpltn)v/k.

In the particular case of equal components B llfU r V'lo' 'PV1tJ andAV1f. are all independent ofn, and we have B.",., = fJ"" r va. =rS', wherePv and 'Yv denote the 11th order absolute moment and the vthorder ~emi..invariant of the common d.f_ F (x).

BesiCles the case of equal components, we shall also sometimesconsider the case when the following condition is satisfied: it ispossible to find two positive constants g and G such that for all"

(87) B2n >g, Bkn < G.

Obviously this case includes the case of equal components. If(87) is satisfied, it follows from (85) and (88) that Pm And AItft areuniformly bounded for all 11,~ 1 and for '11=2) 3, ..• , Ie.

We now consider the c.f. fn (t) of the variable (Xl+... +Xn)/'fl'Jas defined by (83). Putting

vn(88) Tkn. = 4 I/k'

. Pkn

onr first object will be to show that in the interval It I~ '\0/21:8there exists a certain expansion of f. (t) which.. in the particularcase of equal components, becomes an ordinary asymptoti.cexpansion in powers of n-t .. (In the case of equal components,Pkn is indeJlE'ndent ofn and thus Pkn. is, for large values of f'l" of thesame order of magnitude as Vn.)

Lemma 2.1 For It! :i '\o/Tkn we havet1

k- 3 P. (.) 0(89) eiT (t)=l+ L 2!!-!!.+_k <ltlk+ltI3{k-2»

In v=l nl12 T~;t t

wherev

p. ( -t) '" ( ·t\v+2jv?t t = "'-i CJvn i I;=1

is II polllnon~ialof degree 3v in (it), the coeJlicient cjvn beittg a pol,..

1 CramsI' (2].

Page 79: Random Variables and Probability H Cramer (CUP 1962 125s)

(94)

ASYl\-IPTOTIC EXP.ANSIONS 73

nornial in ,\31t' A41l , ••• , Av-:l+3tn with 'lI/umerical coefficients, 1J1UJ'h, thatv+2;

(91) e;m = 0 k Pk: .ThU18 in the oa-se of equal componentB Pvn (it) is independent of n,while in the more general case when (87) is 8ati8fied the coeJficietz:t8of Pv" (it) are boundedlor all n.

For every r= 1, 2, ... , n we have by (66)

k -let (it)V fl (t)kU ~fr(t/81,J-l = ~ --1 ~ +& 1..~ -- •

v:=r 2 v. 8.,.. 1(;. 811.

For It f ~ {!Tk7~ we obtain, however, by (8~) and (88)

PI/k It I (nB )l/k~n !_!(93) ~n ~ (nB:)lIi p1/f: =n

k 3~ 1,

and thus we obtain from (20)

IU I~ ~ ~ (f311 i t I)V ~ e - 2 < f.v-2 v. \ 8ft

For I U I<! w~ have, however,Ui

log(l+U)= ~ (-l)j+l--;-+ekUkll~l~i<kll J

According to (92) U is~ formally, a polynomial in t (in reality. thefactor.& depends of course on t), and the series

co !- (,8~ I t f)V~t~=2 v. \ 8n

is a Dlajorating expression for this polynomial. For any powerVi, where 1 ~j < k/2, we thus obtain from (92) the expansion

; _ k-"Ql ~ (~)Y CO! (jfi~k It 1)11U - ::E ~!~1" +,& L t

v=2, 8 n v:::-k V • \ 8 n

k-l I it)V 8k tk

= ~ 3vir \f:- +0~/ ~-,",.2; 811./ 8 n

with coefficients 8vjr whioh are independent of t. From (92) and(94). we thus obtain an expansion of logfr (tj871J in power$ of it,up to the term containing (it)k-l, and with an error term of the

Page 80: Random Variables and Probability H Cramer (CUP 1962 125s)

74 ASYMPTOTIC EXPANSIONS

order tIc. According to (26), the coefficient of (it/8,,)11 in this expan­sion is~ however, equal to Ywlv 1, 80 that we have for t , I~ ~Th

k-l y (it)V p..,.tklogJr(tjstt,)= ~ -T - +@k--.:r·

v-2 JI. 8n N"fi,

Summing here over r= 1, 2, ... , 11" we obtain according to (83)and (84)

k-l",r (it)V nBlmtklogf,,(t)= ~ --r - +@k----:r-

v-I v. 8n ~

= _~+n.k~lA.m(i!-.)V+0k'nPk'A(-t )k.2 v-3 vI yn v'n

Substituting tz for t and dividing by Z2, we have

tl

1 k-3 A (it)V+2 ( Z )11 {k ( Z )k-o.V:=log{e2 (f (tz)i'}=!; v+l.tJ, - +e!kft -

n v-I (v+ 2)1 Vn le! \inIf we regard here t and n as fixed, and z as a, real variable suchthat Iz I~ 1, we thus have for the function V =V (z) an expansionin powers ofz, with an error term ofthe orderzk-i. Then obviouslythere is a similar expansion for the function eV, 80 that we maywrite for Iz I~ 1

~ !. k-3 ( Z )"(95) eY ==e9(l" (tz»-' =1+ ~ P~(it) J +B(z),

V" 1 :v nwhere R(Z)==O(zk-l) ~ %-+0. It is then readily seen that thecoefficient PVA (it) is a polynomial of degree 3v in it, which may beput in the form (90).

According to (86), & majorating series for V == V (z) is

(96) 8& (PJ/: It I)s1=1 !: ~(p~ Itz I)V ,V'np=ro v. v'11,

and thus VI is, for j == 1, 2, ... , k - 2, majorated by

(97) 0 k (pl':l t l)31(J!1)1 i ~(jp~ftzl)".'\1'11, 11-0 v.. Vn

From the developmentk-S Vi

eY:=:E -:;-+.&Vk-2e1YI; ..0 J.

Page 81: Random Variables and Probability H Cramer (CUP 1962 125s)

ASYMPTOTIC BXPANSIONS 75

we thus obtain, since the majorating series (96) shows thatI IV I< arc for It I~ ~Th'

R (z) =0","'i;' (Pi': It 1)S1(l!l)1 i ~ (jPM: Itz J)V:1-=1 v'n v-k-2-1 v. vn

== Elk(pJJ:1tz J)Tc-ll 'j;' (Pit: It/)2/ i; I, ('!:-)'V~# 1-1 v-ov. ~n

= 0", (.;,,)1&-2{(pJ/: It l)k+ (Pi': It 1>3(k-t>}

=0;~~.(I t I'" + It 18 (k-1».

Putting z= 1 in (95), we thus obtain (89). Finally, the relation(91) for the coefficients c/VIt follows immediately from the major­ating series (97) ifwe observe that, in the expansion (95), a termcontaining the product (it)J'+11 (z/vn)" can only arise from thedevelopment of the term Vijj!. Thus Lemma 2 is proved.

We next consider the following Lemma 3, which gives an upperlimit of I fA (t) I, valid in the interval It I~ Ph. IT the behaviourof the absolute moments PtA and Ph for large values of n. is nottoo irregular, Tim as defined by (88) tends to infinity with n, 80

that the interval It J~ t'Pb of Lemma 2 is, for all suffioientlylarge 'It, contained in the interval It I:iTIe",-

Lemma 3.1 FiYI' It I~ 21. we n.aloe

~ We have

Ifr(t) 11 = S:..J:<J)cost(:t:-y) d.F,. (x) d.F,.(lI).

but cost(x-y) ~ I-ttl(~-y)l+tIt18 Iz-yl3

~ l-it2(Z2_~+y2)+iIt 13 {I X13+11118),

1 Liapounoff [2].

Page 82: Random Variables and Probability H Cramer (CUP 1962 125s)

76 ASYllPTOTIC EXPANSIONS

and thus for J t f ~ Tkft, we obtain-11: t'+!~ ""l!r(t)J2~I-P!rts+tP3rltJ3~e Us" ,

fa -iJ+~ PIn It lat fn (t) 12 = n Ifr (tI8"J 12 ~ e 3- v/n ,

,-1i'( I t I ) is

Ifn (t) I~ e- i 1-32'0 ~ e-'3.

Thus Lemma 3 is proved.If, in the polynomial P", (it), we replace each power (it)1I+2:1 by

(_1)1'+11 4l(J1+!1) (2:), we obtain a linear aggregate of the derivativesofthe normal function til (x), that will be symbolioally denoted byp1m (-ttJ). Thus by Lemma 2

v(98) Pvn. (-0) == L (_1)1'+21 c;vn. fIl<V-+2:/) (X),

i-I

where elm is a polynomial in the quantities~ such thatc - I:),. p(Y+IS)/k

III." - 'elk k'n •

Obviously we may writef»'

·(99) Pm (-¢l)=Pav-l,n (z)e-i ,

where PSp-l.n (:c) is a polynomial of degree 3v-1 inz. In the case6f equal components, PPft, (-<I» and Pall-l,n. (x) are independentof n, and in the more general case when (87) is satisfied, thecoefticient.s CjV1t as well as the coefficients ofPSv-..l

tn (:t) are bounded

for all n. Aooording to (62) we have

(100) p. (it)e-i=J"" eUa:dP.,.(-4l).Vtl -<XI

We now define two"enor terms" R1t;n, (z) and rkn. (t) by writingthe following expansions for the d.f. tT,.. (~) and the o.f. fn (t)

k-3P. (-cIJ)(101) if~(:c)=~(x)+ L vn vll +R1tfti(~)

v-=l n

=4\ (x) +k~3P3"-1.n(X)e-i+R (x)V== 1 nFl'S kn'

_~ k - 31'. (it) _t·(102) f", (t) =e j + ~ :11/1 - e 2 + rkn (t) ..

v-=l III

Page 83: Random Variables and Probability H Cramer (CUP 1962 125s)

ASYMPTOTIC EXPANSIONS 77

FroID (100) we then obtain

rkn (t)= f:""eiixdRkn (x).

Lemma 2 shows that we have 'rkn (t) = 0 (tk ) in the vicinity of t == 0,and by the argument used in IV, §2, we conclude that

f:"" xvdRk1/.(x)=O

for v=O, 1, ....,k-l. Thus in particular Rkn(x) satisfies the con..ditions of Theorem 12.

\Ve now proceed to the proof of tIle follo,\\Ting lemma. which isfUlldanlental for tIle rest of the chapter..

Lemma 4 ..1 Fo'r 0 < (/) < 1, we have fo'· all1·eal x and all h> 0

(103)

f.1:+11 - (ICO f fn (t)J 1)w (y-x)W lRkn (y)dY=(::)k tw+1 dt+ pk_ i -z Ph ~

If titre integral in the 8econcl member of tlitis relation is convergentforw= 0, we further n.ave

(f OO Ifn(t) I 1 )R'en (x) = 0 k - -t-dt + pk-2 ..

'ltkn kn

For the proof of this lemma, we shall suppose that Tkn. > I,so that ~Tk11.< Tkn - If this does not hold, only tiri\rial modifica­tions are necessary. (It will appear below that the conclusionswhich will be drawn from Lemma 4 are all trivial in the caseTkn~ 1, so that this case is not really interesting.) ~"ronl (33)

l In the brat edItlon of tillS 'rrac1, LLmlna 4 was stnted In a dlftclont ftlrlll whJ(~b

Impbcs, ill pa.rtJcular~ that if the first nu:nlOCu· of (103) is leph\('cd by

MJ~ '" (.G' - 1/1'"-1 R411 III) dlj.

we ttbt.aut a, rolation valid for U <w~1 -1. Tn thltt form, the J..,emms, ,rus glven by(""raJner [2], and a,pphed to the study of the asymptotic I)ropcltie$ of certain integralaverages of }'n (~)_ (Cf. below, p. 84.)

Page 84: Random Variables and Probability H Cramer (CUP 1962 125s)

78

we obtain

LIAPOUNOFF'S THEOREM

and further, using the inequality (36),

ICdJ:+h(y - X)W-lRim(y)dyI< 0f: Ir~~) IdJ

~o(S:I ~:2; Idt +A1 +Al!+As),

\vhere 0 denotes an absolute constant.. For AI' A 2 and ..&3 wehave, on account of Lemmas 2 and 3,

\vhich completes the proof of the first part of the LamIna. Thesecond part is proved in exactly tIle same way, using (35) il1steadof (33).

4. In this paragraph, we shall use Lemma 4 to prove thefollowing theorenl. which is due to Liapounoff..

Theorem 24.1 Let Xl' X:a, .... , X n be independent variables81JCh thal X p has the mean. value zero aM the 8.d. C1" and put

1 Liapounoff [11, (2]. It is possible to show (cf. Cramer [11, and [2)~ p. 19) thatwe ma.y "ta.~e ('==3.. In the works of ~sseeI?- and Gnedenko..Kobnogoroft quoted onp. 119, It J8 shown that the factor logn In the evaluatIon of the error given inTheorem 24 may be omitted. Of. the Remark on p.. 82 l>tt'low'. The evaluation th.usobta.ined 18, in a. certain sense, a be&t..possihle one.

Page 85: Random Variables and Probability H Cramer (CUP 1962 125s)

LIA.POUNOFF'S THEOREM 79

8;=af+ ... +0;. If the oosol'llk mom,ent Par=E <I X r J3) is jinitelor aU r, the d.J. i}n. (x) o/the variable (Xl+... +Xn )/8n satisfieafO'rall n > 1 the ineq:ucility

lognI ty" {z)-4> (z) I< °P81f, Vn '

wheJre 0 is an ab80lute ccmstant, and P31" i8 dejineiJ, by (84).(It will be remembered that, in the particular case of equal

components, Pan is independent of ?it, while in the more generalcase when (87) is satisfied, Psn. is bounded for all n.)

Without 10s8 ofgenerality, we may in the proofof this theoremassume Pan.> 100, as in the opposite case we have

Pa.!Vn = 1/(421n)~m,

so that the theorem is then trivial.Let us denote by X n+1 an auxiliary vari&ble independent of

Xl' XI' ... , X", and having the d.f. F.+1 (~)==~ (Z/(€8.,f,»' whereE is a parameter such that 0 < E < 1. In the notation used inthe preceding paragr&phs, we then have for the sequenceXl) X2,' ... , X n+1

8~+1= 8~ (1 + E2),

... .r;-:--o (Xv'1+.:1)~n+l(X)=~n(:Cv l+Ei ).cIl --E-- ,

( )e'"

ft&+l (t) = f. v-t- e-~) •

1+E2

We now apply Lemma 4 to the variables Xl' .... , X n+1, puttinge= 3. It is then obviously permitted to use the second part ofthe Lemma as, according to the above expression for fn+l (t),the integral ocourring in the second member of (103) is abso-lutely convergent for 6)=0.. Replacing xVI +E1 by 2:, we thenobtain

(104) lit" (z).<1l (~) -$ (v-=' 2) 'I < To 0 +f'" e-2(r:..)dtE 1 +E 3.n+1 TSft +1 t

< 0(~+ 2T~ e-1ctTa..+I) ,:.L8,.n.+l E 3.n+l

where 0 is an absolute constant. During the rest of this proof,

Page 86: Random Variables and Probability H Cramer (CUP 1962 125s)

80 LIAPOUNOPlr'S THEOREM

we shall use the letter G to denote an unspeoified absolute con.stant. We have further

~n(X)*~(;)=f:«> fJn(X-t)d4l(~),

and hence deduce, denoting by A> 1 a new parameter,

(105) ~n(X).~(;)~f~=t»(;)+~ft(X+1u)f:49(;)o -~<Ae s +iY1t.(~+1E}.

(106) ffll(X)*4>(;)~~,,(x-1u)f~Q)d4t(;)() -~

>~n(~-1J,E)-Ie 2.

From (104)-(106) we obtain

iYA{X+hE»~(V'x )-o(ie- f +p--!.-+ I~ e-~7'i.-+,),1+Et . .111 3, n+1 € 3,1&+...

( %) (1 -~ 1 1 )tT,,(x-hE)<4l 4/1+1' +0 he ! +Pa,n+l +E1Tl'Hl e-lf':7"",,+. •

Replacing a: in the first inequality by % -1H:, in the second bya:+Jw, and using the relation JcIJ(G)-4l(b)t~la-bJ,we havefurther

(107) 1~.(X)-4l(vl:E,)1

(1 -~ 1 1 )

<0 1H+h:6 I+~ +~T! e.-."7'1"+1 it

3,1f,+1 ".+1 I

We now dispose of the parameters hand 4! by taking

.. /--- Vlog 1:A-v 210gThJ E==3 ~ In.

a_

From the assumption T",> 100 it then followa that we have

Page 87: Random Variables and Probability H Cramer (CUP 1962 125s)

1"IAPOUNOFF'S THBO&Ell 81

11,> 1 and 0 < e < 1. Further, according to (84) and (88) we have

T. v;;.:+:I _~ (1 +11£1)1 > Pan. .."_+1= 4p"1H-l - 3It 1+8J;c3T8ft, 1+200(lO~)t

> 2OO~~lOO)1>tP-,1+ 1001

and hence 1«'Tl.+1 > log Paa-

Introducing in (101), we then obtain, since

!() (v'l:ot=)-4l (a:)I< O~.. log~ logn!~A (:1:) -4l Cal)! < 0 T

k3r. < apIA v'n •

and the theorem is proved.This theorem is directly a.pplicable, e.g., to the Bemoulli

distribution considered in v, §5, and VI, § 5, in which case we ha.vePa", == (1- 2pq)/vPii. Thus if ~n (~) denotes the probability of therelation (v-np)lVnpq~:t, where v is the number of white ballsobtained in a set of n drawings, the probability of drawing 8,

white ban being each time equal to p == 1 - q, we have for all on > 1

lognIa. (z)-(J) (z) t < 0 ~ ;-,ynpq

where 0 is an absolute constant.

5. We now return to Lemma 4 with an arbitrary k~ 3. Inthis paragraph, we shaJI consider the particular case of equalcomponents. It will be shown that~ in this case, it is possible togive &, very simple sufficient condition for the existence of anasymptotic expansion of the difference ~~ (a:) -$ (z) in powersof .,.,-1.

In the ca.se ofequal components, the moments etc. introducedin §§ 2-3 are independent of 11-, so that we may write p", Pv andPa.-l in the place ofPh- p..... and PaIJ-l.".

Page 88: Random Variables and Probability H Cramer (CUP 1962 125s)

82 ASYMPTOTIC EXPANSIONS

We shall say that & d.f. F (:1:) aoJ,iBjie8 the ~ititm, (C) if, forthe corresponding c.f.j(t), we ha.ve

(0) limsupl/(t) I <1.Itl~oo

By Theorem 7, the condition (C) is certainly satisfied if, in thestandard decomposition of F (x) according to (13), the coefficientaI of the absolutely continuous component is different from zero.We now proceed to prove the following theorem.

Theorem 15.1 Let Xl' XI' .,.. be a sequence oj inilependemooriable8 all hOlDing the same d,.J. F (x) with the mean value zero,the I.d. (T, and a finite absolute moment Pit oj order k a; 3.By ~'" (2:) =(F (azvn»)n* we denote the a.l. 01 tke fXJriable(Xl+ ... +Xn)/(UVn). If F (z) 8ati8ji,tJJ the condition (0), we the""have the expansion

1:-3 R (-4l)(108) ffn (~)==4l (z) + I; v JIll +Rk3 (~)

)Ilia 1 1lt

k-3 p (3:) :t'-~ (~) +:E a;'~ll e-'2+Bien (x),

1'-=1

withM

(109) IBin (z) I< n(1c-2)/2'

where M deperulB"on 1c and Oft, the given,fU'Mtion F, but is i1ulependemo!n aM z.

Remark. For 1c == 3 this theorem shows that, as soon &8 fla isfinite and condition (0) is satisfied, we have

MIi.'f" (x) -lP (x) I<Vn '

where M is independent of ,.., and ~. Thus the condition (0)enables us to improve the Li&:POunotf limit of the error as- givenby Theorem .24:••

Proof. From Lemma 4 we obtain, using (88) and substitutingatv''' for t in the integral,

1 Cramer (2)-• It is shown in the works of Bueen ud Gned.enko-Kolmogorol' quoted OIl

pap 119 that this improvement holds even without the condition (O).

Page 89: Random Variables and Probability H Cramer (CUP 1962 125s)

ASYl\IPTOTIC EXPANSIONS R3

(3:+1&(110) (,)J:: (11 -X)(I)-1 Bk ,. (lI)dll

=0k

(u-(I)'I/,-W/'I. flO If,.(crtv''I/,) Idt+~).J1/(4:crp:/~) tcu+l n('k-2)!J

Given any d.f. F (x) satisfying the condition (C)'l it follows fromthe Remark p. 26 that we can find c> 0 such that f j(t) I< e~ fort> 1/{4up'fk). By (83), however, fit (atvn ) = (/(t))tt, and thus weobtain from (llO)

(Ill)

i(JJ J:Th (y_X)<u-l RIM (lI) dU I < M e:lI

+n-<k-2)/9).

M denotes here, as during the rest of this proof, an unspecifiedquantity depending only on k and on the given function F, butindependent of n, x, It and w.

Now Rkl~ (y) is the differenoe between the never deoreasingk-3

function ~n (g) and the function U (11):= <1l + ~ n-v/2I:,( - tP}.v:=l

The derivative U' (y) obviously satisfies the relation t u' (y) 1< M,80 that we have for every 'Y in the interval of integration

Bien (:1:) - Mh < Rkn (y) < Rkn (x +h) +Mit.

By means of these inequalities, we obtain fronl (111)

Rim (x) <M(h+h-a>;-cn +h-n-<k-2)/i) ,

Bien (x+h) > -M (h + k-a>;-cn +k-<Dn-(k-2)/B).

Replacing in the last inequality x + h by x, we thus have generally

(112) IRkn (x) I< M (h + ~-w~-cn ..;: k-<un -<k-2)/2) .

Taking here k=n-(k-2)/2, w= l/logn, ,ve obtain (109), and tht.'theorem is proved.1

It is easily shown by examples that Theorem 25 does not hold

Page 90: Random Variables and Probability H Cramer (CUP 1962 125s)

84 ASYMPTO'!'IO EXPANSIONS

true without the condition (0).. Let, e.g", F (z) be the step­function oonnect-ed with the simple Bernoulli distribution (v, is):

{O for x< -p,

F(x)={q H -p~x<q,

II " x~q.

F (z) being of type II (of. m, § I), the condition (0) is obviouslynot satisfied. Taking k =4, Theorem 25 would give the expansion

iJ (x\ ==w (x) + p - q <9(3) (:1:) +0 (!) .n J 3f~npq n

This can, however, not be trueJ as it is readily seen that ~)~ (a')has, in t.he vicinity of x =0, discontinuities where the saltus isof the same order of magnitude as n-l .

Howe.ver. it can be s}lown (Cramer (2], p. 56) that, even withoutcondition (C), all asymptotic expansion of the form given inTheorem 25 holds for an appropriately weighted average of thefunction l1n (x) over any given intervaL In the second memberof {IDS}, we shall then have to hltroduoe the correspondingaverage of <I> (x) &Ild its derivatives~ while the order of the errorterm will only differ by a factor (log n)2 from the order givenby (109).

6. wg

e shall nov," prove an analogue to Theorem 25 for the caseof unequal components. We shall then have to lay down certainconditionswhich, roughlyspeaking, may be interpreted b;y· sayingthat the d.f.'s of the variables X'1 will be required to satisfy thecondition (0) on the average in a certain specified sense.

According to Theoreul 4~ any d.f. F, (z) ma.y be uniquelyrepresented in the form

(113) F,(X) =K,Gr (x)+ (I-It,) Gr (x), (0 ~ I(,.~ 1),

where 01' (x) is a d"f. of type 1 (absolutely continuous), whileOr (x) is a, d.f. \vhioh does not contain any compOllent of type I.We no\\l' proceed to prove the following theorem.

Page 91: Random Variables and Probability H Cramer (CUP 1962 125s)

(115)

(114)

ASYMPTOTIC EXPANSIONS 85

Theorem 26. Let Xl' X1t .... be independem vanable8 BUCkthot X r has the it·l_ 1;. (.-c) until the mean. value zero, tke 8.d. ar'

and a finite ab80lute mom,ent Pic' oforder k ~ 3. Let 1;. (x) be repre...aented (JC(JO'fo,ing to (113) ana suppose that tke derivative a; (x) is0/ bolllTUkd total variation v,. i,-n ( - 00, +(0). Suppose further thatwe AafJtlor infinitely increJ1,8i1l{/ n

1 'If, I(

-- ~ -'--+00lognr-=11 +f)~ ,~.. 1 ~ ICr- -- A.A ---+-oo8110 logn"_ll+~ ~

811

ani/, Ph being defined as in tlfe preceding parag'fapkB. For thed.l. if. (~) oj tke tiOrriable. (Xl+... +X."J/8n we tken Mt1e theea:paftlion

witA an. error term Rim (z) satisfying the relation

M(116) IBh (2:) I< Pl;I'

where M is indepe.ndem. 01 n 0,114 #:.

Remark. An important particular case is the case when(4) the conditions (87) are satisfied~ and (b) the variation&t'r areanitormly bounded for all r== 1, 2, .... As we have

~,.,/8ft== B'n/(4B~:J t

the conditions (114:) and (115) are in this case equivalent andreduce to the single condition

1 "-I- 1:: J(1'~C1J.og 7t"llIDl

Ph is in this case of the same order of magnitude as Vn, 80 that

Page 92: Random Variables and Probability H Cramer (CUP 1962 125s)

.MIRim (:I:) I<n,fk-f)/I·

For Jete: 3 we obtain here the same improvement of Liapounoff'etheorem as at Theorem 25_

86

(116) becomes

ASYMPTOTIC EXPANSIONS

Proof. :From (91) and (98) we obtain

n-vll~1f,(-~)==ekTh.

This shows that for Tkn ;$ 1 the assertion of the theorem is trivial,so that we may assume throughout the proof Pie,,> 1. FromLemma 4 we obtain, using (83), for 0 < Q) < 1,

wJ:-" (y-:r:)_1Rim (g)tlg=e" (~+p~a).n

where Z = upper bound of II II" (t) I for t> TIm/8n -r-=l

Hence we obtain by the same argument as that used for thededuction of (112)

(k-fJJZ )IRkn (x) I< Elk A+-;- +k-tDTkJ,"-t) -

(For this deduction we require the result that the derivative ofk-8

the function U (t)=4l+ I; n,-v/IPn (-4» satisfies, for 1'1:.> 1, thev-I

relation IU' (t) I< ek - This is easily proved by meaDS of (91)and (98),.)

Taking n= T;Jk-l), w= l/logTh , we now obtain

JRk,n (:1:) f <a. (TkJ,k-t) +Z log Tim).

So far we have made no use ofthe aesumptions (114) and (115).If we can now show that, owing to these assumptions, we Ilavefor every fixed A > 0

M(117) Z< p~'

where M is independent of 11" the theorem will obviously beproved.

Page 93: Random Variables and Probability H Cramer (CUP 1962 125s)

ASYMPTOTIO EXPANSIONS 87

By hypothesis we have, denoting by gr (t) the c.f. of G,. (~),

II,. (t) I~ 1<, Ig, (t) I+1 - IC"

&nd Igl'(t)I-I-~f~«>8~dG~(~)I<lirFor 1t I~ 2vr we thus have

iI,. (t) I~ 1- tIC",and hence for f t I< 2v,. by Lemma 1

tt tti fr (t) I~1- (Ie,. - !J(~) 32tJ!~ 1 -1(,. 64tJz-

r r

It follows that we have for all t> 0

11,,(t) I~ I-nKI'Min(l,~),and consequently for t> Trm./8n.

1/1'(t) I~ I--hKr Min ( 1. ;;':;) ~ 1- 6~ l~~Min(l.~)1 It,. Min (1 Pt.):i e- 6i 1+f); , < ,

.. I :Min I.. P:ft ) i 1(,.n f Itt(t) I~ e- 64 \ 1, s; r_:t 1 wf ~.,-".1

According to (114) and (115) the last expression is, however,for any fixed A > 0 a.nd for all sufficiently large -n less thann-iA.<MTkn~, so that (117) holds true, and the theorem isproved.

7. It has been proved in tile preceding paragraphs that.subject to certain conditions, the series1

(lOla) ~n (~)=4» (x) +Pin (:-cI» +P21l ( -$)+PSn, (;-4» + '"ns- n n

gives an asymptotic expansion ljf tYn (z) for large values of n.According to (95) and (98), the Pvn ( -tI» are for 11= 1, 2, ... , k - 8defined in the following manner. We fu'St define an ordinary

1 The formal definition of this aeries was given by Edgeworth [1].

Page 94: Random Variables and Probability H Cramer (CUP 1962 125s)

88 ASYMPTOTIC EXPANSIONS

polynomial P"" (t) by the relationt-l A t~1E .. +J,. %" Ie-a

ev- 1 (v+2)! :=1+ I;~n(t)zP+O(zk-J).v-l

Here, Am denotes the quantity defined by (84), so that n-(v-I)/tA...

is the vth order semi-invariant of ij1\ (x); Jc is an integer such thatthe 1cth order absolute moments are known to be finite for allthe components of ~n (x); and finally z is an auxiliary variablewhich varies in the vicinity of z== o. To obtain Pvn ( -fla) we thenreplace in Pm (t) each power f! by the function (-l)~cb<r)(~). Inthis way we obtain the expressions

~n ( -C») = - ~i cXl<ll) (z),

Ph ( -f1»... ~i 1)(') (z)+ 1O~~ 4)(6) (z).

Ph (-4l)- - ~i~6)(Z) - 35Aa;~,n ~7)(:r)-280~itf)(9)(Z),

for the first terms of the development (lOla). It will be remem...bered that in the case of equal com.ponents the Avn. (and thus aliathe Pvn) are independent of 1t.

On the other hand, 8, development of the type

(1010) ~n(z)=~(x)4- ~i()<ll)(X)+ ~i~~(X)+ •••

has been much used by writers OD mathematical statistiCi (ef.e.g. works by Charlier, Bruns, Gram and Thiele), and it haa beenclaimed (without correct proof) that this expansion shouldpossess asymptotio properties similar to those discu88ed abovefor the expansion (101 a). The coefficients cvn are here determinedby the relation «>

en=( -1)11 f-=HII(z)dg:n(z),

where Hp (z) is the vth Hermite polynomial:~d? -~

H. (z) =(-l)P e i ilz"e 2"

Page 95: Random Variables and Probability H Cramer (CUP 1962 125s)

ASYMPTOTIC EXPANSIONS 89

From these expressions we obtain, by m.eans of the relationsbetween moments and semi...invariants (IV, §2).

;\3ft.Can =- n i '

A4nc4n =n-

AanCsn = - nfj.

;\6~ lOAl~n! + -;'

• f; ""'''''oO'' to" '1'''' • 'If c.,.,

For larger valu~$ of n, the expressions of the ~n and the om

become inCl:e&SL."t1gly complex, but it will be seen from the &bt1Vethat the two expansions (lOla) and (lOlb) may be regarded a$rearrangements of one anoi~her Jt followa from Ol.Ir th~orem8

tha.t It is only (lOla) which gives in tIle ordinary seflse, anasymptotic expansion of tjn (x). Or1 the ottu;r hand~ tIlE' expan...sion (Ii}} b) ma)' be considered a.~ formally tlirnpler, sin~ the CY'~t.

art' de1in~rl. by th~ simple relatlon givCll n~ove~ ,vhich le!1ts on theorthogonalitJ prop~rt!es of the Herluite J:}Olynomials 1

1 Fo! a. more det&tltd anal,; Sl~ of the relat.olls betv.een the two t)'~"1S vf t'xpanmons cf. Cramer (2]

o

Page 96: Random Variables and Probability H Cramer (CUP 1962 125s)

CHAPTER VIII

A CLASS OF STOCHASTIC PROCESSES

Z =Z +D:'Tl+Ts 1"1 '1'1 Ta'

where Z'TI and [~1 'TI are independent.It is, in faot, possible to give an exact meaning to the limit

passage which has thus been roughly indioated.. We shall, how­ever, prefer to consider directly a random variable which depends

1. In the preceding Chapters, we have been ooncerned withdistributions of sums of the type Zn == Xl+ ~ .. +X"" where theX,. are independent mndom variables. Z,.. is then a variabledepending on a discontinuous parameter fi., and the passage fromZn to Zn-rl means that Zn, receives the additive contributionX 11+1' so that we have Z-n+l = Z.,. +Xn+1, where Zn and X..+1 areindependent.

Consider now the formation of Zn by successive addition ofthe mutuall~y' indepe...l.dent contributions Xl) X" ... , and let usassume that each addition of :& new contribution takes a. finitetime S.. (In a concrete interpretation the X, might e.g. be thegains of a certaJ.n player during a series of ga.mes, every gamerequiring the time S, so that Zit-is the total gain realized after 1Ir

games, or aftser the time n.8.)The sum ZtJ, then arises after the time n8, and the d.f. of Z. is

thus the d.f.. of the sum that has been formed during the timeinterval (0, nO).. Suppose now that we allow 8 to tend to zero and1£ to tend to infinity, in such a way that nO tends to a finite limit 'T.

It is conceivable that the distribution of Zn may then tend to adefinite limit, whioh will depend on the oo1l-tinuous ti1M paramt.ter7. Thus instead ofthe variable Z-n, with a discontinuous parametern we should have a variable Z.,. with 8 continuous parameter 'Tt andluch that the increment of ZT dUring the time interval (-Tt, 1"1 +Tt)

is independent of Z'rl:

(118)

Page 97: Random Variables and Probability H Cramer (CUP 1962 125s)

A CLA.SS OF STOOHASTIC PROCl1SSES 91

on & continuous parameter and which behaves in the general waydescribed above.1

2. Let T be &. continuous parameter which may be thought ofas representing time. Suppose that, for every 1"~ 0, we have arandom variable Z-r with the d.f. F (~, T) and the c"f..

I(t, T) = f~Q)e#eclDO'F (:t, T).

ZO will be supposed to be identically equal to zero, so th&tF(x,O) coinoides with the d.f. E(X) defined by (17).

The set of variables Z.,.. will be said to define a random. or8t00ka8tic proces8 with 8tationary and indepentlettt itu;remems(briefly: a 8.t.i. procea8) if, for 'T1 ~ 0, "-1> 0, the differenceU'T11'a = Z1'l+'T1 - Z'Tl is a random variable which is inilepefUkn;t 01tke variable Z'Tl and has a d.f. which is i'llilepenilent oj 71- We oanthen say that the inorement of the variable ZT during any timeinterval is independent of the value assumed by the variable atthe beginning of the interval, and also independent of the posi.­tion of the interval on the time scale (but not, of course, indepen­dent of the length of the interval).

If ZT defines 8, s.i.i. process, it is seen from (lIS) that the dlOf.of Z1"l+T t is composed by the d ..f.'s ofZ"rl and U"l'1 7'. The latter d.f.is, however, by hypothesis independent of 71' and for '1"1 =0 wehave 0'0,'7'1 =Z,;s-ZO=Z"'a' so that the d ..f. of UrI"'. is identical withF (x, Tj). This gives us the following relations which may serveas an analytical definition of the s.i.i. process:

(119) F (x, Tl+72) = F (x, Tl)*F (z, T:a),

(120) f (t, 'T1 +Ta) =/(t, 71)! (t, Tt).

1 Particular cases of variables of this character were first studied by Baehe1ier(1, 2) and Lundberg [1, 2]. Further contributions were given i-nter alia, by Cramer[3] Stud Esscher [1], in conneotion with the mathematical theory of insuran.ce risk.A complete and mathematically rigorous theory. which embnces &lso cases muchmore general than the s..i.i. process, was first given by Kohnc.goroff [2]. The theoryof the s..i ..i. process was developed by Uvy [2] under more general conditions th&nthose considered here.

Page 98: Random Variables and Probability H Cramer (CUP 1962 125s)

92 .A CI.,ASS OF STO(.HASTIC PROCE8SJ:~

Fo!" the mome:n::s &f Z'f' we ~hal; use the nutationf'J...:;;

C(p(r)=E(Z;)= f x"dF(x,~).,.. -0;)

(Throughout the C;hapte:r lt Vr~l be u.nd~rBtood that the variableofintegration 18 alw-ays thefir8l varlable occurring in. the functionbe~hind the tugn d. so Lhat we rnay Olnit the index or! this sign.)

Theorerrl 2i~! Let Z"f defi'M a 8.,i..i" ?ooea$~ 8twh tJw,t at (1")=0(JfJta, :12 (T) ~8 J./fh,ite j(y.r all -"f' > 0 Q We then have

:0.. .. ~ ... .. f«} e'l~ - 1-- itz ..(l~iJ JOg/{t~T)= -iUOTt2+7 -~- --d!!(x),

-¢c xw.ag ~ visC (()jrstant ~ andn (x) irs {it i 4('i.unded aruJ nevtrdeCrta8ingjunction1JJh.irJit i8c,ominu0'U8 at x ::.= o. G01lf..'eraely ~ git'en anyCA'J'Mtant~~ 0 and any brru/lu.led and neve:r ile.c1'easing functio}~ n (x) con­ti'A'U0'U8 at x= 0, (121) defines t1~ (;.f. f (t, -r) ~f Q, fJariable Z... c<Jrre.8'jll11t;ding to a a.i.i. pr0CS88.

Before proceeding to the proof of this tbeorerrt, 'lte shall con...sider some simple particular cases Suppose futBt t/hat n(x)reduces to a oonstant, so that the last term In the second memberof (121) disappears. Then it follows froID (121) 1hat,

F (x, '1") =cJ) (xf(OO'V/7)),

80 that Z'T is, for every ". > 0, normally distributed witil the meanvaJue 0 a,nd the s.d.. O"OVT» This case is often called the BrfYWnianmovement prOCe88, a name referring to one of its importantphysieal applications. Suppose on th~ other hand ao = 0 ando (x) = AC2E (x-c), where ,\>0 and c:pO are COllstants, and £(z)is defined by (17). Then (121) gIVes

logf(t, T} =~T (eM _.. 1 -- cit),

t Kolmogoroff{&l. aI, also de Flnf"tti ll~ 2] lfths hypothefWl.xl (.,)-Oiaomitted.~e may apply the theorem to the varia.ble Z., ""«'J (1") .. and choose for ft.l (-r) any:Mlutlon \eontinllt')us or not) of the funCttional equation

«1 (1'] +"f'%) ~(X1 (1",.) +~\ (Tt).

If we uaume, e.g., that, !Xl (-r) 1C bott~ m some mterval, however small, we neoes­sanly have tXt (,.1 :=~t where e is & real OOI1It&nt.. Levy {2] atudies the s.i.i. 'P~mthu"Ut assumill$( the eXlItenee orfinit~momenta 0'.1 (,.) and. «a (1').

Page 99: Random Variables and Probability H Cramer (CUP 1962 125s)

A. OLASS OF STOOHA.STIC PROCESSES 93

80 that the variable ZT+ACT has the a.f.eA"'(~-l).

According to (47) this corresponds to 8, distribution ofthe Poissontype~ The corresponding process, which has important applica­tions, e.g. in the theory ofinsurance risk, is known as the PoiaB,.pr0WJ8.

More generally, let (1 (x) be a step-function with a finite num-ber of steps, none of whioh is situated at the point :lJ == 0) and put

b=J~00r 1tID (2:). Then it follows from (121) that the distribution

of the variable ZT+bT may be regarded as composed of onenormal component (arising from the term containing (To) a:n,d anumber of independent Poisson distributions, each of whichcorresponds to one step ofn (x).

In the general case~ the distribution of Z.,. is always composedof the normal component ~ (:e/(ao\IT) and another oomponentcorresponding to the term containing (1 (x) in (121) ..

\Ve now proceed to the proof of Theorem 27.. Let us firstconsider the s.d. Vi.Xt(T). From the fundamental relatlons (119)iand (120) it follows that we have

«! (~1+Ta):= (.(2 (Tl) + tX, (TI)·

The only non-negative solution of this functional equation is,however.. l

(122) «.s (T) == aar,where 0'2 ~ 0 is a constantt From (122) we deduce

(123) J(t, ~'T) = 1- i&altttlTwith 1.& t ~ 1, so that !(t, L\T)-->-l as AT~O. According to (120)it then follows that, for every fixed t, f (t, 7) is & oontinuousfunction of T ..

From (120) we obtain further f(t, l/n)={!(t, l)}l/1t, and hencefor all rational mIn we havef(t, mIn) ={f(t, l)}m/~ .. By continuitythis result extends immediately to all 'T > 0, 80 that we havegenerally

1 Oi.. Hamel (1]" Hauadorti [1], p. 17;;.

Page 100: Random Variables and Probability H Cramer (CUP 1962 125s)

(125)

94 A OLASS OF STOCHASTIC PBOCJESSES

(124) f(t,T)={f(t,l)}T_

According to (123), the expression

f(t, t1T)-l_ {f(t, 1)}QT-ldT - d7

is, for every fixed t, bounded as ~T-+-O. It follows thatf(t, 1).,&0for all real t, and thus the expression (125) converges uniformlyin every :finite t-interval to the limit

(126) lim j(t, ~'T)-1 =logj(t, 1),AT-+-O T

where logJ(t, 1) denotes that branch of the multi-valued fnnctionwhich vanishes for t:= 0 and is for all real t uniquely determinedby continuity.

On the other hand we have

Putting

(128)

H (z, A.T) is a never decreasing function of x suoh that

H (-C1J,AT) =0, H (+00, 6:r) =0'2.

For every fixed AT>O, H(x,Ar) is continuous at x=O, andwe have

1 foc fGO e*-1- itxA (eitz - 1-i~)dF (x, .aT)= i dB (x, A:T),~T _~ _~ x

where, for x=O, (e1h -l-itx)jz" is to be interpreted as -tll/!.According to (124), (126) and (127) we thus obtain

fcc e'ib: -l-itx(129) log! (t, T) = T lim 2 dB (x, 1ia'T).

A'T-+-O -00 x

Consider now the function H (x, aT) for a sequence of valuesa1'r, L\eT) ... tending to zero. It is then always possible to choosea sub-sequence Ant T, LlntT, ••• such that the oorresponding fune ...

Page 101: Random Variables and Probability H Cramer (CUP 1962 125s)

A OLASS OF STOCHASTIO PROOBSSBS 90

tiona B (z, ~n tT) tend to & limit H (3:), in &ll oontinuity points :eof the latter. From (129) we then obtain

fco e1l:J:-l-itx(130) logf(t, 1') =='1' 2 dB (:r:).

-00 zObviously H (x) is a never decreasing function such that

H(-co)~O, H(+oo)~a2.

We can, however, show that in both these relations the sign ofequality must hold. We obtain in fact from (130) for small values

oft logf(t,'T)= - t'Tt2 {H (+co)-H( -oo)} +0 (tJ) ,

but on the other hand (122) gives

logf(t,T)= -iat.rt2 +o(tI),

so that we must have

B (-00)=0, H( +00)=(12.

Let, now, O'~ denote the saltus of H (x) at the point :£=0 (thuso~ (1~ ~ (12) and put

(131) n (z)=B (~)-aJE(X),

«:(x) being defined by (17). Then we have

fi( -00)==0, n (+00) =oi= al-aJ.Further, n (%) is bounded, never decreasing and continuous atz=O, and (121) follows immediately from (130), so that the :firstpart of the theorem. is proved..

The latter part of the theorem is obvious in the particular casewhen n (x) is a step-function with a finite number of steps.(Of. the remarks made above.) Further, ifn(z) is any functionsatisfying the conditions of the theorem, the second member of(121) may be uniformly approximated by means of a, sequence ofstep-functions converging to the limit n (x). By Theorem 11, thecorresponding dllf.'s tend to a limit which is itself a d.f., and thesecond member of (121) is equal to the logarithm of the c.f. ofthis limit. Thus (121) determines uniquely a d.f. F(z,-r), and itfollows immediately from the form of (121) that the fundamental

Page 102: Random Variables and Probability H Cramer (CUP 1962 125s)

96 A CLA.SS OF STOOHASTIC PROCESSES

relations (119) and (120) are satisfied, 80 that the~ proof ofTheorem 27 is completed.

Since (XS (1') is finite, (130) may be twice differentiated withrespect t.o t, and we obtain

fa>e1JzdH (~)= -~ :;'logj(t, T).

But H (:e)/ut is It d.f. whioh is tlniquely determined by its a.f.It follows that we must reach the same limit H (z) for everysequence Al T, 4,.", a«. tending to zero. This implies, however'I thatwe have lim H (z, AT) =H (x) in every continuity point of H (x).

4..1->-0

This leads to an interesting interpretation of Theorem 27. Forx< 0, we have by (128) and (131) in every continuity point ofa (x)~ as AT40,

Ji'(~7~'T)== I~«>tllI ~~.,.)~f:<x>dn~E}="1 (X),

and for ~>O,

~ -F (x, AT) =f«>~B U, A'r}-ioJ"'OCdOU)=-: n.(~).A:" :z; es :e es

This may be MittenF (x, 41")= fit (2:) 6.1"+0 (6.1"), (z< 0',

I-F (x, AT) = fl. (z) dT+O (AT), (z> 0),The probability that, during the infinitely small time AT, 8,

variation < x < 0 ocours in the value of the variable Z.,· is thusasymptotically equal to fil (x) AT, while the probability of avariation > x> 0 is asymptotically equal to il. (2:) A-r ~

Thus the function (1 (x) determines the discontinuous part ofthe variation ofZ,., while obviously the constant Godetermines theCO'TdinUC1U8 pan..1 Further we have

J"'O :r:2dnt (;t}+f«lxii dilt (~) I_fflO dO (a:) =ai~-~ 0 -~

as (r)="=o1"+o1"1',1 It should be noted that the d..f" F (~t 1') is aJway. continllOUi with respect to T,

although tb~ variable Z1' mI.,. 8uifer diIconiinuoua :Jha.npI of value, if Q (~) is notidentie&1ly zero.

Page 103: Random Variables and Probability H Cramer (CUP 1962 125s)

A CLASS OF STOCHASTIC PROCESSES 97

S9 '&hat the variance «t ('t) of ZT is the sum ofone term due to theogfltinuous part of the variation and one term due to the dis-continuous part.

3.. By means of the remarks made in § 1,. it will be easilyunderstood that- the s..Li.. process, as defined in §2, presents &,

great analogy with the "case of equal components" in theproblem ofaddition of independent variables treated in ChaptersVI-VII.! Roughly speaking, we are here concemed not with &

8vm, but with an integral" the elements of which are independentrandom variables (cf. Levy [2]).

It, is tl1en fairly obvioUiS that our ~previous theorems bearing onthe case ofequal coolponents, ~uch as Theorems 20 and 25, sllouldhol•..t mU~f1.li8 it1/u.f,a.rulis, also for the case of a a.Lie process.. Infaot~ the 'Variable Z,./(uv~) with the d.f.

~ (z) T) = F (u.'t ,/7",,-r)

and the a.f. f(t,T)=J(tj(av'r),T)

is directly analogous to the previously considered variable(X1 -s- ~ .. + Xfl)!(ay'n) with the d.f. ti_ (x) and the c.f. fn (t).Instead of the discontinuous parameter n, we are here concernedwith the continuous parameter T.

The relation (121) may be written

fQj eit%-1- itx- t (itX)2logj{t,T)== -ia'rt2+,. I tlO{:t}.

-ex;) xSubstituting here t/(t1VT) for t, we obtai11

(132) .,",co Uoe itx 1 l ike \ 2

t2 J eav'I"-I- ay'T-2\aV'T)logf(t,T)=--2+ T 2 -·-d11 (:r).

-eo x

1 If we omit the condition laid down at the beglnning of § 2 that the distributionof the increase Z"':t.+'" - Z"l should be independent of 1"ilt we arrive at a mare generalkind of random process related to the general problem of addition of independentvariables in the same way as the process here considered: is related to the particularcue of equal components. Subject to appropri&te conditions, Theorems 27-30 canbe generalized to this caN. (For &t generalization of Theorem 21 along these linesof. Levy [21, who considers also the case when at, (or) i' not finite.)

Page 104: Random Variables and Probability H Cramer (CUP 1962 125s)

(183)

98 A OLASS OF STOCHASTIC PROCESSES

In a way which is closely similar to the proof of Theorem 20, it isnow easily shown that the 1&st term of this expression tends tozero as 1'"-700, uniformly in every finite t-interval. We thus havethe following theorem directly analogous to Theorem 20.

Theorem 28.1 .A8T-+OO,tked,.J.~ (x) T) ojtkemriableZ.,./(aVT)tends to tke 'JWN)'l,Q], Ju/nction <1l (3:).

In order to obtain an asymptotic expansion of g. (z, 7) forlarge valuesof'T~ analogous to the expansion given by Theorem 25,we shall suppose henceforth that there is an integer k ~ 3, suchthat the absolute moment of order k- 2 of the function Q (x)occurring in (121) is finite. We put for v=3, ... )k

1 fcoAv=-; zv-st1Q (z),a -co

1Icop)l=-; IzIV-2dO(x),a -eo

VTTm=4 Ilk·

Pk

These notations are analogous to those introduced in VII, §3, by(84) and (88). We can now prove the following lemmas, whichare directly analogous to Lemmas 2 and 3.

Lemma 5. For It I~ iYP/w we havet

lk-S P.. (·t) e

ejf(t,T)==l+ ~ vv~ +Tk~2(ftfk+JtI3(k-2»),p-1 7" k-r

tJfn..ette~ (it) is the polynomial of degree 3v in (it), whick ill obtaineJlby f'eplaci'n{} in the polyn,om,ial P1m (it) 0/ Lem'flUJ, 2 the fJ.'UCIntitiuAvn defined by (84) by the quantities Av defined by (133).

Lemma 6. For It I~ TkT we M/vet'

If(t,T)I~e-3.

The proofs of these lemmas, which are based on the relation132), are so closely similar to the proofs ofLemmas .2 and 3 that

:4 Levy [2J.

Page 105: Random Variables and Probability H Cramer (CUP 1962 125s)

A CLA.SS OJ' STOCHASTIC PBOCB88BS 99

they need not be explicitly given here. Finally, putting in analogyto (101)

k-3 E (-cz,)(134) tf (x, T) =cJ) (x) + ~ "T*/I + Rk (:v, T)

v-I

where P3i1-1 (x) is a polynomial ofdegree 3v-l in~, independentof "T, we obtain in the same way as in VII, §3 the following funda­mental Lemma corresponding to Lemma 4.

Lemma 7. For o<w< 1, we havejor all real a: and all k> 0

(U f:+A (1/-X)oo-1 RiIl(1f.T)dy=ek(f:~Ift~~:) Idt+ T~li)'

11 the 'ntegral in the second, member of this relation is CfYR,vergentlor w = 0, we .furtM'r have

(f~ 1f{t,T) I 1 )Ric (x, 'T') == ail: TM t dt +Tt"l! ·

Proceeding in the same way as in VII, §§ 4--5, we can now useLemma 7 to obtain information as to the behaviour of tv (x, 'T) forlarge values ofT. In the first place, we have the following theorem,the proof of whioh is direotly analogous to that of Theorem 24&nd need not be giverl here.

Theorem :29. 1/ the quantity Pa defined by (133) i8 finite,weMve

log..,Ii.J (x, 1') -41 (x)! < OPa y''T '

where 0 is an ab80lute conetant.Further, we can now prove the following theorem which gives

an asymptotic ex~nsion of ty (:e, 1") analogous to th&t~ obtainedin Theorem 25~

Theorem 30.1 Suppose that the variable Z., cO'/l.8iAlered inT1u?Prem 27 8lJtiaftea thefoUou~ing conditions :

1 Cramer (4).

Page 106: Random Variables and Probability H Cramer (CUP 1962 125s)

100 A CLASS OJ' STOOHASTIC PROOESSEe

(1) PIN ah80lute fIU'Jme1&t Pi OIl clefiw by (133) is finite for 80me

integer Ie~ 3;

(II) For SOOJ,8 T > 0, the d.l. F (%, ,,) 8tJ4Vftea the etmrlitsOfl, (0)ofvn, §5.

For the il.!. it (~, T) 01 the Vf.W'ia,ble Z,./(ay'r), we the"" have theezpo/nttion (134) toith

(135) IRj: (x, T) 1< -r'-~,M being independent oj T aM z.

Further, afl,1/ of the JoUowing c<YIII1itic'M (IIa) and (lIb) t.8Ujficie:nt for tAe validity 01 (n) ..

(na) 0-1>0,-(lIb) n (x) =01 (z) +0. (~)t where 0 1 (:c) aM 0. (z) are both

never tlecretUj,ng, wkiJe {}1 (:t) 1,8 ob8olutely continuous (J/nd, tlou not"ed1JlC8. 'W a tXmBtant.

If (II) is satisfied for a single 1">0, it follows from (121) thatthe same thing holds for every T> 0, and thus in particular forT= 1. From Lemma 7 we obtain according to (124)

r.c+hw J.c (y-x)-lB~(y,T)d1/

( i CC> If (t) 1) I'" pI )= E)Ie ('f-t.lJ.r-t»ll tQJ+l til+ 7._1)/1 ,

1/('''pJI~) T

which corresponds to (110). By me&ns of the condition (0) wethen obtain

jw J:+1s (lI- X)-1BA;(Y,T)d,1 <M(S:+.,-<»-I)'I),and the rest of the proof of (135) is perfectly similar to the proofof Theorem 25. The last part of the theorem is easily proved byconsidering tIle real part of log!(t: T) according to (121).

Page 107: Random Variables and Probability H Cramer (CUP 1962 125s)

THIRD PART

DISTRIBUTIONS IN RI;

The object of this Part is to sho,v that many of the resultsobtained above for distributions in a one-dimensional space canbe generalized to any number of dimensions. We shall, in themain, restriot ourselves to a brief discussion of some typicalgen~ationsof this kind.

CHAPTER IXI

GENER.~L PROPERTIES.CHARACTERISTIC FUNCTIONS

1. For a distribution in a one-dimensional space, the onlypossible discontinuities arise from discrete points which~ in termsof the meohanical interpretation used in Chapter II, are bearersof positive quantities of mass. As soon as the number of dimen­sions exceeds unity, the question of the discontinuities becomes,howe~"'er) more cor1'lplicated. Thus in a lc-dimensiona.l space, thewhole mass may be conc.entrated to a SUb-space of less than Jcdimensions (line, surface, .... ), though there is no single point thatearries &, positive quantity of mass.

Given a random variable X = (El' ... , ek) in the k-dimensionalspace Rk , we denote as in Chapter II the corresponding pr.f. byP(S) and the d.f~ by F (Xl' ou,Xk). Just as in the case lc== 1, therecan at most be a finite nun'lber ofpoints A such that P (A) > a > 0,and hence at most an eIlu.merable set of points B such thatP (B) > o. We Sohal] call this set the point spectrum of the dis­tributioD.

1 The geueraJ theory of completely additive set functions in a hedimensionaJ spacehaa been developed by Radon [11, Bochner [2] and. Ha.viland [1,2.3]. .A compre..helsive account of the principal results of the theory is giv~nby Jessen..Wintner [1J.

Page 108: Random Variables and Probability H Cramer (CUP 1962 125s)

102 GENERAL PROPERTIES

According to II, § 3, every component 'i of X is itself a randomvariable, and the corresponding (one·dim.ensional) distributionis found by projecting the original distribution on the axis of ~i.

Let Qi be the set of real numbers which are discontinuities ofthe distribution of fi' and form the (at most enumerable) setQ= Q1+... +Qt. Further) let J denote a lc-dimensional interval

ai<'i~bi'

and consider the probability P(J) of the ','Oevent" X cJ as 8,

function of the variables al and hI.. It is then obvious that, aslong as no fJi and no bi. belong to the set Q, P (J) is a ~i1t/UOU8

function of these variables.Any interval J such that no Q,i 1l,nd no hi belongs to Q will be

called a, continluity intervaZ ofthe distribution. Iftwo distrihutionscoincide for every interval whioh is a continuity interval for bothdistributions, it follows from Theorem 2 that the correspondingd.f. '8 are al,vays equal, and thus by the same theorem thedistributions are identical..

If. a sequence of pr,f.'s {Pn (S)} converges to a completelyadditive set function P* (8) in ev"'ery continuity interval of thelatter, we shall say simply that {Pn, (8)} confJergea to p. (8). ThesymbolPn. (8)~P* (8) will be used only in this sense. Fromeverysequence {Pn, (S)} it is possiblel to choose a Bub-sequence whiohconverges in this way to a limit p* (8). Obviously we cannot illgetteral assert that p* (8) is a probability function, 88 we onlyknow that 0~ p* (R1:) ~ I.

Any pr..f. can always2 (cf. Theorem 4) be uniquely representedas a sum of three components

(136) P (8) =aIP1 (8)+all111 (8) +aIII~II(8),where Ox, all' alII are non-negative numbers with the sum 1,while~, l1x, PIlI are pr.f.'s such that

ll(S) is absolutely continuous; .&(S)=JsD(X)tlX. where

1 Radon [1]. This is proved in practically the same way as in the one..dimenaioDalcue

:': Radon (t]..

Page 109: Random Variables and Probability H Cramer (CUP 1962 125s)

GENERAL PROPBRTIES 103

D(X) is a, non-negative point function in Rk which is called theprobability de'lt8ity or densitY!'Ullldion of the distribution definedby 1\ (8).

Ii:I (8, :is pnrely discontinuous; ~I (8) = 1 if S coincides withthe point spectrunl of P (8).

1\11 (8) is "singular"; the point spectrum of ~II (8) is emptyand there existsa Borel set S ofmeasure zero such that~II (8) == 1.

2. A real-valued function g (X) which is :finite and uniquelydefined for all pointsl of Bit is, according to II, § 3, a randomvariable with a uniquely defined one-dimensional distribution.By (I5a) we have for the mean value of this variable the expres-

~on fE(g(X»= g(X)d.P,B.t

subject to the condition that the integral isabsolutely convergent.The mean values of the particular functions

g (X) =fit ... ~t1t, (Vi =0,1,2, ... )

are called the '1fI..011ten:t8 of the distribution. We shall use thenotations m( = E (fi ),

1-£'£;= E «gi-mt) <fj - mJ)),

~ == D2 (et) = fLii = E «ei - mi)I).

Putting rij = f-Lii ,Ui(7j

it is thell ea.sily shown that we have - 1~ riJ ~ 1, and that theextreme values 'ti1 = ±1 can only be reaohed if, in the two­dimensional distriblltion of the H combined" variable (E" f;),the whole mass is situated on one of the straight lines

<fi - mi)/t:1f,= ± (,;-m;)/o-J-

"; is called the ooeJficient oj correlatiO'lt between '1. and fs' andplays all important part in the statistical applications.

More generall)', he (!uadTat,ic forr.n

I; J.-t,;; u~U l ::= j~ (~Ut~i)2dPf.t>t;'" .. I4: i

1 Ex("ept posaibly fur c~rt~il pl)intB formir)g a $ftt :E such thn.t P (1:) =0..

Page 110: Random Variables and Probability H Cramer (CUP 1962 125s)

104 GBNERAL PROPERTIES

18 never negative, which implies that the determinant II f'illl, aswen as all its principal minors, is ~ o.

3. The c1l4ract,eriBticfu'l"ldion ofa distribution in Rk is the meanvalue

(137) f(t1J ... ,t't):aE (ei (ltil+...+lktl»)==j e'i.(llfl+...-+tl;fk>dP.BI:

Unles~ explioitly stated. otherwise, the t, will be considered asreal variables, so that I may be considered as a function of thereal point (tx, •••, tic) in R,,_

ObviouslyJ is a uniformly continuous function of the t, in thewhole space, and we have always IJ I~ 1. The generalization ofTheorems 6-8 to any nutnber of dimensions is comparativelyeasy, and will not be de&lt with here.

If all moments np to a certain order are finite, we have forsmall values of It, I an expansion of,' analogous to (25). If, inparticular, all #L1,1 are finite and all m'l are equal to zero, we have

(138) J(tl~ .•• ,t/e)= 1-1 LP.iJt,tJ+Q(~t~).iti ,.

We shall now consider the generalization of Theorem 9, andfor the sake of formal simplicity we take first the case of a two"dimensional space R2• The generalized theorem will then be asfollows: If the interval J defined by

Xl < '1 ~Xt+'1tt, 2:. < t, ~Zt+1I,2is a continuity interval of the distribution, we have

(139) P (J)== F (xl ..~kl'&C1 +ka) - F (Xl +k1) XI)

-1' (2:1' xa+ha )+F (Xl' XI)

1 JT IT l-e-itlhll-e-Uahs= lim A-a 't,. 't e-'(/l:Z:l"tll~f(t1• ta)dt1dta•

7'-+- t.O "S'H -7,1 -'1' ~ Z 2.

We have, in fact, for the quantity behind the sign lim. , the'I'.-,..a:>

expression

Page 111: Random Variables and Probability H Cramer (CUP 1962 125s)

GENERA.L PROPERTIES 105

where,p. ('l) =!JTsin t (f, -,xtl)dt- !f2"Sint ('i -: , - hi)dt.

t 11 0 t 11" 0 t

As T --+ CLJ, the product "1 eel),pl <ft) tends to unity for everyinterior point (~1' e,J of J, and to zero for every point (~l' '2)outside J ..

It is then obvious that the proof of (139) can be performed byan easy extension ofthe argument used in the proofofTheorem 9"As in the case ofTheorem 9, we deduce immediately the corollarythat a two·dimensional d.f. is uniquely determined by its c.f.

It is clear that the argument is perfectly general, so that \\"emay state the following theorem.

Theorem 9a ..1 If tke k ..dimensioool imerv&l J defined by:Ci<'i~Xi+ht (i=l, 2, ... , k) is f1, continuity interval for tkeprobability function P (S)t we have

_.. __I_JT fT l-e-ilthl t> l-e-i1lkJ..P(J)- lim (2)k .... at ••• ..

p-+-CX) 1T -T -T $ 1 Itk

X e-i(11~1+.....+l1JJ}£)!(tl' ... ,tk)dt1 .... dlA,.

Hence it follows that Q, probability distribution in Rx.. is uniq'ilelydetermined by its characteri8tic function.

4. Before proceeding to further generalizations of a similarkind, we shall 110W introduce a method of induction2 '\vhich calloften be used for the extension of theorems on one-dimensiollaldistributions to any number of dimensions.

Consider a, random variable X = ('1' ... , 'Ie) in Rk ""ith thepr..f. P (8). Let T = (t1 , ••• , tk ) denote a fixed point in Rk such tllatT::I.:(O, ••• ,0)) and consider the one-dimensional random variable

U =ll'l+ ... +tkeA..The c.f. of this variable is by (137)

(140) E (eUU) =E (eit(tlEl+••• +l~ ~!») = f (U1, .... , Uk).

This is a relation between the c.f. of a k-dimensional variable X


¹ Romanovsky [1], Haviland [3].  ² Cramér-Wold [1].


and the c.f. of a certain associated one-dimensional variable U. Since both t and t_1, ..., t_k are arbitrary, it will in many cases be possible to use this relation for the purpose in view.
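For the two-point distribution considered above as an example, the relation (140) may be checked directly: U = t_1ξ_1 + t_2ξ_2 takes the values ±(t_1 + t_2), each with probability ½, so that

E\bigl(e^{itU}\bigr) = \cos\bigl(t(t_1+t_2)\bigr) = f(tt_1, tt_2).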

Denoting by S_{T,x} the half-space defined by the inequality

(141)  U = t_1ξ_1 + ... + t_kξ_k ≤ x, we observe that P(S_{T,x}), considered as a function of the real variable x, is the d.f. of the random variable U. From (140), we now obtain in the first place the following theorem, the one-dimensional case of which is, of course, trivial.

Theorem 31.¹ If two probability functions in R_k coincide for every half-space S_{T,x}, they are identical.

In order to prove this theorem it is sufficient to remark that, by hypothesis, the associated variable U has one and the same pr.f. in both cases. Thus in the relation (140) the first member, being the c.f. of U, has the same value in both cases. Putting t = 1, it then follows that the c.f.'s of both distributions coincide for T ≠ (0, ..., 0). For T = (0, ..., 0), both c.f.'s assume the value 1. Thus the c.f.'s are always equal, and then by Theorem 9a the corresponding distributions are identical.

5. We now proceed to the generalization of the important Theorem 11.

Theorem 11a.² Let {P_n(S)} be a sequence of pr.f.'s in R_k, and {f_n(t_1, ..., t_k)} the corresponding sequence of c.f.'s. A necessary and sufficient condition for the convergence of {P_n(S)} to a pr.f. P(S), in every continuity interval of the latter, is that the sequence {f_n(t_1, ..., t_k)} converges for every T = (t_1, ..., t_k) to a limit f(t_1, ..., t_k), which is continuous at the point T = (0, ..., 0).

When this condition is satisfied, the limit f(t_1, ..., t_k) is identical with the c.f. of P(S), and {f_n} converges to f uniformly in every finite interval.

That the condition is necessary is proved by a straightforward

¹ Cramér-Wold [1].  ² Romanovsky [1], Bochner [2], Haviland [3], Cramér-Wold [1].


generalization of the argument used in the one-dimensional case. It thus only remains (cf. the proof of Theorem 11) to prove that, if f_n(t_1, ..., t_k) converges to a limit f*(t_1, ..., t_k), uniformly in |t_i| < a, then P_n(S) → P(S), where P(S) is a pr.f.

Let T = (t_1, ..., t_k) be a given point in R_k such that T ≠ (0, ..., 0), and consider the sequence f_n(tt_1, ..., tt_k), where t is a real variable. By hypothesis this converges for all t to a limit, which is continuous at t = 0. According to the preceding paragraph, f_n(tt_1, ..., tt_k) is, however, the c.f. of the d.f. P_n(S_{T,x}). Thus by Theorem 11 we have P_n(S_{T,x}) → F_T(x) in every continuity point of F_T(x), where F_T(x) is a d.f.

From {P_n(S)}, we now choose a sub-sequence which converges to a limit P*(S), in every continuity interval of the latter. Then it follows from the above that, in every continuity point of F_T(x), we have P*(S_{T,x}) = F_T(x). Allowing here x to tend to infinity, it follows that P*(R_k) = 1, and thus P*(S) is a pr.f., which we denote by P(S).

In exactly the same way as in the proof of Theorem 11 we can now show (using, of course, Theorem 9a instead of Theorem 9) that every convergent sub-sequence of {P_n(S)} converges to the same limit P(S). This is, however, equivalent to the statement that the sequence {P_n(S)} converges to P(S). Thus Theorem 11a is proved.

We shall not enter here upon the question of a k-dimensional generalization of Theorem 12.

6. Let us consider two mutually independent variables X_1 and X_2 in R_k. The pr.f.'s will be denoted by P_1 and P_2, and the c.f.'s by f_1 and f_2, respectively. The sum X_1 + X_2, formed according to the ordinary rule of vector addition, is a k-dimensional vector function of the combined variable (X_1, X_2), and thus according to II, §6 (cf. V, §1), X_1 + X_2 is a random variable in R_k, with a probability distribution uniquely determined by P_1 and P_2. We shall now prove the following theorem, which corresponds to Theorem 13.


Theorem 13a.¹ If X_1 and X_2 are mutually independent random variables in R_k with the pr.f.'s P_1 and P_2, and the c.f.'s f_1 and f_2, then the sum X_1 + X_2 has the c.f.

(142)  f(t_1, \dots, t_k) = f_1(t_1, \dots, t_k)\, f_2(t_1, \dots, t_k),

and the pr.f.

(143)  P(S) = \int_{R_k} P_1(S - X)\, dP_2 = \int_{R_k} P_2(S - X)\, dP_1,

where S − X denotes the set of all points Y − X, where Y belongs to S. As in the one-dimensional case, the relation (142) for the c.f.'s

is an immediate consequence of the definition of the c.f. according to (137).

Let us now consider the second member of (143). For every fixed set S, P_1(S − X) is a bounded, non-negative and B-measurable function of the point X, so that according to I, §3, the integral always exists.² Obviously the value P(S) of this integral is a completely additive non-negative function of the set S which, for S = R_k, assumes the value 1, i.e. a pr.f. If, now, we can show that the c.f. of P(S), which we denote by \bar f, is identical with f as given by (142), it follows that P(S) must be the pr.f. of the variable X_1 + X_2, and then by reasons of symmetry P(S) will also be equal to the third member of (143), so that the theorem will be proved.

If the set S is a half-space S_{T,x} as defined by (141), it follows from the expression of P(S) as an integral that the one-dimensional d.f. P(S_{T,x}) is the composition of P_1(S_{T,x}) and P_2(S_{T,x}), or in the notation of V, §1,

P(S_{T,x}) = P_1(S_{T,x}) * P_2(S_{T,x}).

¹ Bochner [2], Haviland [2], Cramér-Wold [1].
² This may be shown in the following way. If, in particular, S is an interval of the type ξ_i ≤ x_i considered in II, §2, we have, putting X = (ξ_1, ..., ξ_k), P_1(S − X) = F_1(x_1 − ξ_1, ..., x_k − ξ_k), where F_1 is the d.f. corresponding to the pr.f. P_1. Thus in this case P_1, regarded as a function of X, is bounded, non-negative and B-measurable. From the complete additivity of P_1(S − X) with respect to S it then follows that the same properties must hold for every Borel set S in R_k.


Thus according to Theorem 13 the c.f. of the first member is the product of the c.f.'s of both components in the second member, which gives according to (140)

\bar f(tt_1, \dots, tt_k) = f_1(tt_1, \dots, tt_k)\, f_2(tt_1, \dots, tt_k).

Putting here t = 1, we obtain the desired result. As in the one-dimensional case, we shall say that P(S) is composed of the components P_1(S) and P_2(S), and we shall use the abbreviation

(143a)  P = P_1 * P_2 = P_2 * P_1.

For the sum X_1 + X_2 + ... + X_n of n mutually independent random variables in R_k we have the pr.f. P = P_1 * P_2 * ... * P_n, and the c.f. f = f_1 f_2 ... f_n.
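Thus, for instance, if every X_ν has the two-point distribution used as an example in §3 above, the sum X_1 + ... + X_n has the c.f.

f(t_1, t_2) = \cos^n(t_1 + t_2),

and by Theorem 9a this c.f. determines the distribution of the sum uniquely.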


CHAPTER X

THE NORMAL DISTRIBUTION AND THE CENTRAL LIMIT THEOREM

1. In order to generalize the normal distribution to the space R_k, it is convenient to begin with a discussion of the characteristic functions. For a normally distributed one-dimensional variable with the mean value m and the s.d. σ, we have, according to VI, §1, the c.f.

f(t) = e^{imt - \frac12\sigma^2 t^2}.

A perfectly natural generalization of this expression to k variables is obtained by putting

(144)  f(t_1, \dots, t_k) = e^{\, i\sum_r m_r t_r - \frac12\sum_{r,s}\mu_{rs} t_r t_s}.

Assuming that f(t_1, ..., t_k) as defined by this expression is the c.f. of a probability distribution in R_k, it is seen by expansion in MacLaurin's series that the m_r and μ_rs are the first and second order moments introduced in IX, §2. We have, in fact, the following theorem.
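In fact, a direct verification is immediate. Differentiating (144) we obtain

\left.\frac{\partial f}{\partial t_r}\right|_{t=0} = i\, m_r, \qquad \left.\frac{\partial^2 f}{\partial t_r\,\partial t_s}\right|_{t=0} = -\mu_{rs} - m_r m_s,

and comparing with the general relations ∂f/∂t_r|_{t=0} = iE(ξ_r) and ∂²f/∂t_r∂t_s|_{t=0} = −E(ξ_rξ_s), it follows that E(ξ_r) = m_r and E((ξ_r − m_r)(ξ_s − m_s)) = μ_rs.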

Theorem 32" Fn-t anN real m,. and P.r, =:~. 8'UCh tAat thequ.a.ilralic form ()= 'J;.U$ntrt,. i8 never tt£gative t j(t1.. ....... tic) as

r~,.

de.li:fted by (144) i8 tk.e c.f. of a probability distrib'Uiion in RJ!i u-kickwill be called a n&r1:nnl di8wiJJUtion. 'I'kt!.loUou'ing two CfJ8d fnlJ1j

occur ;;-

(A) If the form Q is definite positive, the corresponding distribution is absolutely continuous and will be called a proper normal distribution. The density function of this distribution is

(145)  p(\xi_1, \dots, \xi_k) = \frac{1}{(2\pi)^{k/2}\sqrt{\Delta}}\; e^{-q/2},

where Δ = ||μ_rs|| > 0 and q = (1/Δ) Σ_{r,s} Δ_rs(ξ_r − m_r)(ξ_s − m_s), Δ_rs denoting the cofactor of μ_rs in Δ, is the reciprocal form of Q, with the variables ξ_r − m_r.



(B) If the form Q is only semi-definite, the distribution is of the singular type and will be called an improper normal distribution. For this distribution, the whole mass is situated in a certain subspace of less than k dimensions, defined by one or more linear relations between the ξ_r (straight line, plane, hyperplane). Every improper normal distribution may be represented as the limit of a sequence of proper normal distributions.¹
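The simplest case k = 2 may serve as an illustration. Writing μ_11 = σ_1², μ_22 = σ_2², μ_12 = ρσ_1σ_2, the form Q is definite positive when |ρ| < 1; then Δ = σ_1²σ_2²(1 − ρ²) and (145) reduces to the familiar density

p(\xi_1, \xi_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\Bigl(-\frac{1}{2(1-\rho^2)}\Bigl[\frac{(\xi_1-m_1)^2}{\sigma_1^2} - \frac{2\rho(\xi_1-m_1)(\xi_2-m_2)}{\sigma_1\sigma_2} + \frac{(\xi_2-m_2)^2}{\sigma_2^2}\Bigr]\Bigr).

For ρ = ±1, on the other hand, Q is only semi-definite, and the corresponding improper normal distribution has its whole mass on the line (ξ_1 − m_1)/σ_1 = ±(ξ_2 − m_2)/σ_2.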

In the case (A) we have to show that (145) is a density function, the c.f. of which is identical with (144). Consider the integral

G(u_1, \dots, u_k) = \frac{1}{(2\pi)^{k/2}\sqrt{\Delta}}\int_{R_k} e^{\sum_r u_r\xi_r - q/2}\, d\xi_1 \cdots d\xi_k,

where, until further notice, u_1, ..., u_k are real variables. By means of the substitution

\xi_r - m_r = \sum_s \mu_{rs}(v_s + u_s),

we obtain

\sum_r u_r\xi_r - \tfrac12 q = \sum_r m_r u_r + \tfrac12\sum_{r,s}\mu_{rs} u_r u_s - \tfrac12\sum_{r,s}\mu_{rs} v_r v_s,

and hence

(146)  G(u_1, \dots, u_k) = e^{\sum_r m_r u_r + \frac12\sum_{r,s}\mu_{rs} u_r u_s}.

For u_r = 0 we obtain G = 1, which shows that (145) is a density function. Since both members of (146) are integral functions of the u_r, (146) holds also for complex values of the u_r. Substituting it_r for u_r, G becomes the c.f. corresponding to the

¹ The distinction of proper and improper normal distributions is, of course, relative to the space R_k in which the distributions are considered. A proper normal distribution in R_k becomes an improper distribution as soon as it is considered as a distribution in a space R_K with K > k.


density function (145), and (146) shows that this is identical with (144).
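The intermediate step leading to (146) uses only the ordinary Gaussian integral: the substitution has the Jacobian ∂(ξ_1, ..., ξ_k)/∂(v_1, ..., v_k) = Δ, so that

G(u_1, \dots, u_k) = \frac{\Delta}{(2\pi)^{k/2}\sqrt{\Delta}}\; e^{\sum_r m_r u_r + \frac12\sum_{r,s}\mu_{rs} u_r u_s} \int_{R_k} e^{-\frac12\sum_{r,s}\mu_{rs} v_r v_s}\, dv_1 \cdots dv_k = e^{\sum_r m_r u_r + \frac12\sum_{r,s}\mu_{rs} u_r u_s},

the last integral being equal to (2π)^{k/2}/√Δ.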

In order to prove the case (B) of Theorem 32, it is sufficient to consider a sequence of proper normal distributions, allowing the μ_rs to tend to the coefficients of a semi-definite form Q, while the m_r are being kept constant. Obviously the corresponding c.f.'s converge uniformly to a limit of the form (144) with a semi-definite Q, and then Theorem 11a shows that this limit is the c.f. of a certain probability distribution in R_k. The determinant Δ is, of course, equal to zero for the limiting distribution, and it then follows from (145) that the whole mass is concentrated in the set of points defined by¹

\sum_{r,s}\Delta_{rs}(\xi_r - m_r)(\xi_s - m_s) = 0.

Now the determinant ||μ_rs|| is zero, so that this relation is equivalent to a certain number k_1 < k of linear relations between ξ_1, ..., ξ_k.

2. We now proceed to the generalization of Theorems 17-20. From the expression (144) of the c.f., the following theorem is immediately deduced.

Theorem 17a. The sum of two independent and normally distributed variables in R_k is itself normally distributed. If at least one of the components has a proper normal distribution, the same holds true for the sum.

It should be noted that, of course, the sum may have a proper normal distribution even if both components have improper distributions, since the sum of two semi-definite forms Q may well be a definite form.
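As an example of this remark, take in R_2 two independent variables X_1 and X_2, the first normal with c.f. e^{-\frac12 t_1^2} (all its mass on the ξ_1-axis), the second normal with c.f. e^{-\frac12 t_2^2} (all its mass on the ξ_2-axis). Both distributions are improper, but by (142) the sum has the c.f. e^{-\frac12(t_1^2 + t_2^2)}, and the form t_1² + t_2² is definite positive, so that X_1 + X_2 has a proper normal distribution.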

If a is a positive quantity, and M = (m_1, ..., m_k) is a point in R_k, we denote by (S − M)/a the set of all points (X − M)/a, where X belongs to S. If P(S) is the pr.f. of a random variable X in R_k, it is then clear that P((S − M)/a) is the pr.f. of the variable M + aX.

Theorem 18a. Let P(S) be a pr.f. in R_k with finite second order moments. If, to any points M_1, M_2 in R_k and any positive

¹ In order to avoid trivial difficulties, we assume here that there is at least one cofactor Δ_rs ≠ 0.


constants σ_1, σ_2, we can find M and σ such that

P\Bigl(\frac{S - M_1}{\sigma_1}\Bigr) * P\Bigl(\frac{S - M_2}{\sigma_2}\Bigr) = P\Bigl(\frac{S - M}{\sigma}\Bigr),

then P(S) is a normal pr.f.

Theorem 19a. If the sum of two independent variables in R_k is normally distributed, then each variable is itself normally distributed.

Theorem 20a. Let P(S) be a pr.f. in R_k such that the first order moments m_r are all equal to zero, while the second order moments μ_rs are finite. If X_1, X_2, ... are independent variables all having the pr.f. P(S), then the pr.f. (P(S√n))^{n*} of the variable (X_1 + ... + X_n)/√n tends, as n → ∞, to that normal pr.f. which has the same first and second order moments as P(S).

As in the one-dimensional case, we prove first Theorem 20a, from which Theorem 18a is deduced as a corollary. We then prove Theorem 19a by means of the induction method indicated in IX, §4.

The c.f. of (P(S√n))^{n*} is (f(t_1/√n, ..., t_k/√n))^n. From the relation (138) it then follows, in the same way as in the one-dimensional case, that this tends to the limit

e^{-\frac12\sum_{r,s}\mu_{rs} t_r t_s}

as n → ∞, uniformly in every finite interval. According to Theorem 11a, this proves Theorem 20a. Hence Theorem 18a is deduced in a perfectly similar way as in the one-dimensional case.
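The two-point distribution used as an example in IX, §3, shows the theorem at work: the c.f. of the corresponding normed sum is

\cos^n\Bigl(\frac{t_1+t_2}{\sqrt n}\Bigr) = \Bigl(1 - \frac{(t_1+t_2)^2}{2n} + o(n^{-1})\Bigr)^n \to e^{-\frac12(t_1+t_2)^2},

and the limit is an improper normal distribution, the form (t_1 + t_2)² being only semi-definite; the whole mass of the limiting distribution lies on the line ξ_1 = ξ_2.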

In order to prove Theorem 19a, we suppose that

P_1(S) * P_2(S) = P(S),

where P(S) is a normal pr.f., and thus have for the corresponding c.f.'s the relation

(147)  f_1 f_2 = f = e^{iL - \frac12 Q},

where L is a linear form and Q a non-negative quadratic form in the variables t_r.


Consider now the one-dimensional d.f.'s P_1(S_{T,x}), P_2(S_{T,x}), P(S_{T,x}), where S_{T,x} is the half-space defined by (141). By (140) the corresponding c.f.'s are f_1(tt_1, ..., tt_k), f_2(tt_1, ..., tt_k) and f(tt_1, ..., tt_k) = e^{iLt - \frac12 Qt^2}. Thus P(S_{T,x}) is a normal d.f., and since according to (147) we have

P_1(S_{T,x}) * P_2(S_{T,x}) = P(S_{T,x}),

it follows from Theorem 19 that P_1(S_{T,x}) and P_2(S_{T,x}) are both normal. We thus have

(148)  f_1(tt_1, \dots, tt_k) = e^{iL_1 t - \frac12 Q_1 t^2},

where L_1 and Q_1 are functions of t_1, ..., t_k. It follows from (140) that f_1(tt_1, ..., tt_k) is the c.f. of the variable U = t_1ξ_1 + ... + t_kξ_k, if we put X_1 = (ξ_1, ..., ξ_k). The nth order moment of U is thus a homogeneous polynomial of order n in the t_r. Thus in (148) L_1 is a linear form and Q_1 is a quadratic form in the t_r. Since (148) is a c.f. in t, the form Q_1 must be non-negative. Putting t = 1 in (148), it then follows that f_1(t_1, ..., t_k) is the c.f. of a normal distribution in R_k. The same holds, of course, for f_2(t_1, ..., t_k), and thus Theorem 19a is proved.

3. Theorem 20a constitutes the simplest case of the Central Limit Theorem for random variables in R_k. It is possible to find also k-dimensional analogues of the more general theorems proved in Chapter VI, §§3-6, and of the theorems on asymptotic expansions, etc. given in Chapters VII-VIII. We shall content ourselves here with giving the statement of a theorem which corresponds to Theorem 21, though it does not represent a complete generalization of what has been proved in the one-dimensional case.

Theorem 21a.¹ Let X_1, X_2, ... be a sequence of independent random variables in R_k such that every X_n has the pr.f. P_n(S) with vanishing first order moments and finite second order moments

¹ Theorems of a similar kind have been given by Bernstein [1], Castelnuovo [1] and Khintchine [1], [3].


μ_rs^{(n)}. Suppose that, as n → ∞, the following two conditions are satisfied:

(149)  \frac{1}{n}\sum_{\nu=1}^{n}\mu_{rs}^{(\nu)} \to \mu_{rs} \qquad (r, s = 1, 2, \dots, k),

where the μ_rs are not all equal to zero, and

(150)  \frac{1}{n}\sum_{\nu=1}^{n}\int_{|X| > \varepsilon\sqrt n} |X|^2\, dP_\nu \to 0

for every ε > 0, where |X| denotes √(ξ_1² + ... + ξ_k²).

Then the pr.f. of the variable (X_1 + ... + X_n)/√n converges to that normal pr.f. which has the first order moments zero and the second order moments μ_rs.

"f'hl'S tn~",.=.e.u1 can be pro,,~ed by a direct generalization of theproof of r~ecrem 21, which) of course; requires & little morecalcwa.tu'Jn than In tt...e one·dimensional case t but does not involveany new difficulty of principle. Obviousl)" the condition (150)is analogous to the Lindeberg condition (64). It should be ob­~rved that the limitIng distribution nlay well be an impropernonnal di~tribution,viz.. if the corresponding form Q is semi­qefi.rlite.. Ob-llously this may occur even in a case wIlen all thefuncl"ions ~,(li) are absolutely continuous.


BIBLIOGRAPHY

The following list contains only works actually referred to in the text.

[1] BACHELIER, L. Théorie de la spéculation. Annales École Norm. Sup. 17 (1900), 21.
[2] BACHELIER, L. Calcul des probabilités. Paris, 1912.
[1] BERNSTEIN, S. Sur l'extension du théorème limite du calcul des probabilités aux sommes de quantités dépendantes. Math. Annalen, 97 (1927), 1-59.
[1] BESICOVITCH, A. S. Almost periodic functions. Cambridge, 1932.
[1] BOCHNER, S. Vorlesungen über Fouriersche Integrale. Leipzig, 1932.
[2] BOCHNER, S. Monotone Funktionen, Stieltjessche Integrale und harmonische Analyse. Math. Annalen, 108 (1933), 378-410.
[1] CANTELLI, F. P. La tendenza ad un limite nel senso del calcolo delle probabilità. Rend. Circ. Mat. Palermo, 16 (1916), 191-201.
[2] CANTELLI, F. P. Una teoria astratta del calcolo delle probabilità. Giorn. Ist. Ital. Attuari, 3 (1932), 257-265.
[1] CASTELNUOVO, G. Calcolo delle probabilità. Second ed. Bologna, 1926-28.
[1] CRAMÉR, H. Das Gesetz von Gauss und die Theorie des Risikos. Skand. Aktuarietidskr. 6 (1923), 209-237.
[2] CRAMÉR, H. On the composition of elementary errors. Skand. Aktuarietidskr. 11 (1928), 13-74 and 141-180.
[3] CRAMÉR, H. On the mathematical theory of risk. Skandia Festskrift, Stockholm, 1930.
[4] CRAMÉR, H. Sur les propriétés asymptotiques d'une classe de variables aléatoires. C.R. Acad. Sci. Paris, 201 (1935), 441-443.
[5] CRAMÉR, H. Über eine Eigenschaft der normalen Verteilungsfunktion. Math. Zeitschrift, 41 (1936), 405-414.
[1] CRAMÉR, H. and WOLD, H. Some theorems on distribution functions. Journ. London Math. Soc. 11 (1936), 290-294.
[1] EDGEWORTH, F. Y. The law of error. Camb. Phil. Soc. Proc. 20 (1905), 36-141.
[1] ELDERTON, W. P. Frequency curves and correlation. Second ed. London, 1927.
[1] ESSCHER, F. On the probability function in the collective theory of risk. Skand. Aktuarietidskr. 15 (1932), 175-195.
[1] FELLER, W. Über den zentralen Grenzwertsatz der Wahrscheinlichkeitsrechnung. Math. Zeitschrift, 40 (1935), 521-559.
[2] FELLER, W. Über den zentralen Grenzwertsatz der Wahrscheinlichkeitsrechnung, II. Math. Zeitschrift, 42 (1937).
[1] DE FINETTI, B. Sulle funzioni a incremento aleatorio. Rend. R. Accad. Lincei, (6), 10 (1929), 163-168.
[2] DE FINETTI, B. Le funzioni caratteristiche di legge istantanea. Rend. R. Accad. Lincei, (6), 12 (1930), 278-282.
[1] FRÉCHET, M. Sur la convergence en probabilité. Metron, 8 (1930), 1-48.
[2] FRÉCHET, M. Recherches théoriques modernes. Traité du calcul des probabilités, par E. Borel, tome I, fasc. 3. Paris, 1937.
[1] GLIVENKO, V. Sul teorema limite della teoria delle funzioni caratteristiche. Giorn. Ist. Ital. Attuari, 7 (1936), 160-167.
[1] HAMEL, G. Eine Basis aller Zahlen und die unstetigen Lösungen der Funktionalgleichung f(x+y) = f(x) + f(y). Math. Annalen, 60 (1905), 459-462.
[1] HARDY, G. H., LITTLEWOOD, J. E. and PÓLYA, G. Inequalities. Cambridge, 1934.
[1] HAUSDORFF, F. Mengenlehre. Second ed. Berlin-Leipzig, 1927.
[1] HAVILAND, E. K. On distribution functions and their Laplace-Fourier transforms. Proc. National Acad. Sci. 20 (1934), 56-57.
[2] HAVILAND, E. K. On the theory of absolutely additive distribution functions. Amer. Journ. Math. 56 (1934), 625-658.
[3] HAVILAND, E. K. On the inversion formula for Fourier-Stieltjes transforms in more than one dimension. Amer. Journ. Math. 57 (1935), 94-100 and 382-388.
[1] HOBSON, E. W. The theory of functions of a real variable. Vol. I, third ed. 1927; Vol. II, second ed. 1926.
[1] JESSEN, B. and WINTNER, A. Distribution functions and the Riemann zeta function. Trans. Amer. Math. Soc. 38 (1936), 48-88.
[1] KEYNES, J. M. A treatise on probability. London, 1921.
[1] KHINTCHINE, A. Begründung der Normalkorrelation nach der Lindebergschen Methode. Nachr. Forschungsinst. Moskau, 1 (1928).
[2] KHINTCHINE, A. Asymptotische Gesetze der Wahrscheinlichkeitsrechnung. Berlin, 1933.
[3] KHINTCHINE, A. Sul dominio di attrazione della legge di Gauss. Giorn. Ist. Ital. Attuari, 6 (1935), 378-393.
[4] KHINTCHINE, A. Su una legge dei grandi numeri generalizzata. Giorn. Ist. Ital. Attuari, 7 (1936), 365-377.
[1] KOLMOGOROFF, A. Bemerkungen zu meiner Arbeit "Über die Summen zufälliger Grössen". Math. Annalen, 102 (1929), 484-488.
[2] KOLMOGOROFF, A. Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung. Math. Annalen, 104 (1931), 415-458.
[3] KOLMOGOROFF, A. Sulla forma generale di un processo stocastico omogeneo. Rend. R. Accad. Lincei, (6), 15 (1932), 805-808 and 866-869.
[4] KOLMOGOROFF, A. Grundbegriffe der Wahrscheinlichkeitsrechnung. Berlin, 1933.
[1] LAGRANGE, J. L. Mémoire sur l'utilité de la méthode de prendre le milieu entre les résultats de plusieurs observations. Misc. Taurinensia, 5 (1770-73), 167-232. (Œuvres, 2, Paris, 1868.)
[1] LAPLACE, P. S. Théorie analytique des probabilités. First ed. 1812, second ed. 1814, third ed. 1820.
[1] LEBESGUE, H. Leçons sur l'intégration. Second ed. Paris, 1928.
[1] LÉVY, P. Calcul des probabilités. Paris, 1925.
[2] LÉVY, P. Sur les intégrales dont les éléments sont des variables aléatoires indépendantes. Annali R. Sc. Norm. Sup. Pisa, (2), 3 (1934), 337-366.
[3] LÉVY, P. Propriétés asymptotiques des sommes de variables aléatoires indépendantes ou enchaînées. Journ. Math. pures appl. (7), 14 (1935), 347-402.
[1] LIAPOUNOFF, A. Sur une proposition de la théorie des probabilités. Bull. Acad. Sci. St-Pétersbourg, (5), 13 (1900), 359-386.
[2] LIAPOUNOFF, A. Nouvelle forme du théorème sur la limite de probabilité. Mém. Acad. Sci. St-Pétersbourg, (8), 12 (1901), No. 5.
[1] LINDEBERG, J. W. Eine neue Herleitung des Exponentialgesetzes in der Wahrscheinlichkeitsrechnung. Math. Zeitschrift, 15 (1922), 211-225.
[1] LUNDBERG, F. Über die Theorie der Rückversicherung. Verhandl. 6. intern. Kongr. Vers.-Wiss., Wien, 1909, 1, 877-956.
[2] LUNDBERG, F. Försäkringsteknisk riskutjämning, 1-2. Stockholm, 1926-28.
[1] v. MISES, R. Fundamentalsätze der Wahrscheinlichkeitsrechnung. Math. Zeitschrift, 4 (1919), 1-97.
[2] v. MISES, R. Grundlagen der Wahrscheinlichkeitsrechnung. Math. Zeitschrift, 5 (1919), 52-99.
[3] v. MISES, R. Wahrscheinlichkeitsrechnung. Leipzig-Wien, 1931.
[1] PEARSON, K. Historical note on the origin of the normal curve of errors. Biometrika, 16 (1924), 402-404.
[1] PÓLYA, G. Herleitung des Gauss'schen Gesetzes aus einer Funktionalgleichung. Math. Zeitschrift, 18 (1923), 96-108.
[1] RADON, J. Theorie und Anwendung der absolut additiven Mengenfunktionen. Sitzungsber. Akad. Wien, 122 (1913), 1295-1438.
[1] RIDER, P. R. A survey of the theory of small samples. Annals of Math. (2), 31 (1930), 577-628.
[1] ROMANOVSKY, V. Sur un théorème limite du calcul des probabilités. Rec. Soc. Math. Moscou, 36 (1929), 36-64.
[1] SLUTSKY, E. Über stochastische Asymptoten und Grenzwerte. Metron, 5 (1925), 1-90.
[1] "STUDENT". The probable error of a mean. Biometrika, 6 (1908-9), 1-25.
[1] THIELE, T. N. Theory of observations. London, 1903.
[1] TITCHMARSH, E. C. The theory of functions. Oxford, 1932.
[1] TODHUNTER, I. A history of the mathematical theory of probability. Cambridge-London, 1865.
[1] TORNIER, E. Wahrscheinlichkeitsrechnung. Leipzig-Berlin, 1936.
[1] DE LA VALLÉE POUSSIN, C. Intégrales de Lebesgue, fonctions d'ensembles, classes de Baire. Second ed. Paris, 1934.
[1] WINTNER, A. On the addition of independent distributions. Amer. Journ. Math. 56 (1934), 8-16.


SOME RECENT WORKS ON MATHEMATICAL PROBABILITY

BARTLETT, M. S. An Introduction to Stochastic Processes. Cambridge, 1955.
BOCHNER, S. Harmonic Analysis and the Theory of Probability. Berkeley, 1955.
DOOB, J. L. Stochastic Processes. New York, 1953.
ESSEEN, C. G. Fourier analysis of distribution functions. Acta Mathematica, 77 (1945), 1-125.
FELLER, W. An Introduction to Probability Theory and its Applications, Vol. I, 2nd ed. New York, 1957.
GNEDENKO, B. V. and KOLMOGOROFF, A. N. Limit Distributions for Sums of Independent Random Variables. (Translated from the Russian by K. L. Chung.) Cambridge, Mass., 1954.
LOÈVE, M. Probability Theory. Foundations-Random Sequences. New York, 1955.
LUKACS, E. Characteristic Functions. London, 1960.