
Probability
Computer Science Tripos, Part IA

R.J. Gibbens

Computer Laboratory
University of Cambridge

Lent Term 2010/11

Last revision: 2010-12-17/r-43

Outline

Elementary probability theory (2 lectures)
Probability spaces, random variables, discrete/continuous distributions, means and variances, independence, conditional probabilities, Bayes's theorem.

Probability generating functions (1 lecture)
Definitions and properties; use in calculating moments of random variables and for finding the distribution of sums of independent random variables.

Multivariate distributions and independence (1 lecture)
Random vectors and independence; joint and marginal density functions; variance, covariance and correlation; conditional density functions.

Elementary stochastic processes (2 lectures)
Simple random walks; recurrence and transience; the Gambler's Ruin Problem and solution using difference equations.

    Elementary probability theory


Event spaces

We formalize the notion of an event space, $\mathcal{F}$, by requiring the following to hold.

Definition (Event space)
1. $\mathcal{F}$ is non-empty
2. $E \in \mathcal{F} \Rightarrow \Omega \setminus E \in \mathcal{F}$
3. $(\forall i \in I.\ E_i \in \mathcal{F}) \Rightarrow \bigcup_{i \in I} E_i \in \mathcal{F}$

Example
$\Omega$ any set and $\mathcal{F} = \mathcal{P}(\Omega)$, the power set of $\Omega$.

Example
$\Omega$ any set with some event $E \subseteq \Omega$ and $\mathcal{F} = \{\emptyset, E, \Omega \setminus E, \Omega\}$.
Note that $\Omega \setminus E$ is often written using the shorthand $E^c$ for the complement of $E$ with respect to $\Omega$.

Probability spaces

Given an experiment with outcomes in a sample space $\Omega$ with an event space $\mathcal{F}$, we associate probabilities to events by defining a probability function $P : \mathcal{F} \to \mathbb{R}$ as follows.

Definition (Probability function)
1. $\forall E \in \mathcal{F}.\ P(E) \geq 0$
2. $P(\Omega) = 1$ and $P(\emptyset) = 0$
3. If $E_i \in \mathcal{F}$ for $i \in I$ are disjoint (that is, $E_i \cap E_j = \emptyset$ for $i \neq j$) then
$$P\left(\bigcup_{i \in I} E_i\right) = \sum_{i \in I} P(E_i).$$

We call the triple $(\Omega, \mathcal{F}, P)$ a probability space.

Examples of probability spaces

$\Omega$ any set with some event $E$ ($E \neq \emptyset$, $E \neq \Omega$). Take $\mathcal{F} = \{\emptyset, E, \Omega \setminus E, \Omega\}$ as before and define the probability function $P$ by
$$P(\emptyset) = 0, \quad P(E) = p, \quad P(\Omega \setminus E) = 1 - p, \quad P(\Omega) = 1$$
for any $0 \leq p \leq 1$.

$\Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}$ with $\mathcal{F} = \mathcal{P}(\Omega)$ and probabilities given for all $E \in \mathcal{F}$ by
$$P(E) = \frac{|E|}{n}.$$
For a six-sided fair die, $\Omega = \{1, 2, 3, 4, 5, 6\}$ and we take $P(\{i\}) = \frac{1}{6}$.

Conditional probabilities

Given a probability space $(\Omega, \mathcal{F}, P)$ and two events $E_1, E_2 \in \mathcal{F}$, how does knowledge that the random event $E_2$, say, has occurred influence the probability that $E_1$ has also occurred? This question leads to the notion of conditional probability.

Definition (Conditional probability)
If $P(E_2) > 0$, define the conditional probability, $P(E_1 \mid E_2)$, of $E_1$ given $E_2$ by
$$P(E_1 \mid E_2) = \frac{P(E_1 \cap E_2)}{P(E_2)}.$$

Note that $P(E_2 \mid E_2) = 1$.
Exercise: check that for any $E' \in \mathcal{F}$ such that $P(E') > 0$, $(\Omega, \mathcal{F}, Q)$ is a probability space, where $Q : \mathcal{F} \to \mathbb{R}$ is defined by $Q(E) = P(E \mid E')$ for all $E \in \mathcal{F}$.

Independent events

Given a probability space $(\Omega, \mathcal{F}, P)$ we can define independence between random events as follows.

Definition (Independent events)
Two events $E_1, E_2 \in \mathcal{F}$ are independent if
$$P(E_1 \cap E_2) = P(E_1)P(E_2).$$
Otherwise, the events are dependent.

Note that if $E_1$ and $E_2$ are independent events then
$$P(E_1 \mid E_2) = P(E_1) \quad \text{and} \quad P(E_2 \mid E_1) = P(E_2).$$

Independence of multiple events

More generally, a collection of events $\{E_i \mid i \in I\}$ are independent events if for all subsets $J$ of $I$
$$P\left(\bigcap_{j \in J} E_j\right) = \prod_{j \in J} P(E_j).$$
When this holds just for all those subsets $J$ such that $|J| = 2$ we have pairwise independence.
Note that pairwise independence does not imply independence (unless $|I| = 2$).

Venn diagrams
John Venn 1834–1923

Example ($|I| = 3$ events)
$E_1, E_2, E_3$ are independent events if
$$P(E_1 \cap E_2) = P(E_1)P(E_2)$$
$$P(E_1 \cap E_3) = P(E_1)P(E_3)$$
$$P(E_2 \cap E_3) = P(E_2)P(E_3)$$
$$P(E_1 \cap E_2 \cap E_3) = P(E_1)P(E_2)P(E_3).$$

Bayes' theorem
Thomas Bayes (1702–1761)

Theorem (Bayes' theorem)
If $E_1$ and $E_2$ are two events with $P(E_1) > 0$ and $P(E_2) > 0$ then
$$P(E_1 \mid E_2) = \frac{P(E_2 \mid E_1)P(E_1)}{P(E_2)}.$$

Proof.
We have that
$$P(E_1 \mid E_2)P(E_2) = P(E_1 \cap E_2) = P(E_2 \cap E_1) = P(E_2 \mid E_1)P(E_1).$$

Thus Bayes' theorem provides a way to reverse the order of conditioning.

Partitions

Given a probability space $(\Omega, \mathcal{F}, P)$ define a partition of $\Omega$ as follows.

Definition (Partition)
A partition of $\Omega$ is a collection of disjoint events $\{E_i \in \mathcal{F} \mid i \in I\}$ with
$$\bigcup_{i \in I} E_i = \Omega.$$

We then have the following theorem (a.k.a. the law of total probability).

Theorem (Partition theorem)
If $\{E_i \in \mathcal{F} \mid i \in I\}$ is a partition of $\Omega$ and $P(E_i) > 0$ for all $i \in I$ then
$$P(E) = \sum_{i \in I} P(E \mid E_i)P(E_i) \quad \forall E \in \mathcal{F}.$$

Proof of partition theorem

We prove the partition theorem as follows.

Proof.
$$P(E) = P\left(E \cap \left(\bigcup_{i \in I} E_i\right)\right) = P\left(\bigcup_{i \in I} (E \cap E_i)\right) = \sum_{i \in I} P(E \cap E_i) = \sum_{i \in I} P(E \mid E_i)P(E_i).$$

Bayes' theorem and partitions

A (slight) generalization of Bayes' theorem can be stated as follows, combining Bayes' theorem with the partition theorem:
$$P(E_i \mid E) = \frac{P(E \mid E_i)P(E_i)}{\sum_{j \in I} P(E \mid E_j)P(E_j)} \quad i \in I$$
where $\{E_i \in \mathcal{F} \mid i \in I\}$ forms a partition of $\Omega$.

As a special case consider the partition $\{E_1, E_2 = \Omega \setminus E_1\}$. Then we have
$$P(E_1 \mid E) = \frac{P(E \mid E_1)P(E_1)}{P(E \mid E_1)P(E_1) + P(E \mid \Omega \setminus E_1)P(\Omega \setminus E_1)}.$$

Bayes' theorem example

Suppose that you have a good game of table football two times in three, otherwise a poor game. Your chance of scoring a goal is 3/4 in a good game and 1/4 in a poor game.

What is your chance of scoring a goal in any given game? Conditional on having scored in a game, what is the chance that you had a good game?

So we know that
$$P(\mathit{Good}) = 2/3, \quad P(\mathit{Poor}) = 1/3, \quad P(\mathit{Score} \mid \mathit{Good}) = 3/4, \quad P(\mathit{Score} \mid \mathit{Poor}) = 1/4.$$

Bayes' theorem example, ctd

Thus, noting that $\{\mathit{Good}, \mathit{Poor}\}$ forms a partition of the sample space of outcomes,
$$P(\mathit{Score}) = P(\mathit{Score} \mid \mathit{Good})P(\mathit{Good}) + P(\mathit{Score} \mid \mathit{Poor})P(\mathit{Poor}) = (3/4)(2/3) + (1/4)(1/3) = 7/12.$$

Then by Bayes' theorem we have that
$$P(\mathit{Good} \mid \mathit{Score}) = \frac{P(\mathit{Score} \mid \mathit{Good})P(\mathit{Good})}{P(\mathit{Score})} = \frac{(3/4)(2/3)}{7/12} = 6/7.$$
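As a quick numerical check, here is a minimal Python sketch (not part of the original slides) that computes both quantities directly from the stated probabilities; the variable names are illustrative only.

```python
# Sketch: verify the table-football Bayes example with exact arithmetic.
from fractions import Fraction

p_good = Fraction(2, 3)              # P(Good)
p_poor = Fraction(1, 3)              # P(Poor)
p_score_given_good = Fraction(3, 4)  # P(Score | Good)
p_score_given_poor = Fraction(1, 4)  # P(Score | Poor)

# Law of total probability: P(Score)
p_score = p_score_given_good * p_good + p_score_given_poor * p_poor

# Bayes' theorem: P(Good | Score)
p_good_given_score = p_score_given_good * p_good / p_score

print(p_score)             # 7/12
print(p_good_given_score)  # 6/7
```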


Examples of discrete distributions

Example (Bernoulli distribution)
(Plot: probability mass function $P(X = k)$ for $k \in \{0, 1\}$, e.g. $p = 0.75$.)
Here $\mathrm{Im}(X) = \{0, 1\}$ and, for given $p \in [0, 1]$,
$$P(X = k) = \begin{cases} p & k = 1 \\ 1 - p & k = 0. \end{cases}$$

RV, X      Parameters    Im(X)   Mean   Variance
Bernoulli  p in [0, 1]   {0, 1}  p      p(1 - p)

Examples of discrete distributions, ctd

Example (Binomial distribution, Bin(n, p))
(Plot: probability mass function $P(X = k)$, e.g. $n = 10$, $p = 0.5$.)
Here $\mathrm{Im}(X) = \{0, 1, \ldots, n\}$ for some positive integer $n$ and given $p \in [0, 1]$
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k} \quad k \in \{0, 1, \ldots, n\}.$$

RV, X      Parameters                    Im(X)           Mean   Variance
Bin(n, p)  n in {1, 2, ...}, p in [0,1]  {0, 1, ..., n}  np     np(1 - p)

We use the notation $X \sim \mathrm{Bin}(n, p)$ as a shorthand for the statement that the RV $X$ is distributed according to the stated Binomial distribution. We shall use this shorthand notation for our other named distributions.

Examples of discrete distributions, ctd

Example (Geometric distribution, Geo(p))
(Plot: probability mass function $P(X = k)$, e.g. $p = 0.75$.)
Here $\mathrm{Im}(X) = \{1, 2, \ldots\}$ and $0 < p \leq 1$
$$P(X = k) = p(1 - p)^{k - 1} \quad k \in \{1, 2, \ldots\}.$$

RV, X    Parameters  Im(X)        Mean  Variance
Geo(p)   0 < p <= 1  {1, 2, ...}  1/p   (1 - p)/p^2

Notationally we write $X \sim \mathrm{Geo}(p)$.

Beware possible confusion: some authors prefer to define our $X - 1$ as a Geometric RV!

Examples of discrete distributions, ctd

Example (Poisson distribution, Pois(λ))
(Plot: probability mass functions for $\lambda = 1$ and $\lambda = 5$.)
Here $\mathrm{Im}(X) = \{0, 1, \ldots\}$ and $\lambda > 0$
$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!} \quad k \in \{0, 1, \ldots\}.$$

RV, X     Parameters  Im(X)        Mean  Variance
Pois(λ)   λ > 0       {0, 1, ...}  λ     λ

Notationally we write $X \sim \mathrm{Pois}(\lambda)$.
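The following Python sketch (not part of the original slides) evaluates these named probability mass functions and checks that each sums to (approximately) 1 with the stated mean; the parameter values and truncation point kmax are arbitrary choices for illustration.

```python
# Sketch: evaluate the Binomial, Geometric and Poisson PMFs numerically.
from math import comb, exp, factorial

def binomial_pmf(n, p):
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

def geometric_pmf(p, kmax=100):
    return {k: p * (1 - p)**(k - 1) for k in range(1, kmax + 1)}  # truncated

def poisson_pmf(lam, kmax=100):
    return {k: lam**k * exp(-lam) / factorial(k) for k in range(kmax + 1)}  # truncated

for name, pmf, mean in [("Bin(10, 0.5)", binomial_pmf(10, 0.5), 10 * 0.5),
                        ("Geo(0.75)", geometric_pmf(0.75), 1 / 0.75),
                        ("Pois(5)", poisson_pmf(5.0), 5.0)]:
    total = sum(pmf.values())
    est_mean = sum(k * q for k, q in pmf.items())
    print(f"{name}: total={total:.6f}, mean={est_mean:.6f} (expected {mean:.6f})")
```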

Expectation

One way to summarize the distribution of some RV, $X$, would be to construct a weighted average of the observed values, weighted by the probabilities of actually observing these values. This is the idea of expectation, defined as follows.

Definition (Expectation)
The expectation, $E(X)$, of a discrete RV $X$ is defined as
$$E(X) = \sum_{x \in \mathrm{Im}(X)} x\, P(X = x)$$
so long as this sum is (absolutely) convergent (that is, $\sum_{x \in \mathrm{Im}(X)} |x\, P(X = x)| < \infty$).

The expectation of a RV $X$ is also known as the expected value, the mean, the first moment or simply the average.

Expectations and transformations

Suppose that $X$ is a discrete RV and $g : \mathbb{R} \to \mathbb{R}$ is some transformation. We can check that $Y = g(X)$ is again a RV, defined by $Y(\omega) = g(X)(\omega) = g(X(\omega))$.

Theorem
We have that
$$E(g(X)) = \sum_x g(x)\, P(X = x)$$
whenever the sum is absolutely convergent.

Proof.
$$E(g(X)) = E(Y) = \sum_{y \in g(\mathrm{Im}(X))} y\, P(Y = y) = \sum_{y \in g(\mathrm{Im}(X))} y \sum_{x \in \mathrm{Im}(X) : g(x) = y} P(X = x) = \sum_{x \in \mathrm{Im}(X)} g(x)\, P(X = x).$$

First and second moments of random variables

Just as the expectation or mean, $E(X)$, is called the first moment of the RV $X$, $E(X^2)$ is called the second moment of $X$. The variance $\mathrm{Var}(X) = E\big((X - E(X))^2\big)$ is called the second central moment of $X$, since it measures the dispersion in the values of $X$ centred about their mean value.

Note that we have the following property, where $a, b \in \mathbb{R}$:
$$\mathrm{Var}(aX + b) = E\big((aX + b - E(aX + b))^2\big) = E\big((aX + b - aE(X) - b)^2\big) = E\big(a^2 (X - E(X))^2\big) = a^2\, \mathrm{Var}(X).$$

Calculating variances

Note that we can expand our expression for the variance, where again we write $\mu = E(X)$:
$$\mathrm{Var}(X) = \sum_x (x - \mu)^2 P(X = x) = \sum_x (x^2 - 2\mu x + \mu^2) P(X = x) = \sum_x x^2 P(X = x) - 2\mu \sum_x x P(X = x) + \mu^2 \sum_x P(X = x) = E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - \mu^2 = E(X^2) - (E(X))^2.$$

This useful result expresses the second central moment of a RV $X$ in terms of the first and second moments of $X$. This is usually the best method to calculate the variance.
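As a small illustration (not part of the original slides), the sketch below computes the variance of a fair die score both from the definition and via $E(X^2) - (E(X))^2$, confirming that the two agree.

```python
# Sketch: variance of a fair six-sided die computed two ways.
pmf = {k: 1 / 6 for k in range(1, 7)}   # fair die

mean = sum(x * p for x, p in pmf.items())
var_central = sum((x - mean) ** 2 * p for x, p in pmf.items())   # definition
second_moment = sum(x ** 2 * p for x, p in pmf.items())
var_moments = second_moment - mean ** 2                          # E(X^2) - (E X)^2

print(mean)          # 3.5
print(var_central)   # 2.9166...
print(var_moments)   # 2.9166... (same value)
```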

Bivariate random variables

Given a probability space $(\Omega, \mathcal{F}, P)$, we may have two RVs, $X$ and $Y$, say. We can then use a joint probability mass function
$$P(X = x, Y = y) = P(\{\omega \mid X(\omega) = x\} \cap \{\omega \mid Y(\omega) = y\})$$
for all $x, y \in \mathbb{R}$.

We can recover the individual probability mass functions for $X$ and $Y$ as follows:
$$P(X = x) = P(\{\omega \mid X(\omega) = x\}) = P\left(\bigcup_{y \in \mathrm{Im}(Y)} \{\omega \mid X(\omega) = x\} \cap \{\omega \mid Y(\omega) = y\}\right) = \sum_{y \in \mathrm{Im}(Y)} P(X = x, Y = y).$$
Similarly,
$$P(Y = y) = \sum_{x \in \mathrm{Im}(X)} P(X = x, Y = y).$$

Transformations of random variables

If $g : \mathbb{R}^2 \to \mathbb{R}$ then we get a similar result to that obtained in the univariate case:
$$E(g(X, Y)) = \sum_{x \in \mathrm{Im}(X)} \sum_{y \in \mathrm{Im}(Y)} g(x, y)\, P(X = x, Y = y).$$
This idea can be extended to probability mass functions in the multivariate case with three or more RVs.

The linear transformation occurs frequently and is given by $g(x, y) = ax + by + c$ where $a, b, c \in \mathbb{R}$. In this case we find that
$$E(aX + bY + c) = \sum_x \sum_y (ax + by + c)\, P(X = x, Y = y) = a \sum_x x\, P(X = x) + b \sum_y y\, P(Y = y) + c = a E(X) + b E(Y) + c.$$

Independence of random variables

We have defined independence for events and can use the same idea for pairs of RVs $X$ and $Y$.

Definition
Two RVs $X$ and $Y$ are independent if $\{\omega \mid X(\omega) = x\}$ and $\{\omega \mid Y(\omega) = y\}$ are independent for all $x, y \in \mathbb{R}$.

Thus, if $X$ and $Y$ are independent,
$$P(X = x, Y = y) = P(X = x)P(Y = y).$$

If $X$ and $Y$ are independent discrete RVs with expected values $E(X)$ and $E(Y)$ respectively then
$$E(XY) = \sum_x \sum_y xy\, P(X = x, Y = y) = \sum_x \sum_y xy\, P(X = x)P(Y = y) = \sum_x x\, P(X = x) \sum_y y\, P(Y = y) = E(X)E(Y).$$

Variance of sums of RVs and covariance

Given a pair of RVs $X$ and $Y$, consider the variance of their sum $X + Y$:
$$\mathrm{Var}(X + Y) = E\big((X + Y - E(X + Y))^2\big) = E\big(((X - E(X)) + (Y - E(Y)))^2\big) = E\big((X - E(X))^2\big) + 2E\big((X - E(X))(Y - E(Y))\big) + E\big((Y - E(Y))^2\big) = \mathrm{Var}(X) + 2\,\mathrm{Cov}(X, Y) + \mathrm{Var}(Y)$$
where the covariance of $X$ and $Y$ is given by
$$\mathrm{Cov}(X, Y) = E\big((X - E(X))(Y - E(Y))\big) = E(XY) - E(X)E(Y).$$

So, if $X$ and $Y$ are independent RVs then $E(XY) = E(X)E(Y)$, so $\mathrm{Cov}(X, Y) = 0$ and we have that
$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y).$$
Notice also that if $Y = X$ then $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$.
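A minimal sketch (not part of the original slides) checking the identity $\mathrm{Var}(X+Y) = \mathrm{Var}(X) + 2\,\mathrm{Cov}(X,Y) + \mathrm{Var}(Y)$ on a small, entirely hypothetical joint probability mass function.

```python
# Sketch: verify Var(X+Y) = Var(X) + 2 Cov(X,Y) + Var(Y) for a toy joint PMF.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}  # hypothetical joint pmf

def expect(f):
    return sum(f(x, y) * p for (x, y), p in joint.items())

ex, ey = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - ex) ** 2)
var_y = expect(lambda x, y: (y - ey) ** 2)
cov = expect(lambda x, y: (x - ex) * (y - ey))
var_sum = expect(lambda x, y: (x + y - ex - ey) ** 2)

print(var_sum, var_x + 2 * cov + var_y)  # the two values agree
```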

Distribution functions

Given a probability space $(\Omega, \mathcal{F}, P)$ we have so far considered discrete RVs that can take a countable number of values. More generally, we define $X : \Omega \to \mathbb{R}$ as a random variable if
$$\{\omega \mid X(\omega) \leq x\} \in \mathcal{F} \quad \forall x \in \mathbb{R}.$$
Note that a discrete random variable, $X$, is a random variable in this sense since
$$\{\omega \mid X(\omega) \leq x\} = \bigcup_{x' \in \mathrm{Im}(X) : x' \leq x} \{\omega \mid X(\omega) = x'\} \in \mathcal{F}.$$

Definition (Distribution function)
If $X$ is a RV then the distribution function of $X$, written $F_X(x)$, is defined by
$$F_X(x) = P(\{\omega \mid X(\omega) \leq x\}) = P(X \leq x).$$

Properties of the distribution function

(Plot: a distribution function $F_X(x) = P(X \leq x)$ increasing from 0 to 1.)

1. If $x \leq y$ then $F_X(x) \leq F_X(y)$.
2. If $x \to -\infty$ then $F_X(x) \to 0$.
3. If $x \to \infty$ then $F_X(x) \to 1$.
4. If $a < b$ then $P(a < X \leq b) = F_X(b) - F_X(a)$.

Continuous random variables

Random variables that take just a countable number of values are called discrete. More generally, we have that a RV can be defined by its distribution function, $F_X(x)$. A RV is said to be a continuous random variable when the distribution function has sufficient smoothness that
$$F_X(x) = P(X \leq x) = \int_{-\infty}^{x} f_X(u)\,du$$
for some function $f_X(x)$. We can then take
$$f_X(x) = \begin{cases} \dfrac{dF_X(x)}{dx} & \text{if the derivative exists at } x \\ 0 & \text{otherwise.} \end{cases}$$
The function $f_X(x)$ is called the probability density function of the continuous RV $X$, or often just the density of $X$. The density function for continuous RVs plays the analogous role to the probability mass function for discrete RVs.

Properties of the density function

(Plot: a non-negative density function $f_X(x)$ with unit area.)

1. $\forall x \in \mathbb{R}.\ f_X(x) \geq 0$.
2. $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$.
3. If $a \leq b$ then $P(a \leq X \leq b) = \int_a^b f_X(x)\,dx$.

Examples of continuous random variables

We define some common continuous RVs, $X$, by their density functions, $f_X(x)$.

Example (Uniform distribution, U(a, b))
(Plot: density equal to $1/(b - a)$ on the interval $(a, b)$ and zero elsewhere.)
Given $a \in \mathbb{R}$ and $b \in \mathbb{R}$ with $a < b$ then
$$f_X(x) = \begin{cases} \dfrac{1}{b - a} & \text{if } a < x < b \\ 0 & \text{otherwise.} \end{cases}$$

RV, X    Parameters         Im(X)   Mean       Variance
U(a, b)  a, b in R, a < b   (a, b)  (a + b)/2  (b - a)^2/12

Notationally we write $X \sim U(a, b)$.

Examples of continuous random variables, ctd

Example (Exponential distribution, Exp(λ))
(Plot: density $f_X(x) = \lambda e^{-\lambda x}$ for $x > 0$, e.g. $\lambda = 1$.)
Given $\lambda > 0$ then
$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x > 0 \\ 0 & \text{otherwise.} \end{cases}$$

RV, X    Parameters  Im(X)  Mean  Variance
Exp(λ)   λ > 0       R+     1/λ   1/λ^2

Notationally we write $X \sim \mathrm{Exp}(\lambda)$.

Examples of continuous random variables, ctd

Example (Normal distribution, N(μ, σ²))
(Plot: bell-shaped density centred at $\mu$.)
Given $\mu \in \mathbb{R}$ and $\sigma^2 > 0$ then
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x - \mu)^2 / (2\sigma^2)} \quad -\infty < x < \infty.$$

RV, X      Parameters       Im(X)  Mean  Variance
N(μ, σ²)   μ in R, σ² > 0   R      μ     σ²

Notationally we write $X \sim N(\mu, \sigma^2)$.

Expectations of continuous random variables

Just as for discrete RVs we can define the expectation of a continuous RV with density function $f_X(x)$ by a weighted averaging.

Definition (Expectation)
The expectation of $X$ is given by
$$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx$$
whenever the integral exists.

In a similar way to the discrete case, we have that if $g : \mathbb{R} \to \mathbb{R}$ then
$$E(g(X)) = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$$
whenever the integral exists.

Variances of continuous random variables

Similarly, we can define the variance of a continuous RV $X$.

Definition (Variance)
The variance, $\mathrm{Var}(X)$, of a continuous RV $X$ with density function $f_X(x)$ is defined as
$$\mathrm{Var}(X) = E\big((X - E(X))^2\big) = \int_{-\infty}^{\infty} (x - \mu)^2 f_X(x)\,dx$$
whenever the integral exists and where $\mu = E(X)$.

Exercise: check that we again find the useful result connecting the second central moment to the first and second moments:
$$\mathrm{Var}(X) = E(X^2) - (E(X))^2.$$

Example: exponential distribution, Exp(λ)

Suppose that the RV $X$ has an exponential distribution with parameter $\lambda > 0$. Then, using integration by parts,
$$E(X) = \int_0^{\infty} x \lambda e^{-\lambda x}\,dx = \left[-x e^{-\lambda x}\right]_0^{\infty} + \int_0^{\infty} e^{-\lambda x}\,dx = 0 + \frac{1}{\lambda} \int_0^{\infty} \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda}$$
and
$$E(X^2) = \int_0^{\infty} x^2 \lambda e^{-\lambda x}\,dx = \left[-x^2 e^{-\lambda x}\right]_0^{\infty} + \int_0^{\infty} 2x e^{-\lambda x}\,dx = 0 + \frac{2}{\lambda} \int_0^{\infty} x \lambda e^{-\lambda x}\,dx = \frac{2}{\lambda^2}.$$
Hence,
$$\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2 = \frac{1}{\lambda^2}.$$
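A quick Monte Carlo sketch (not part of the original slides) checking the mean $1/\lambda$ and variance $1/\lambda^2$ of the exponential distribution; the value of $\lambda$ and the sample size are arbitrary.

```python
# Sketch: Monte Carlo check of the Exp(lambda) mean and variance.
import random

lam = 2.0
n = 200_000
samples = [random.expovariate(lam) for _ in range(n)]

mean = sum(samples) / n
var = sum((x - mean) ** 2 for x in samples) / n

print(mean, 1 / lam)      # both ~0.5
print(var, 1 / lam ** 2)  # both ~0.25
```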

Bivariate continuous random variables

Given a probability space $(\Omega, \mathcal{F}, P)$, we may have multiple continuous RVs, $X$ and $Y$, say.

Definition (Joint probability distribution function)
The joint probability distribution function is given by
$$F_{X,Y}(x, y) = P(\{\omega \mid X(\omega) \leq x\} \cap \{\omega \mid Y(\omega) \leq y\}) = P(X \leq x, Y \leq y)$$
for all $x, y \in \mathbb{R}$.

Independence follows in a similar way to the discrete case, and we say that two continuous RVs $X$ and $Y$ are independent if
$$F_{X,Y}(x, y) = F_X(x) F_Y(y)$$
for all $x, y \in \mathbb{R}$.

Bivariate density functions

The bivariate density of two continuous RVs $X$ and $Y$ satisfies
$$F_{X,Y}(x, y) = \int_{u=-\infty}^{x} \int_{v=-\infty}^{y} f_{X,Y}(u, v)\,dv\,du$$
and is given by
$$f_{X,Y}(x, y) = \begin{cases} \dfrac{\partial^2}{\partial x\, \partial y} F_{X,Y}(x, y) & \text{if the derivative exists at } (x, y) \\ 0 & \text{otherwise.} \end{cases}$$
We have that
$$f_{X,Y}(x, y) \geq 0 \quad \forall x, y \in \mathbb{R}$$
and that
$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx\,dy = 1.$$

Marginal densities and independence

If $X$ and $Y$ have a joint density function $f_{X,Y}(x, y)$ then we have the marginal densities
$$f_X(x) = \int_{v=-\infty}^{\infty} f_{X,Y}(x, v)\,dv$$
and
$$f_Y(y) = \int_{u=-\infty}^{\infty} f_{X,Y}(u, y)\,du.$$
In the case that $X$ and $Y$ are also independent,
$$f_{X,Y}(x, y) = f_X(x) f_Y(y)$$
for all $x, y \in \mathbb{R}$.

Conditional density functions

The marginal density $f_Y(y)$ tells us about the variation of the RV $Y$ when we have no information about the RV $X$. Consider the opposite extreme when we have full information about $X$, namely that $X = x$, say. We cannot evaluate an expression like
$$P(Y \leq y \mid X = x)$$
directly, since for a continuous RV $P(X = x) = 0$ and our definition of conditional probability does not apply. Instead, we first evaluate $P(Y \leq y \mid x \leq X \leq x + \delta x)$ for any $\delta x > 0$. We find that
$$P(Y \leq y \mid x \leq X \leq x + \delta x) = \frac{P(Y \leq y, x \leq X \leq x + \delta x)}{P(x \leq X \leq x + \delta x)} = \frac{\int_{u=x}^{x+\delta x} \int_{v=-\infty}^{y} f_{X,Y}(u, v)\,dv\,du}{\int_{u=x}^{x+\delta x} f_X(u)\,du}.$$

Conditional density functions, ctd

Now divide the numerator and denominator by $\delta x$ and take the limit as $\delta x \to 0$ to give
$$P(Y \leq y \mid x \leq X \leq x + \delta x) \to \int_{v=-\infty}^{y} \frac{f_{X,Y}(x, v)}{f_X(x)}\,dv = G(y), \text{ say,}$$
where $G(y)$ is a distribution function with corresponding density
$$g(y) = \frac{f_{X,Y}(x, y)}{f_X(x)}.$$
Accordingly, we define the notion of a conditional density function as follows.

Definition
The conditional density function of $Y$ given $X = x$ is defined as
$$f_{Y \mid X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}$$
defined for all $y \in \mathbb{R}$ and $x \in \mathbb{R}$ such that $f_X(x) > 0$.


    Probability generating functions

A very common situation is when a RV, $X$, can take only non-negative integer values, that is $\mathrm{Im}(X) \subseteq \{0, 1, 2, \ldots\}$. The probability mass function, $P(X = k)$, is given by a sequence of values $p_0, p_1, p_2, \ldots$ where
$$p_k = P(X = k) \quad k \in \{0, 1, 2, \ldots\}$$
and we have that
$$p_k \geq 0 \quad k \in \{0, 1, 2, \ldots\} \quad \text{and} \quad \sum_{k=0}^{\infty} p_k = 1.$$
The terms of this sequence can be wrapped together to define a certain function called the probability generating function (PGF).

Definition (Probability generating function)
The probability generating function, $G_X(z)$, of a (non-negative integer-valued) RV $X$ is defined as
$$G_X(z) = \sum_{k=0}^{\infty} p_k z^k$$
for all values of $z$ such that the sum converges appropriately.

Elementary properties of the PGF

1. $G_X(z) = \sum_{k=0}^{\infty} p_k z^k$, so
   $$G_X(0) = p_0 \quad \text{and} \quad G_X(1) = 1.$$
2. If $g(t) = z^t$ then
   $$G_X(z) = \sum_{k=0}^{\infty} p_k z^k = \sum_{k=0}^{\infty} g(k)\, P(X = k) = E(g(X)) = E(z^X).$$
3. The PGF is defined for all $|z| \leq 1$ since
   $$\sum_{k=0}^{\infty} |p_k z^k| \leq \sum_{k=0}^{\infty} p_k = 1.$$
4. Importantly, the PGF characterizes the distribution of a RV in the sense that
   $$G_X(z) = G_Y(z) \quad \forall z$$
   if and only if
   $$P(X = k) = P(Y = k) \quad k \in \{0, 1, 2, \ldots\}.$$

Examples of PGFs

Example (Bernoulli distribution)
$$G_X(z) = q + pz \quad \text{where } q = 1 - p.$$

Example (Binomial distribution, Bin(n, p))
$$G_X(z) = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} z^k = (q + pz)^n \quad \text{where } q = 1 - p.$$

Example (Geometric distribution, Geo(p))
$$G_X(z) = \sum_{k=1}^{\infty} p q^{k-1} z^k = pz \sum_{k=0}^{\infty} (qz)^k = \frac{pz}{1 - qz}$$
if $|z| < q^{-1}$ and $q = 1 - p$.

Examples of PGFs, ctd

Example (Uniform distribution, U(1, n))
$$G_X(z) = \sum_{k=1}^{n} z^k \frac{1}{n} = \frac{z}{n} \sum_{k=0}^{n-1} z^k = \frac{z(1 - z^n)}{n(1 - z)}.$$

Example (Poisson distribution, Pois(λ))
$$G_X(z) = \sum_{k=0}^{\infty} \frac{\lambda^k e^{-\lambda}}{k!} z^k = e^{\lambda z} e^{-\lambda} = e^{\lambda(z - 1)}.$$
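The following sketch (not part of the original slides) checks the Binomial PGF identity numerically by comparing $\sum_k P(X = k) z^k$ with the closed form $(q + pz)^n$ at a few points; the values of $n$, $p$ and $z$ are arbitrary.

```python
# Sketch: numerical check that the Bin(n, p) PGF equals (q + p*z)**n.
from math import comb

def pgf_from_pmf(n, p, z):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) * z**k for k in range(n + 1))

n, p = 10, 0.3
for z in (0.0, 0.5, 1.0):
    print(pgf_from_pmf(n, p, z), (1 - p + p * z) ** n)  # the two columns agree
```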

Further derivatives of the PGF

Taking the second derivative gives
$$G''_X(z) = \sum_{k=0}^{\infty} k(k - 1) z^{k-2}\, P(X = k).$$
So that
$$G''_X(1) = \sum_{k=0}^{\infty} k(k - 1)\, P(X = k) = E(X(X - 1)).$$
Generally, we have the following result.

Theorem
If the RV $X$ has PGF $G_X(z)$ then the $r$-th derivative of the PGF, written $G^{(r)}_X(z)$, evaluated at $z = 1$ is such that
$$G^{(r)}_X(1) = E(X(X - 1) \cdots (X - r + 1)).$$
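For instance, differentiating the Poisson PGF $G_X(z) = e^{\lambda(z-1)}$ twice and evaluating at $z = 1$ gives $E(X(X-1)) = \lambda^2$. The sketch below (not part of the original slides) checks this by summing the factorial moment directly from the PMF; the value of $\lambda$ and the truncation are arbitrary.

```python
# Sketch: E[X(X-1)] for X ~ Pois(lam) should equal lam**2 (the second
# derivative of the PGF at z = 1).
from math import exp, factorial

lam = 3.0
e_fact2 = sum(k * (k - 1) * lam**k * exp(-lam) / factorial(k) for k in range(100))
print(e_fact2, lam ** 2)  # both ~9.0
```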

Sums of independent random variables

The following theorem shows how PGFs can be used to find the PGF of the sum of independent RVs.

Theorem
If $X$ and $Y$ are independent RVs with PGFs $G_X(z)$ and $G_Y(z)$ respectively then
$$G_{X+Y}(z) = G_X(z) G_Y(z).$$

Proof.
Using the independence of $X$ and $Y$ we have that
$$G_{X+Y}(z) = E(z^{X+Y}) = E(z^X z^Y) = E(z^X) E(z^Y) = G_X(z) G_Y(z).$$

PGF example: Poisson RVs

For example, suppose that $X$ and $Y$ are independent RVs with $X \sim \mathrm{Pois}(\lambda_1)$ and $Y \sim \mathrm{Pois}(\lambda_2)$, respectively. Then
$$G_{X+Y}(z) = G_X(z) G_Y(z) = e^{\lambda_1 (z - 1)} e^{\lambda_2 (z - 1)} = e^{(\lambda_1 + \lambda_2)(z - 1)}.$$
Hence $X + Y \sim \mathrm{Pois}(\lambda_1 + \lambda_2)$ is again a Poisson RV, but with parameter $\lambda_1 + \lambda_2$.
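A simulation sketch (not part of the original slides) comparing the empirical distribution of $X + Y$ with the $\mathrm{Pois}(\lambda_1 + \lambda_2)$ PMF; the rates, sample size and seed are arbitrary choices.

```python
# Sketch: simulate X ~ Pois(l1), Y ~ Pois(l2) and compare the empirical PMF of
# X + Y with the Pois(l1 + l2) PMF.
from collections import Counter
from math import exp, factorial
import numpy as np

rng = np.random.default_rng(0)
l1, l2, n = 2.0, 3.0, 100_000
total = rng.poisson(l1, n) + rng.poisson(l2, n)
counts = Counter(total.tolist())

for k in range(10):
    empirical = counts[k] / n
    exact = (l1 + l2) ** k * exp(-(l1 + l2)) / factorial(k)
    print(k, round(empirical, 4), round(exact, 4))
```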

PGF example: Uniform RVs

Consider the case of two fair dice with IID outcomes $X$ and $Y$, respectively, so that $X \sim U(1, 6)$ and $Y \sim U(1, 6)$. Let the total score be $Z = X + Y$ and consider the probability generating function of $Z$ given by $G_Z(z) = G_X(z) G_Y(z)$. Then
$$G_Z(z) = \frac{1}{6}(z + z^2 + \cdots + z^6) \cdot \frac{1}{6}(z + z^2 + \cdots + z^6) = \frac{1}{36}\left[z^2 + 2z^3 + 3z^4 + 4z^5 + 5z^6 + 6z^7 + 5z^8 + 4z^9 + 3z^{10} + 2z^{11} + z^{12}\right].$$

(Plots: the uniform PMFs of $X$ and $Y$ on $\{1, \ldots, 6\}$, each with mass $1/6$, and the triangular PMF of $Z$ on $\{2, \ldots, 12\}$ with peak $6/36$ at $k = 7$.)
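Multiplying the two PGFs is the same as convolving the two PMFs, and the coefficients can be recovered mechanically, as in this sketch (not part of the original slides).

```python
# Sketch: convolve the PMFs of two fair dice (i.e. multiply their PGFs as
# polynomials) to recover the distribution of the total score Z.
from fractions import Fraction

die = {k: Fraction(1, 6) for k in range(1, 7)}   # PGF coefficients of one die

pmf_z = {}
for i, pi in die.items():
    for j, pj in die.items():
        pmf_z[i + j] = pmf_z.get(i + j, Fraction(0)) + pi * pj

for k in range(2, 13):
    print(k, pmf_z[k])   # 2: 1/36, 3: 1/18, ..., 7: 1/6, ..., 12: 1/36
```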

    Elementary stochastic processes

Random walks

Consider a sequence $Y_1, Y_2, \ldots$ of independent and identically distributed (IID) RVs with $P(Y_i = 1) = p$ and $P(Y_i = -1) = 1 - p$, with $p \in [0, 1]$.

Definition (Simple random walk)
The simple random walk is a sequence of RVs $\{X_n \mid n \in \{1, 2, \ldots\}\}$ defined by
$$X_n = X_0 + Y_1 + Y_2 + \cdots + Y_n$$
where $X_0 \in \mathbb{R}$ is the starting value.

Definition (Simple symmetric random walk)
A simple symmetric random walk is a simple random walk with the choice $p = 1/2$.

(Plot: a sample path of a simple random walk started from $X_0 = 2$ over the first few steps.)
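A minimal sketch (not part of the original slides) simulating one path of a simple random walk; the starting value, step probability and seed are arbitrary.

```python
# Sketch: simulate one path of a simple random walk.
import random

def simple_random_walk(x0, p, n_steps, seed=1):
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        step = 1 if rng.random() < p else -1   # +1 with probability p, else -1
        path.append(path[-1] + step)
    return path

print(simple_random_walk(x0=2, p=0.5, n_steps=10))
```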

Examples

Practical examples of random walks abound across the physical sciences (motion of atomic particles) and the non-physical sciences (epidemics, gambling, asset prices).

The following is a simple model for the operation of a casino. Suppose that a gambler enters with a capital of $X_0$. At each stage the gambler places a stake of 1 and with probability $p$ wins the gamble, otherwise the stake is lost. If the gambler wins, the stake is returned together with an additional sum of 1. Thus at each stage the gambler's capital increases by 1 with probability $p$ or decreases by 1 with probability $1 - p$.

The gambler's capital $X_n$ at stage $n$ thus follows a simple random walk, except that the gambler is bankrupt if $X_n$ reaches 0 and then cannot continue to any further stages.

Returning to the starting state for a simple random walk

Let $X_n$ be a simple random walk and
$$r_n = P(X_n = X_0) \quad \text{for } n = 1, 2, \ldots$$
the probability of returning to the starting state at time $n$. We will show the following theorem.

Theorem
If $n$ is odd then $r_n = 0$; else if $n = 2m$ is even then
$$r_{2m} = \binom{2m}{m} p^m (1 - p)^m.$$

Proof.
The position of the random walk will change by an amount
$$X_n - X_0 = Y_1 + Y_2 + \cdots + Y_n$$
between times 0 and $n$. Hence, for this change $X_n - X_0$ to be 0 there must be an equal number of up steps as down steps. This can never happen if $n$ is odd, and so $r_n = 0$ in this case. If $n = 2m$ is even then note that the number of up steps in a total of $n$ steps is a binomial RV with parameters $2m$ and $p$. Thus,
$$r_{2m} = P(X_n - X_0 = 0) = \binom{2m}{m} p^m (1 - p)^m.$$

This result tells us about the probability of returning to the starting state at a given time $n$. We will now look at the probability that we ever return to our starting state. For convenience, and without loss of generality, we shall take our starting value as $X_0 = 0$ from now on.
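A simulation sketch (not part of the original slides) comparing the formula $r_{2m} = \binom{2m}{m} p^m (1-p)^m$ with the empirical frequency of $X_{2m} = X_0$; the values of $p$, $m$, the number of trials and the seed are arbitrary.

```python
# Sketch: empirical check of the return probability r_{2m}.
from math import comb
import random

p, m, trials = 0.4, 5, 100_000
rng = random.Random(0)

hits = 0
for _ in range(trials):
    position = 0
    for _ in range(2 * m):
        position += 1 if rng.random() < p else -1
    hits += (position == 0)

print(hits / trials, comb(2 * m, m) * p**m * (1 - p)**m)  # roughly equal
```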

Recurrence and transience of simple random walks

Note first that $E(Y_i) = p - (1 - p) = 2p - 1$ for each $i \in \{1, 2, \ldots\}$. Thus there is a net drift upwards if $p > 1/2$ and a net drift downwards if $p < 1/2$. Only in the case $p = 1/2$ is there no net drift upwards nor downwards.

We say that the simple random walk is recurrent if it is certain to revisit its starting state at some time in the future, and transient otherwise. We shall prove the following theorem.

Theorem
For a simple random walk with starting state $X_0 = 0$, the probability of revisiting the starting state is
$$P(X_n = 0 \text{ for some } n \in \{1, 2, \ldots\}) = 1 - |2p - 1|.$$
Thus a simple random walk is recurrent only when $p = 1/2$.

Proof, ctd

Define generating functions for the sequences $r_n$ and $f_n$ (where $f_n$ is the probability that the walk first returns to its starting state at time $n$) by
$$R(z) = \sum_{n=0}^{\infty} r_n z^n \quad \text{and} \quad F(z) = \sum_{n=0}^{\infty} f_n z^n$$
where $r_0 = 1$ and $f_0 = 0$, and take $|z| < 1$. Decomposing a return at time $n$ according to the time $m$ of the first return, we have that
$$\sum_{n=1}^{\infty} r_n z^n = \sum_{n=1}^{\infty} \sum_{m=1}^{n} f_m r_{n-m} z^n = \sum_{m=1}^{\infty} \sum_{n=m}^{\infty} f_m z^m r_{n-m} z^{n-m} = \sum_{m=1}^{\infty} f_m z^m \sum_{k=0}^{\infty} r_k z^k = F(z) R(z).$$
The left hand side is $R(z) - r_0 z^0 = R(z) - 1$, thus we have that
$$R(z) = R(z) F(z) + 1 \quad \text{if } |z| < 1.$$

Proof, ctd

Now,
$$R(z) = \sum_{n=0}^{\infty} r_n z^n = \sum_{m=0}^{\infty} r_{2m} z^{2m} \quad \text{as } r_n = 0 \text{ if } n \text{ is odd}$$
$$= \sum_{m=0}^{\infty} \binom{2m}{m} \big(p(1 - p)z^2\big)^m = \big(1 - 4p(1 - p)z^2\big)^{-\frac{1}{2}}.$$
The last step follows from the binomial series expansion of $(1 - 4\theta)^{-\frac{1}{2}}$ and the choice $\theta = p(1 - p)z^2$.

Hence,
$$F(z) = 1 - \big(1 - 4p(1 - p)z^2\big)^{\frac{1}{2}} \quad \text{for } |z| < 1.$$

Mean return time

Consider the recurrent case when $p = 1/2$ and set
$$T = \min\{n \geq 1 \mid X_n = 0\} \quad \text{so that } P(T = n) = f_n,$$
where $T$ is the time of the first return to the starting state. Then
$$E(T) = \sum_{n=1}^{\infty} n f_n = G'_T(1)$$
where $G_T(z)$ is the PGF of the RV $T$, and for $p = 1/2$ we have that $4p(1 - p) = 1$, so
$$G_T(z) = 1 - (1 - z^2)^{\frac{1}{2}}$$
so that
$$G'_T(z) = z(1 - z^2)^{-\frac{1}{2}} \to \infty \quad \text{as } z \to 1.$$
Thus, the simple symmetric random walk ($p = 1/2$) is recurrent but the expected time to first return to the starting state is infinite.

The Gambler's ruin problem

We now consider a variant of the simple random walk. Consider two players A and B with a joint capital between them of $N$. Suppose that initially A has $X_0 = a$ ($0 \leq a \leq N$).

At each time step player B gives A 1 with probability $p$, and with probability $q = 1 - p$ player A gives 1 to B instead. The outcomes at each time step are independent. The game ends at the first time $T_a$ at which either $X_{T_a} = 0$ or $X_{T_a} = N$, for some $T_a \in \{0, 1, \ldots\}$.

We can think of A's wealth, $X_n$, at time $n$ as a simple random walk on the states $\{0, 1, \ldots, N\}$ with absorbing barriers at 0 and $N$. Define the probability of ruin for gambler A as
$$\rho_a = P(\text{A is ruined}) = P(\text{B wins}) \quad \text{for } 0 \leq a \leq N.$$

(Plot: a sample path with $N = 5$ and $X_0 = a = 2$ that is absorbed at 0 at time $T_2 = 4$, so $X_{T_2} = X_4 = 0$.)

Solution of the Gambler's ruin problem

Theorem
The probability of ruin when A starts with an initial capital of $a$ is given by
$$\rho_a = \begin{cases} \dfrac{\theta^a - \theta^N}{1 - \theta^N} & \text{if } p \neq q \\[1ex] 1 - \dfrac{a}{N} & \text{if } p = q = 1/2 \end{cases}$$
where $\theta = q/p$.

(Plot: for illustration, graphs of $\rho_a$ against $a$ for $N = 100$ and three possible choices of $p$: $p = 0.49$, $p = 0.5$ and $p = 0.51$.)
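A sketch (not part of the original slides) comparing the ruin-probability formula with a Monte Carlo estimate; the values of $a$, $N$, $p$, the number of trials and the seed are arbitrary.

```python
# Sketch: Gambler's Ruin -- formula vs. simulation.
import random

def ruin_probability(a, N, p):
    q = 1 - p
    if p == q:
        return 1 - a / N
    theta = q / p
    return (theta**a - theta**N) / (1 - theta**N)

def simulate_ruin(a, N, p, trials=20_000, seed=0):
    rng = random.Random(seed)
    ruined = 0
    for _ in range(trials):
        x = a
        while 0 < x < N:
            x += 1 if rng.random() < p else -1
        ruined += (x == 0)
    return ruined / trials

a, N, p = 6, 20, 0.48
print(ruin_probability(a, N, p), simulate_ruin(a, N, p))  # roughly equal
```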

Proof

Consider what happens at the first time step:
$$\rho_a = P(\text{ruin} \cap Y_1 = +1 \mid X_0 = a) + P(\text{ruin} \cap Y_1 = -1 \mid X_0 = a) = p\, P(\text{ruin} \mid X_0 = a + 1) + q\, P(\text{ruin} \mid X_0 = a - 1) = p \rho_{a+1} + q \rho_{a-1}.$$
Now look for a solution to this difference equation with boundary conditions $\rho_0 = 1$ and $\rho_N = 0$. Try a solution of the form $\rho_a = \theta^a$ to give
$$\theta^a = p \theta^{a+1} + q \theta^{a-1}.$$
Hence,
$$p\theta^2 - \theta + q = 0$$
with solutions $\theta = 1$ and $\theta = q/p$.

Proof, ctd

If $p \neq q$ there are two distinct solutions and the general solution of the difference equation is of the form $\rho_a = A + B(q/p)^a$. Applying the boundary conditions
$$1 = \rho_0 = A + B \quad \text{and} \quad 0 = \rho_N = A + B(q/p)^N$$
we get
$$A = -B(q/p)^N \quad \text{and} \quad 1 = B - B(q/p)^N$$
so
$$B = \frac{1}{1 - (q/p)^N} \quad \text{and} \quad A = \frac{-(q/p)^N}{1 - (q/p)^N}.$$
Hence,
$$\rho_a = \frac{(q/p)^a - (q/p)^N}{1 - (q/p)^N}.$$

Proof, ctd

If $p = q = 1/2$ then the general solution is $\rho_a = C + Da$. So with the boundary conditions
$$1 = \rho_0 = C + D \cdot 0 \quad \text{and} \quad 0 = \rho_N = C + D \cdot N.$$
Therefore,
$$C = 1 \quad \text{and} \quad 0 = 1 + DN$$
so
$$D = -1/N$$
and
$$\rho_a = 1 - a/N.$$

Mean duration time

Set $T_a$ as the time to be absorbed at either 0 or $N$ starting from the initial state $a$, and write $\mu_a = E(T_a)$. Then, conditioning on the first step as before,
$$\mu_a = 1 + p \mu_{a+1} + q \mu_{a-1} \quad \text{for } 1 \leq a \leq N - 1$$
and $\mu_0 = \mu_N = 0$. It can be shown that $\mu_a$ is given by
$$\mu_a = \begin{cases} \dfrac{1}{p - q}\left(N\, \dfrac{(q/p)^a - 1}{(q/p)^N - 1} - a\right) & \text{if } p \neq q \\[1ex] a(N - a) & \text{if } p = q = 1/2. \end{cases}$$
We skip the proof here but note that the following cases can be used to establish the result.

Case $p \neq q$: trying a particular solution of the form $\mu_a = ca$ shows that $c = 1/(q - p)$, and the general solution is then of the form $\mu_a = A + B(q/p)^a + a/(q - p)$. Fixing the boundary conditions gives the result.

Case $p = q = 1/2$: now the particular solution is $-a^2$, so the general solution is of the form $\mu_a = A + Ba - a^2$, and fixing the boundary conditions gives the result.
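For the symmetric case the formula $\mu_a = a(N - a)$ is easy to check by simulation, as in this sketch (not part of the original slides); the values of $a$, $N$ and the trial count are arbitrary.

```python
# Sketch: for p = 1/2 the mean duration should be a*(N - a).
import random

def mean_duration(a, N, p, trials=20_000, seed=0):
    rng = random.Random(seed)
    total_steps = 0
    for _ in range(trials):
        x, steps = a, 0
        while 0 < x < N:
            x += 1 if rng.random() < p else -1
            steps += 1
        total_steps += steps
    return total_steps / trials

a, N = 4, 10
print(mean_duration(a, N, 0.5), a * (N - a))  # both ~24
```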

Properties of discrete RVs

RV, X      Parameters                    Im(X)           P(X = k)                   E(X)       Var(X)        G_X(z)
Bernoulli  p in [0,1]                    {0, 1}          1-p if k=0, p if k=1       p          p(1-p)        1 - p + pz
Bin(n, p)  n in {1,2,...}, p in [0,1]    {0, 1, ..., n}  C(n,k) p^k (1-p)^(n-k)     np         np(1-p)       (1 - p + pz)^n
Geo(p)     0 < p <= 1                    {1, 2, ...}     p (1-p)^(k-1)              1/p        (1-p)/p^2     pz / (1 - (1-p)z)
U(1, n)    n in {1,2,...}                {1, 2, ..., n}  1/n                        (n+1)/2    (n^2-1)/12    z(1 - z^n) / (n(1 - z))
Pois(λ)    λ > 0                         {0, 1, ...}     λ^k e^(-λ) / k!            λ          λ             e^(λ(z-1))

Properties of continuous RVs

RV, X      Parameters         Im(X)   f_X(x)                                 E(X)       Var(X)
U(a, b)    a, b in R, a < b   (a, b)  1/(b - a)                              (a+b)/2    (b-a)^2/12
Exp(λ)     λ > 0              R+      λ e^(-λx)                              1/λ        1/λ^2
N(μ, σ²)   μ in R, σ² > 0     R       (1/sqrt(2πσ²)) e^(-(x-μ)²/(2σ²))       μ          σ²

Notation

Ω            sample space of possible outcomes
F            event space: set of random events E
I(E)         indicator function of the event E ∈ F
P(E)         probability that event E occurs, e.g. E = {X = k}
RV           random variable
X ~ U(0,1)   RV X has the distribution U(0,1)
P(X = k)     probability mass function of RV X
F_X(x)       distribution function, F_X(x) = P(X ≤ x)
f_X(x)       density of RV X given, when it exists, by F'_X(x)
PGF          probability generating function G_X(z) for RV X
E(X)         expected value of RV X
E(X^n)       n-th moment of RV X, for n = 1, 2, ...
Var(X)       variance of RV X
IID          independent, identically distributed
X̄_n          sample mean of random sample X_1, X_2, ..., X_n