Australian School of Business
Probability and Statistics: Solutions Week 5
1. We are given that $X \sim \text{Exp}(1/5000)$. Thus, $E[X] = 5000$ and $Var(X) = 5000^2$. Let $S = X_1 + \ldots + X_{100}$. Then $E[S] = 100(5000) = 500{,}000$ and $Var(S) = 100(5000)^2$. Thus, using the central limit theorem, we have:
$$\Pr(S > 100(5050)) = \Pr\left(\frac{S - E[S]}{\sqrt{Var(S)}} > \frac{100(50)}{10(5000)}\right) \approx \Pr(Z > 0.10) = 1 - 0.5398 = 0.4602.$$
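As a quick numerical check, the following Python sketch (assuming numpy and scipy are available) reproduces the normal approximation and confirms it by simulation:

```python
# Sketch: CLT approximation for the sum of 100 Exp(1/5000) variables,
# compared with a Monte Carlo estimate.
import numpy as np
from scipy import stats

mu, n = 5000.0, 100
z = (100 * 5050 - n * mu) / np.sqrt(n * mu**2)   # standardised threshold
print(1 - stats.norm.cdf(z))                      # 0.4602 (CLT value)

rng = np.random.default_rng(0)
S = rng.exponential(scale=mu, size=(100_000, n)).sum(axis=1)
print((S > 100 * 5050).mean())                    # ~0.46 by simulation
```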
2. To find an estimator for $\theta$ using the method of moments, set $E[X] = \bar{X}$. With $f_X(x) = 2(\theta - x)/\theta^2$ for $0 \le x \le \theta$ (and zero otherwise), we have:
$$\bar{X} = E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^{\theta} x\,\frac{2(\theta - x)}{\theta^2}\,dx = \frac{2}{\theta^2}\int_0^{\theta}\left(\theta x - x^2\right)dx = \frac{2}{\theta^2}\left[\frac{\theta x^2}{2} - \frac{x^3}{3}\right]_0^{\theta} = \frac{2}{\theta^2}\left(\frac{\theta^3}{2} - \frac{\theta^3}{3}\right) = \frac{\theta}{3}.$$
Hence, the method of moments estimate is $\hat{\theta} = 3\bar{X}$.
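A short simulation sketch can confirm $\hat\theta = 3\bar X$; it samples from $f_X$ by inverse-CDF (the parameter value is illustrative):

```python
# Sketch: F(x) = 1 - (1 - x/theta)^2 on (0, theta), so
# F^{-1}(u) = theta * (1 - sqrt(1 - u)); check 3 * sample mean ~ theta.
import numpy as np

theta = 4.0
rng = np.random.default_rng(1)
u = rng.uniform(size=100_000)
x = theta * (1 - np.sqrt(1 - u))
print(3 * x.mean())   # close to theta = 4
```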
3. To prove $\bar{X}_n \xrightarrow{p} \mu$ (convergence in probability), we show that for any $\varepsilon > 0$ we must have:
$$\Pr\left(\left|\bar{X}_n - \mu\right| > \varepsilon\right) \to 0, \quad \text{as } n \to \infty,$$
or, equivalently:
$$\lim_{n\to\infty}\Pr\left(\left|\bar{X}_n - \mu\right| > \varepsilon\right) = 0.$$
First, note that we have:
$$E\left[\bar{X}_n\right] = \mu \quad \text{and} \quad Var\left(\bar{X}_n\right) = \frac{1}{n^2}\sum_{k=1}^{n}\sigma_k^2.$$
Applying Chebyshev's inequality:
$$\Pr\left(\left|\bar{X}_n - \mu\right| > \varepsilon\right) \le \frac{1}{\varepsilon^2}\cdot\frac{1}{n^2}\sum_{k=1}^{n}\sigma_k^2.$$
Taking limits on both sides:
$$\lim_{n\to\infty}\Pr\left(\left|\bar{X}_n - \mu\right| > \varepsilon\right) \le \lim_{n\to\infty}\frac{1}{\varepsilon^2}\cdot\frac{1}{n^2}\sum_{k=1}^{n}\sigma_k^2 = \frac{1}{\varepsilon^2}\cdot\underbrace{\lim_{n\to\infty}\frac{1}{n^2}\sum_{k=1}^{n}\sigma_k^2}_{=0} = 0.$$
The underbraced limit is zero because the variances are uniformly bounded: if $\sigma_k^2 \le M$ for all $k$, then $\frac{1}{n^2}\sum_{k=1}^{n}\sigma_k^2 \le M/n \to 0$.
Thus, the result follows.
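To illustrate why the bounded-variance condition matters, here is a small sketch with unequal but bounded $\sigma_k$ (the values are illustrative):

```python
# Sketch: with bounded variances, Var(Xbar_n) = (1/n^2) * sum(sigma_k^2)
# shrinks like 1/n, so the sample mean concentrates around mu.
import numpy as np

rng = np.random.default_rng(1)
mu = 2.0
for n in (10, 1_000, 100_000):
    sigma = rng.uniform(0.5, 3.0, size=n)   # bounded sigma_k
    xbar = rng.normal(mu, sigma).mean()
    print(n, abs(xbar - mu))                # typically shrinks with n
```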
4. Let $L$ be the location after one hour (i.e., 60 minutes). Therefore:
$$L = X_1 + \ldots + X_{60},$$
where
$$X_k = \begin{cases} 50 \text{ cm}, & \text{w.p. } 1/2 \\ -50 \text{ cm}, & \text{w.p. } 1/2, \end{cases}$$
so that $E[X_k] = 0$ and $Var(X_k) = 2500$. Therefore,
$$E[L] = 0 \quad \text{and} \quad Var(L) = 60(2500) = 150{,}000.$$
Thus, using the central limit theorem, we have:
$$\Pr(L \le x) = \Pr\left(\frac{L - E[L]}{\sqrt{Var(L)}} \le \frac{x}{\sqrt{150{,}000}}\right) \approx \Pr\left(Z \le \frac{x}{100\sqrt{15}}\right).$$
In other words, $L \sim N(0, 150{,}000)$ approximately. The mean of a normal distribution is also its mode, so the most likely position after one hour is 0, the point where he started.
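A quick simulation sketch of the walk (assuming numpy) confirms the mean and variance:

```python
# Sketch: simulate the +/-50 cm random walk for 60 steps and compare
# with the N(0, 150000) approximation.
import numpy as np

rng = np.random.default_rng(2)
steps = rng.choice([50.0, -50.0], size=(100_000, 60))
L = steps.sum(axis=1)
print(L.mean(), L.var())   # ~0 and ~150000
```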
5. Consider $N$ independent random variables, each having a binomial distribution with parameters $n = 3$ and $\theta$, so that $\Pr(X_i = k) = \binom{3}{k}\theta^k(1-\theta)^{3-k}$, for $i = 1, 2, \ldots, N$ and $k = 0, 1, 2, 3$. Assume that of these $N$ random variables, $n_0$ take the value 0, $n_1$ take the value 1, $n_2$ take the value 2, and $n_3$ take the value 3, with $N = n_0 + n_1 + n_2 + n_3$.
(a) The likelihood function is given by:
$$L(\theta; x) = \prod_{i=1}^{N} f_X(x_i) = \left[\binom{3}{0}(1-\theta)^3\right]^{n_0}\cdot\left[\binom{3}{1}\theta(1-\theta)^2\right]^{n_1}\cdot\left[\binom{3}{2}\theta^2(1-\theta)\right]^{n_2}\cdot\left[\binom{3}{3}\theta^3\right]^{n_3}.$$
The log-likelihood function is given by:
$$\ell(\theta; x) = \log(L(\theta; x)) = \sum_{i=1}^{N}\log(f_X(x_i)) \overset{*}{=} n_0\left[\log\binom{3}{0} + 3\log(1-\theta)\right] + n_1\left[\log\binom{3}{1} + \log(\theta) + 2\log(1-\theta)\right] + n_2\left[\log\binom{3}{2} + 2\log(\theta) + \log(1-\theta)\right] + n_3\left[\log\binom{3}{3} + 3\log(\theta)\right],$$
* using $\log(a\cdot b) = \log(a) + \log(b)$ and $\log(a^c\cdot b) = c\log(a) + \log(b)$.
Then, take the FOC of $\ell(\theta; x)$:
$$\frac{\partial\ell(\theta; x)}{\partial\theta} = -\frac{3n_0}{1-\theta} + \frac{n_1}{\theta} - \frac{2n_1}{1-\theta} + \frac{2n_2}{\theta} - \frac{n_2}{1-\theta} + \frac{3n_3}{\theta} = \frac{n_1 + 2n_2 + 3n_3}{\theta} - \frac{3n_0 + 2n_1 + n_2}{1-\theta}.$$
Equating this to zero we obtain:
$$\frac{n_1 + 2n_2 + 3n_3}{\theta} - \frac{3n_0 + 2n_1 + n_2}{1-\theta} = 0,$$
or, equivalently:
$$(n_1 + 2n_2 + 3n_3)(1-\theta) = (3n_0 + 2n_1 + n_2)\,\theta.$$
Thus the maximum likelihood estimator for $\theta$ is:
$$\hat\theta \overset{*}{=} \frac{n_1 + 2n_2 + 3n_3}{(n_1 + 2n_2 + 3n_3) + (3n_0 + 2n_1 + n_2)} = \frac{n_1 + 2n_2 + 3n_3}{3n_0 + 3n_1 + 3n_2 + 3n_3} = \frac{n_1 + 2n_2 + 3n_3}{3N},$$
* using: $\frac{a}{1-a} = \frac{b}{c} \Rightarrow \frac{1}{1/a - 1} = \frac{b}{c} \Rightarrow \frac{1}{a} - 1 = \frac{c}{b} \Rightarrow \frac{1}{a} = \frac{c+b}{b} \Rightarrow a = \frac{b}{b+c}$.
(b) We have:
$$N = 20, \quad n_0 = 11, \quad n_1 = 7, \quad n_2 = 2, \quad n_3 = 0.$$
Thus the ML estimate for $\theta$ is given by:
$$\hat\theta = \frac{n_1 + 2n_2 + 3n_3}{3N} = \frac{7 + 4}{60} = \frac{11}{60} = 0.1833.$$
Thus, the probability of winning any single bet is 0.1833.
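As a sketch (using the counts from part (b)), the closed-form MLE can be checked by maximising the log-likelihood numerically with scipy:

```python
# Sketch: numerical MLE for the binomial(3, theta) counts, vs 11/60.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import binom

counts = np.array([11, 7, 2, 0])   # n0, n1, n2, n3
k = np.arange(4)

def negloglik(theta):
    return -(counts * binom.logpmf(k, 3, theta)).sum()

res = minimize_scalar(negloglik, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(res.x, 11 / 60)              # both ~0.1833
```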
6. (a) The likelihood function is given by:
$$L(\alpha; y, A) = \prod_{i=1}^{n} f_Y(y_i) = \prod_{i=1}^{n}\frac{\alpha A^{\alpha}}{y_i^{\alpha+1}} = \frac{\alpha^n A^{n\alpha}}{\left(\prod_{i=1}^{n}y_i\right)^{\alpha+1}} = \frac{\alpha^n A^{n\alpha}}{\left(\prod_{i=1}^{n}y_i^{1/n}\right)^{n(\alpha+1)}} = \frac{\alpha^n A^{n\alpha}}{G^{n(\alpha+1)}},$$
where $G = \left(\prod_{i=1}^{n}y_i\right)^{1/n}$ denotes the geometric mean of the observations.
(b) In the lecture we have seen that:
$$\pi(\alpha|y; A) = f_{\Theta|Y}(\alpha|y; A) \overset{*}{=} \frac{f_{Y|\Theta}(y|\alpha; A)\,\pi(\alpha)}{\int_{-\infty}^{\infty} f_{Y|\Theta}(y|\alpha; A)\,\pi(\alpha)\,d\alpha} \overset{**}{=} \frac{f_{Y|\Theta}(y|\alpha; A)\,\pi(\alpha)}{f_Y(y; A)} \overset{***}{\propto} f_{Y|\Theta}(y|\alpha; A)\,\pi(\alpha),$$
* using Bayes' formula: $\Pr(A_i|B) = \frac{\Pr(B|A_i)\cdot\Pr(A_i)}{\sum_j \Pr(B|A_j)\cdot\Pr(A_j)}$, where the sets $A_i$ (here playing the role of the prior $\pi(\alpha)$), $i = 1, \ldots, n$, form a complete partition of the sample space;
** using the law of total probability: $\Pr(A) = \sum_i \Pr(A|B_i)\cdot\Pr(B_i)$ if the $B_i$ (here $\pi(\alpha)$), $i = 1, \ldots, n$, form a complete partition of the sample space;
*** using that $f_Y(y; A)$ is, given the data, a known constant.
(c) We have that the posterior density is given by:
$$\pi(\alpha|y; A) = f_{\Theta|Y}(\alpha|y; A) \propto f_{Y|\Theta}(y|\alpha; A)\,\pi(\alpha) \overset{*}{=} \pi(\alpha)\prod_{i=1}^{n}f_Y(y_i; A) = L(\alpha; y, A)\cdot\pi(\alpha) \propto L(\alpha; y, A)\cdot\frac{1}{\alpha}$$
$$= \frac{\alpha^{n-1}A^{n\alpha}}{G^{n(\alpha+1)}} = \alpha^{n-1}\left(\frac{A}{G}\right)^{n\alpha}\cdot\frac{1}{G^n} = \alpha^{n-1}\left(\frac{G}{A}\right)^{-n\alpha}\cdot\frac{1}{G^n} \overset{**}{\propto} \alpha^{n-1}\left(\frac{G}{A}\right)^{-n\alpha} = \alpha^{n-1}\exp\left(\log\left(\frac{G}{A}\right)^{-n\alpha}\right) = \alpha^{n-1}\exp\left(-n\alpha\log\frac{G}{A}\right) = \alpha^{n-1}\exp(-n\alpha a),$$
where $a = \log(G/A)$;
* using independence between $f_{Y|\Theta}(y_i|\alpha; A)$ and $f_{Y|\Theta}(y_j|\alpha; A)$ for $i \ne j$;
** using that $1/G^n$ is a known constant.
(d) We have that $\pi(\alpha|y; A) \propto \alpha^{n-1}\exp(-n\alpha a)$ or, equivalently, there exists some constant $c \in \mathbb{R}$ for which $\pi(\alpha|y; A) = c\cdot\alpha^{n-1}\exp(-n\alpha a)$. We need to determine the constant $c$. We know that $\int_{-\infty}^{\infty}\pi(\alpha|y; A)\,d\alpha = 1$, because otherwise it is not a posterior density.
Given this observation, we compare $c\cdot\alpha^{n-1}\exp(-n\alpha a)$ with the p.d.f. of $X \sim \text{Gamma}(\alpha_x, \lambda_x)$, which is given by:
$$f_X(x) = \frac{\lambda_x^{\alpha_x}}{\Gamma(\alpha_x)}\cdot x^{\alpha_x - 1}\cdot e^{-\lambda_x x}.$$
Now, substitute $x = \alpha$, $\alpha_x = n$, $\lambda_x = an$, and $c = \lambda_x^{\alpha_x}/\Gamma(\alpha_x) = (an)^n/\Gamma(n)$. Then we have the density of a Gamma$(n, an)$ distribution. Hence, the posterior density is given by:
$$\pi(\alpha|y; A) = \frac{(an)^n}{\Gamma(n)}\cdot\alpha^{n-1}\cdot e^{-an\alpha}, \quad \text{for } 0 < \alpha < \infty,$$
and zero otherwise.
(e) The Bayesian estimator of $\alpha$ is the expected value of the posterior. The posterior has a $Z \sim \text{Gamma}(n, an)$ distribution, and $E[Z] = \frac{n}{na}$. Thus:
$$\hat\alpha_B = E[\alpha|y; A] = \frac{n}{na} = \frac{1}{a}.$$
Thus the Bayesian estimator of $\alpha$ is $1/a$, with $a = \log(G/A)$.
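A simulation sketch can check this: sample Pareto data, form $a = \log(G/A)$, and draw from the Gamma$(n, na)$ posterior (the parameter values are illustrative):

```python
# Sketch: posterior Gamma(n, n*a) with a = log(G/A); its mean is 1/a,
# which should sit near the true alpha for a large sample.
import numpy as np

A, alpha_true, n_obs = 1.0, 2.5, 500
rng = np.random.default_rng(3)
y = A * (1 - rng.uniform(size=n_obs)) ** (-1 / alpha_true)  # Pareto draws
a = np.log(y).mean() - np.log(A)                            # log(G/A)
post = rng.gamma(shape=n_obs, scale=1 / (n_obs * a), size=100_000)
print(post.mean(), 1 / a)   # both ~alpha_true
```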
7. We use moment generating functions to show that:
(a) The binomial tends to the Poisson. Let $X \sim \text{Binomial}(n, p)$. Its m.g.f. is therefore:
$$M_X(t) = \left(1 - p + pe^t\right)^n; \quad \text{let } np = \lambda \text{ so that } p = \lambda/n:$$
$$M_X(t) = \left(1 - \frac{\lambda}{n} + \frac{\lambda}{n}e^t\right)^n = \left(1 + \frac{\lambda(e^t - 1)}{n}\right)^n,$$
and by taking the limit on both sides, we have:
$$\lim_{n\to\infty}M_X(t) = \lim_{n\to\infty}\left(1 + \frac{\lambda(e^t - 1)}{n}\right)^n = \exp\left(\lambda\left(e^t - 1\right)\right),$$
which is the moment generating function of a Poisson with mean $\lambda$.
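A short sketch (assuming scipy) shows the pmfs converging as $n$ grows with $np = \lambda$ fixed:

```python
# Sketch: Binomial(n, lambda/n) pmf approaches Poisson(lambda) pmf.
import numpy as np
from scipy import stats

lam, k = 3.0, np.arange(10)
for n in (10, 100, 10_000):
    err = np.abs(stats.binom.pmf(k, n, lam / n) - stats.poisson.pmf(k, lam)).max()
    print(n, err)   # maximum pmf error shrinks toward 0
```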
(b) The gamma, properly standardized, tends to the normal. Let $X \sim \text{Gamma}(\alpha, \beta)$, so that its density is of the form:
$$f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}x^{\alpha-1}e^{-\beta x}, \quad \text{for } x \ge 0,$$
and zero otherwise, and its m.g.f. is:
$$M_X(t) = \left(\frac{\beta}{\beta - t}\right)^{\alpha}.$$
Its mean and variance are, respectively, $\alpha/\beta$ and $\alpha/\beta^2$. These results have been derived in the week 2 lecture. Consider the standardized Gamma random variable:
$$Y = \frac{X - E[X]}{\sqrt{Var(X)}} = \frac{X - \alpha/\beta}{\sqrt{\alpha/\beta^2}} = \frac{\beta X - \alpha}{\sqrt{\alpha}} = \frac{\beta X}{\sqrt{\alpha}} - \sqrt{\alpha}.$$
Its moment generating function is:
$$M_Y(t) = e^{-\sqrt{\alpha}t}E\left[e^{\frac{\beta X}{\sqrt{\alpha}}t}\right] = e^{-\sqrt{\alpha}t}M_X\left(\frac{\beta t}{\sqrt{\alpha}}\right) = e^{-\sqrt{\alpha}t}\left(\frac{\beta}{\beta - \beta t/\sqrt{\alpha}}\right)^{\alpha} = e^{-\sqrt{\alpha}t}e^{-\alpha\log\left(1 - t/\sqrt{\alpha}\right)}$$
$$= \exp\left(-\sqrt{\alpha}t - \alpha\left(-t/\sqrt{\alpha} - \frac{1}{2}\left(t/\sqrt{\alpha}\right)^2 + R\right)\right), \quad \text{where } R \text{ is the Taylor series remainder term},$$
$$= \exp\left(\frac{1}{2}t^2 + R'\right),$$
where $R'$ involves powers of $1/\sqrt{\alpha}$. Thus in the limit, $M_Y(t) \to \exp\left(\frac{1}{2}t^2\right)$ as $\alpha \to \infty$, which is the m.g.f. of a standard normal random variable.
8. If the law of large numbers held here, the sample mean $\bar{X}$ would approach the mean of $X$, but that mean does not exist for the Cauchy distribution. This is not a violation of the law of large numbers: the theorem assumes a finite mean, and that assumption fails for the Cauchy distribution, so the law of large numbers simply does not apply.
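A sketch of the running sample mean of Cauchy draws illustrates the failure to settle down:

```python
# Sketch: running means of standard Cauchy samples keep jumping;
# there is no value for them to converge to.
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_cauchy(1_000_000)
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
print(running_mean[[999, 99_999, 999_999]])   # no convergence
```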
9. We are given $n$ realizations $x_i$, $i = 1, 2, \ldots, n$. We know that $x_i|p \sim \text{Ber}(p)$ and $p \sim U(0, 1)$. We are asked to find the Bayesian estimators for $p$ and $p(1-p)$. Since the $n$ random variables are conditionally independent given $p$:
$$f(x_1, x_2, \ldots, x_n|p) = \prod_{i=1}^{n}f(x_i|p) = p^{\sum_{i=1}^{n}x_i}(1-p)^{n - \sum_{i=1}^{n}x_i}.$$
Since the prior density of $p$ equals one on $(0, 1)$, the joint density of the $x_i$ and $p$ is:
$$f(x_1, x_2, \ldots, x_n, p) = p^{\sum_{i=1}^{n}x_i}(1-p)^{n - \sum_{i=1}^{n}x_i}.$$
Then we can compute the joint density for the $x_i$, $i = 1, 2, \ldots, n$:
$$f(x_1, x_2, \ldots, x_n) = \int_0^1 p^{\sum_{i=1}^{n}x_i}(1-p)^{n - \sum_{i=1}^{n}x_i}\,dp = \frac{\Gamma\left(\sum_{i=1}^{n}x_i + 1\right)\Gamma\left(n - \sum_{i=1}^{n}x_i + 1\right)}{\Gamma(n + 2)}.$$
(a) Method 1: Hence we can obtain the posterior density:
$$f(p|x_1, x_2, \ldots, x_n) = \frac{f(x_1, x_2, \ldots, x_n, p)}{f(x_1, x_2, \ldots, x_n)} = \frac{\Gamma(n+2)}{\Gamma\left(\sum_{i=1}^{n}x_i + 1\right)\Gamma\left(n + 1 - \sum_{i=1}^{n}x_i\right)}\,p^{\sum_{i=1}^{n}x_i}(1-p)^{n - \sum_{i=1}^{n}x_i},$$
which is the probability density function of Beta$\left(\sum_{i=1}^{n}x_i + 1,\; n + 1 - \sum_{i=1}^{n}x_i\right)$.
Method 2: Observe that, as a function of $p$, the joint density $f(x_1, x_2, \ldots, x_n, p)$ is proportional to the p.d.f. of a Beta distribution, and use this to find the distribution of $f(p|x_1, x_2, \ldots, x_n)$. Hence, we have $f(p|x_1, x_2, \ldots, x_n) \propto f_Y(p)$, where $Y \sim$ Beta$\left(\sum_{i=1}^{n}x_i + 1,\; n + 1 - \sum_{i=1}^{n}x_i\right)$.
The Bayesian estimator for $p$ will thus be:
$$\hat{p}_B = E[p|X] = \frac{\sum_{i=1}^{n}x_i + 1}{n + 2}.$$
(See Formulae and Tables, page 13.)
(b) Now we wish to find a Bayesian estimator for $p(1-p)$. Using the same idea:
$$\left(\widehat{p(1-p)}\right)_B = E[p(1-p)|X] = \int_0^1 p(1-p)f(p|x_1, x_2, \ldots, x_n)\,dp$$
$$= \frac{\Gamma(n+2)}{\Gamma\left(\sum_{i=1}^{n}x_i + 1\right)\Gamma\left(n + 1 - \sum_{i=1}^{n}x_i\right)}\int_0^1 p^{1 + \sum_{i=1}^{n}x_i}(1-p)^{n + 1 - \sum_{i=1}^{n}x_i}\,dp$$
$$\overset{*}{=} \frac{\Gamma(n+2)}{\Gamma\left(\sum_{i=1}^{n}x_i + 1\right)\Gamma\left(n + 1 - \sum_{i=1}^{n}x_i\right)}\cdot\frac{\Gamma\left(\sum_{i=1}^{n}x_i + 2\right)\Gamma\left(n - \sum_{i=1}^{n}x_i + 2\right)}{\Gamma(n+4)}$$
$$\overset{**}{=} \frac{\Gamma(n+2)}{\Gamma\left(\sum_{i=1}^{n}x_i + 1\right)\Gamma\left(n + 1 - \sum_{i=1}^{n}x_i\right)}\cdot\frac{\left(\sum_{i=1}^{n}x_i + 1\right)\Gamma\left(\sum_{i=1}^{n}x_i + 1\right)\cdot\left(n - \sum_{i=1}^{n}x_i + 1\right)\Gamma\left(n - \sum_{i=1}^{n}x_i + 1\right)}{(n+3)\cdot(n+2)\cdot\Gamma(n+2)}$$
$$= \frac{\left(\sum_{i=1}^{n}x_i + 1\right)\left(n + 1 - \sum_{i=1}^{n}x_i\right)}{(n+3)(n+2)},$$
* using the Beta function: $B(\alpha, \beta) = \frac{\Gamma(\alpha)\cdot\Gamma(\beta)}{\Gamma(\alpha+\beta)} = \int_0^1 x^{\alpha-1}(1-x)^{\beta-1}\,dx$, where $\alpha = \sum_{i=1}^{n}x_i + 2$, $\beta = n - \sum_{i=1}^{n}x_i + 2$, $\alpha + \beta = n + 4$;
** using the Gamma function recursion: $\Gamma(\alpha) = (\alpha - 1)\cdot\Gamma(\alpha - 1)$.
Alternatively, using the first two moments of the Beta distribution (see Formulae and Tables, page 13) we have:
$$\left(\widehat{p(1-p)}\right)_B = E[p(1-p)|X] = E[p|X] - E\left[p^2|X\right] \overset{*}{=} \frac{\sum_{i=1}^{n}x_i + 1}{n+2} - \frac{\Gamma(a+b)\cdot\Gamma(a+2)}{\Gamma(a)\cdot\Gamma(a+b+2)} = \frac{\sum_{i=1}^{n}x_i + 1}{n+2} - \frac{(a+1)\cdot a}{(a+b+1)(a+b)} = \frac{\left(\sum_{i=1}^{n}x_i + 1\right)\left(n + 1 - \sum_{i=1}^{n}x_i\right)}{(n+3)(n+2)},$$
* where $a = \sum_{i=1}^{n}x_i + 1$ and $b = n + 1 - \sum_{i=1}^{n}x_i$.
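Both closed forms can be checked against scipy's Beta distribution with a small sketch (the data vector is illustrative):

```python
# Sketch: posterior is Beta(sum(x)+1, n-sum(x)+1); compare its moments
# with the closed forms for E[p|x] and E[p(1-p)|x].
import numpy as np
from scipy import stats

x = np.array([1, 0, 0, 1, 1, 0, 1, 0, 0, 0])   # illustrative data
n, s = x.size, x.sum()
post = stats.beta(s + 1, n - s + 1)
print(post.mean(), (s + 1) / (n + 2))           # E[p | x]
print(post.mean() - post.moment(2),             # E[p(1-p) | x]
      (s + 1) * (n + 1 - s) / ((n + 3) * (n + 2)))
```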
(c) We are interested in the Bayesian estimator of $p(1-p)$ because $np(1-p)$ is the variance of the binomial distribution (with $n$ a known constant), which we can then use for the normal approximation.
10. The common distribution function is given by:
$$F_X(x) = \int_1^x \alpha u^{-(\alpha+1)}\,du = \left[-u^{-\alpha}\right]_1^x = 1 - x^{-\alpha}, \quad \text{if } x > 1,$$
and zero otherwise. The distribution function of $Y_n$ will be:
$$F_{Y_n}(x) = \Pr(Y_n \le x) = \Pr\left(\frac{1}{n^{1/\alpha}}X_{(n)} \le x\right) = \Pr\left(X_{(n)} \le n^{1/\alpha}x\right) = \left(1 - \left(n^{1/\alpha}x\right)^{-\alpha}\right)^n = \left(1 - \frac{x^{-\alpha}}{n}\right)^n,$$
if $x > 1$ and zero otherwise. Notice that whereas $x > 1$ for the underlying sample, the transformation $Y_n = X_{(n)}/n^{1/\alpha}$ gives support $y > 0$ in the limit; in particular, when $\alpha$ is close to zero, $n^{1/\alpha}$ is large! Taking the limit as $n \to \infty$, we have:
$$\lim_{n\to\infty}F_{Y_n}(x) = \lim_{n\to\infty}\left(1 - \frac{x^{-\alpha}}{n}\right)^n = \exp\left(-x^{-\alpha}\right).$$
Thus the limit exists and $Y_n$ converges in distribution. The limiting distribution function is:
$$F_Y(y) = \exp\left(-y^{-\alpha}\right), \quad \text{for } y > 0,$$
and zero otherwise; the corresponding density is:
$$f_Y(y) = \frac{\partial F_Y(y)}{\partial y} = \alpha y^{-(\alpha+1)}\exp\left(-y^{-\alpha}\right), \quad \text{if } y > 0,$$
and zero otherwise. You can verify that this is a legitimate density: $f_Y(y) \ge 0$ for all $y$, because $\alpha > 0$, $y^{-(\alpha+1)} \ge 0$ and $\exp(-y^{-\alpha}) \ge 0$; and $F_Y(\infty) = \int_{-\infty}^{\infty}f_Y(y)\,dy = \exp(-0) = 1.$
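An empirical sketch of the limit (the parameter values are illustrative; the maximum is sampled directly via the quantile trick $X_{(n)} = F^{-1}(U^{1/n})$):

```python
# Sketch: scaled maxima of samples from F(x) = 1 - x^{-alpha}, x > 1,
# should have limiting CDF exp(-x^{-alpha}).
import numpy as np

alpha, n = 2.0, 10_000
rng = np.random.default_rng(5)
u = rng.uniform(size=200_000)
x_max = (1 - u ** (1 / n)) ** (-1 / alpha)   # F^{-1}(u^{1/n})
y = x_max / n ** (1 / alpha)
x0 = 1.5
print((y <= x0).mean(), np.exp(-(x0 ** -alpha)))   # both ~0.64
```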
11. The mean and the variance of $S$ are, respectively:
$$E[S] = \frac{40}{3} \quad \text{and} \quad Var(S) = \frac{10}{9}.$$
Thus, using the central limit theorem, we have:
$$\Pr(S \le 10) = \Pr\left(\frac{S - E[S]}{\sqrt{Var(S)}} \le \frac{10 - 40/3}{\sqrt{10/9}}\right) \approx \Pr\left(Z \le -\sqrt{10}\right) = \Pr(Z \le -3.16) = 0.0008.$$
12. Note that $X$ can be interpreted as a geometric random variable, where $k$ is the total number of trials. Here $E[X] = 1/p$.
(a) The method of moments estimator is given by:
$$\bar{X} = \frac{1}{p} \quad \Rightarrow \quad \hat{p} = \frac{1}{\bar{X}} = \frac{n}{\sum_{i=1}^{n}X_i}.$$
(b) The likelihood function is:
$$L(p; x) = \prod_{i=1}^{n}f_X(x_i) = \prod_{i=1}^{n}p(1-p)^{x_i - 1} = p^n(1-p)^{\sum_{i=1}^{n}x_i - n}.$$
The log-likelihood function is:
$$\ell(p; x) = \log(L(p; x)) = \sum_{i=1}^{n}\log(f_X(x_i)) = n\log(p) + \left(\sum_{i=1}^{n}x_i - n\right)\log(1-p).$$
Take the FOC of $\ell(p; x)$ with respect to $p$ and equate it to zero:
$$\ell'(p) = \frac{n}{p} - \frac{\sum_{i=1}^{n}x_i - n}{1-p} = 0.$$
Then we obtain the maximum likelihood estimator for $p$:
$$\hat{p} \overset{*}{=} \frac{n}{\sum_{i=1}^{n}X_i - n + n} = \frac{n}{\sum_{i=1}^{n}X_i},$$
* using: $\frac{a}{1-a} = \frac{b}{c} \Rightarrow \frac{1}{1/a - 1} = \frac{b}{c} \Rightarrow \frac{1}{a} - 1 = \frac{c}{b} \Rightarrow \frac{1}{a} = \frac{c+b}{b} \Rightarrow a = \frac{b}{b+c}$.
13. For the Pareto distribution with parameters $x_0$ and $\theta$ we have the following p.d.f.:
$$f(x) = \theta x_0^{\theta}x^{-\theta-1}, \quad x \ge x_0,\; \theta > 1,$$
and zero otherwise. The expected value of the random variable $X$ is then given by:
$$E[X] = \int_{-\infty}^{\infty}x f_X(x)\,dx = \int_{x_0}^{\infty}x\,\theta x_0^{\theta}x^{-\theta-1}\,dx = \theta x_0^{\theta}\int_{x_0}^{\infty}x^{-\theta}\,dx = \theta x_0^{\theta}\left[\frac{x^{1-\theta}}{1-\theta}\right]_{x_0}^{\infty} = -\frac{\theta}{1-\theta}x_0 = \frac{\theta}{\theta-1}x_0.$$
(a) Given $x_0$, we have $E[X] = \frac{\theta}{\theta-1}x_0$, thus:
$$\frac{\theta}{\theta-1}x_0 = \bar{X} \;\Rightarrow\; x_0\theta = \bar{X}(\theta - 1) \;\Rightarrow\; x_0\theta = \bar{X}\theta - \bar{X} \;\Rightarrow\; \bar{X} = \left(\bar{X} - x_0\right)\theta \;\Rightarrow\; \hat\theta = \frac{\bar{X}}{\bar{X} - x_0}.$$
Thus the method of moments estimator of $\theta$ is $\frac{\bar{X}}{\bar{X} - x_0}$.
(b) The likelihood function is given by:
$$L(\theta; x) = \prod_{i=1}^{n}f_X(x_i) = \prod_{i=1}^{n}\theta x_0^{\theta}x_i^{-\theta-1} = \theta^n x_0^{n\theta}\prod_{i=1}^{n}x_i^{-\theta-1}.$$
The log-likelihood function is given by:
$$\ell(\theta; x) = \log(L(\theta; x)) = \sum_{i=1}^{n}\log(f_X(x_i)) = n\log(\theta) + n\theta\log(x_0) - (\theta + 1)\sum_{i=1}^{n}\log(x_i).$$
Take the FOC of $\ell(\theta; x)$ and equate it to zero:
$$\frac{\partial\ell(\theta)}{\partial\theta} = \frac{n}{\theta} + n\log(x_0) - \sum_{i=1}^{n}\log(x_i) = 0 \;\Rightarrow\; \frac{n}{\theta} = -n\log(x_0) + \sum_{i=1}^{n}\log(x_i) \;\Rightarrow\; \hat\theta = \frac{n}{\sum_{i=1}^{n}\log(x_i) - n\log(x_0)}.$$
Thus, the maximum likelihood estimator for $\theta$ is $\frac{n}{\sum_{i=1}^{n}\log(x_i) - n\log(x_0)}$.
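A sketch comparing both estimators on simulated Pareto data (inverse-CDF sampling; the parameter values are illustrative):

```python
# Sketch: method-of-moments vs maximum-likelihood for Pareto(theta, x0),
# with x0 known; both should recover theta_true.
import numpy as np

x0, theta_true = 2.0, 3.0
rng = np.random.default_rng(7)
u = rng.uniform(size=50_000)
x = x0 * (1 - u) ** (-1 / theta_true)    # F(x) = 1 - (x0/x)^theta
xbar = x.mean()
print("MoM:", xbar / (xbar - x0))                                  # ~3
print("MLE:", x.size / (np.log(x).sum() - x.size * np.log(x0)))    # ~3
```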
14. The p.d.f. of a chi-squared distribution with one degree of freedom is:
$$f_Y(y) = \frac{\exp(-y/2)}{\sqrt{2\pi y}}, \quad \text{if } y > 0,$$
and zero otherwise.
ii) Determine the inverse of the transformations (here $U \sim \chi^2_{n_1}$ and $V \sim \chi^2_{n_2}$ are independent, $F = \frac{U/n_1}{V/n_2}$, and $G = V$ is the auxiliary variable):
$$V = G, \quad U = n_1\cdot F\cdot V/n_2 = n_1\cdot F\cdot G/n_2.$$
iii) Calculate the absolute value of the Jacobian of the transformation $(f, g)\mapsto(u, v)$:
$$|J| = \left|\det\begin{pmatrix}\partial u/\partial f & \partial u/\partial g\\[2pt] \partial v/\partial f & \partial v/\partial g\end{pmatrix}\right| = \left|\det\begin{pmatrix}g\cdot\frac{n_1}{n_2} & f\cdot\frac{n_1}{n_2}\\[2pt] 0 & 1\end{pmatrix}\right| = g\cdot\frac{n_1}{n_2}.$$
iv) Determine the joint probability density function of $F$ and $G$:
$$f_{FG}(f, g) = |J|\cdot f_{UV}(u, v) \overset{*}{=} |J|\cdot f_U(u)\cdot f_V(v)$$
$$= \frac{n_1 g}{n_2}\cdot\frac{u^{(n_1-2)/2}}{2^{n_1/2}\Gamma(n_1/2)}\exp(-u/2)\cdot\frac{v^{(n_2-2)/2}}{2^{n_2/2}\Gamma(n_2/2)}\exp(-v/2)$$
$$\overset{**}{=} \frac{n_1 g}{n_2}\cdot\frac{(f n_1 g/n_2)^{(n_1-2)/2}}{2^{n_1/2}\Gamma(n_1/2)}\exp\left(-\frac{f n_1 g}{2n_2}\right)\cdot\frac{g^{(n_2-2)/2}}{2^{n_2/2}\Gamma(n_2/2)}\exp(-g/2)$$
$$\overset{***}{=} \frac{n_1\,(f n_1)^{(n_1-2)/2}\,g^{(n_1+n_2-2)/2}}{n_2^{n_1/2}\cdot 2^{n_1/2}\Gamma(n_1/2)}\cdot\exp\left(-g\left(\frac{1}{2} + \frac{f n_1}{2n_2}\right)\right)\cdot\frac{1}{2^{n_2/2}\Gamma(n_2/2)},$$
* using independence between $U$ and $V$; ** using the inverse transformation determined in step ii); *** using $\exp(ga)\cdot\exp(gb) = \exp(g(a+b))$ and $a^b\cdot a^c = a^{b+c}$.
v) Calculate the marginal distribution of $F$ by integrating over the other variable:
$$f_F(f) = \int_0^{\infty}f_{FG}(f, g)\,dg$$
$$= \frac{1}{2^{n_2/2}\Gamma(n_2/2)}\cdot\frac{n_1(f n_1)^{(n_1-2)/2}}{n_2^{n_1/2}\cdot 2^{n_1/2}\Gamma(n_1/2)}\cdot\int_0^{\infty}g^{(n_1+n_2-2)/2}\exp\left(-g\left(\frac{1}{2} + \frac{f n_1}{2n_2}\right)\right)dg$$
$$\overset{*}{=} \frac{1}{2^{n_2/2}\Gamma(n_2/2)}\cdot\frac{n_1(f n_1)^{(n_1-2)/2}}{n_2^{n_1/2}\cdot 2^{n_1/2}\Gamma(n_1/2)}\cdot\left(\frac{2n_2}{n_2 + f n_1}\right)^{(n_1+n_2-2)/2}\cdot\frac{2n_2}{n_2 + f n_1}\cdot\int_0^{\infty}x^{(n_1+n_2-2)/2}\exp(-x)\,dx$$
$$\overset{**}{=} \frac{1}{2^{n_2/2}\Gamma(n_2/2)}\cdot\frac{n_1(f n_1)^{(n_1-2)/2}}{n_2^{n_1/2}\cdot 2^{n_1/2}\Gamma(n_1/2)}\cdot\left(\frac{2n_2}{n_2 + f n_1}\right)^{(n_1+n_2-2)/2}\cdot\frac{2n_2}{n_2 + f n_1}\cdot\Gamma\left(\frac{n_1+n_2}{2}\right)$$
$$= \frac{1}{2^{(n_1+n_2)/2}}\cdot\frac{f^{(n_1-2)/2}\,n_1^{n_1/2}}{n_2^{n_1/2}}\cdot\left(\frac{2n_2}{n_2 + f n_1}\right)^{(n_1+n_2)/2}\cdot\frac{\Gamma((n_1+n_2)/2)}{\Gamma(n_1/2)\cdot\Gamma(n_2/2)}$$
$$= \frac{f^{(n_1-2)/2}\,n_1^{n_1/2}}{n_2^{n_1/2}}\cdot\left(\frac{n_2}{n_2 + f n_1}\right)^{(n_1+n_2)/2}\cdot\frac{\Gamma((n_1+n_2)/2)}{\Gamma(n_1/2)\cdot\Gamma(n_2/2)}$$
$$= f^{(n_1-2)/2}\cdot n_1^{n_1/2}\cdot n_2^{n_2/2}\cdot\left(n_2 + f n_1\right)^{-(n_1+n_2)/2}\cdot\frac{\Gamma((n_1+n_2)/2)}{\Gamma(n_1/2)\cdot\Gamma(n_2/2)}$$
$$= \frac{n_1^{n_1/2}\,n_2^{n_2/2}\,\Gamma((n_1+n_2)/2)}{\Gamma(n_1/2)\cdot\Gamma(n_2/2)}\cdot\frac{f^{n_1/2-1}}{(n_2 + f n_1)^{(n_1+n_2)/2}},$$
which is the p.d.f. of the $F$-distribution with $n_1$ and $n_2$ degrees of freedom;
* using the substitution $x = \left(\frac{1}{2} + \frac{f n_1}{2n_2}\right)g$, so that $g = \frac{2n_2}{n_2 + f n_1}x$ and $dx = \left(\frac{1}{2} + \frac{f n_1}{2n_2}\right)dg$;
** using $\Gamma(\alpha) = \int_0^{\infty}x^{\alpha-1}\exp(-x)\,dx$.
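The derived density can be checked against scipy's F-distribution with a short sketch (degrees of freedom are illustrative; the constant is computed in log space for stability):

```python
# Sketch: compare the density derived above with scipy.stats.f.pdf.
import numpy as np
from scipy import stats
from scipy.special import gammaln

def f_pdf(f, n1, n2):
    # c * f^{n1/2 - 1} / (n2 + f*n1)^{(n1+n2)/2}, with
    # c = n1^{n1/2} * n2^{n2/2} * Gamma((n1+n2)/2) / (Gamma(n1/2)*Gamma(n2/2))
    logc = (0.5 * n1 * np.log(n1) + 0.5 * n2 * np.log(n2)
            + gammaln((n1 + n2) / 2) - gammaln(n1 / 2) - gammaln(n2 / 2))
    return np.exp(logc + (n1 / 2 - 1) * np.log(f)
                  - (n1 + n2) / 2 * np.log(n2 + f * n1))

f = np.linspace(0.1, 5, 50)
print(np.max(np.abs(f_pdf(f, 5, 7) - stats.f.pdf(f, 5, 7))))  # ~0
```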
-End of Week 5 Tutorial Solutions-
© Katja Ignatieva, School of Risk and Actuarial Studies, ASB, UNSW