STAT:5100 (22S:193) Statistical Inference I
Homework Assignments
Luke Tierney
Fall 2015
Statistics STAT:5100 (22S:193), Fall 2015 Tierney
Assignment 1
Due on Monday, August 31, 2015.
1. For each of the following experiments, describe a reasonable sample space:
(a) Toss a coin four times.
(b) Count the number of insect-damaged leaves on a plant.
(c) Measure the lifetime (in hours) of a particular brand of light bulb.
(d) Three people arrive at an airport checkpoint. Two of the three are randomly chosen to complete a survey.
2. The set-theoretic difference A\B = A ∩ B^c is the set of all elements in A that are not in B. The symmetric difference A∆B = (A\B) ∪ (B\A) is the set of all elements in either A or B but not both. Verify the following identities:
(a) A\B = A\(A ∩ B)
(b) A∆B = A^c∆B^c
(c) A ∪ B = A ∪ (B\A)
(d) B = (B ∩ A) ∪ (B ∩ A^c)
3. Problem 1.4 in the textbook
4. Problem 1.5 in the textbook
5. Problem 1.13 in the textbook
Solutions
1. (a) Toss coin 4 times:
{(H,H,H,H), . . .} = {(x1, x2, x3, x4) : xi ∈ {H,T}}
or

{0, 1, 2, 3, 4}
(b) Count number of insect-damaged leaves:
{0, 1, . . . , N} with N = # leaves (or an upper bound); {0, 1, 2, . . .} if no upper bound is available
(c) Measure lifetime in hours:
{0, 1, 2, . . .} if rounded (an upper limit can be included); [0, ∞) if fractional hours are allowed
(d) Two out of three people chosen to complete a survey: Suppose the people are labeled A, B, and C. One possible sample space is the collection of all subsets of size 2 that can be chosen from the set {A,B,C}:
{{A,B}, {A,C}, {B,C}}.
Another possibility is the collection of all ordered pairs that can be formed:
{(A,B), (B,A), (A,C), (C,A), (B,C), (C,B)}.
2. (a) A\B is defined as A ∩Bc. To see that A\B = A\(A ∩B):
A\(A ∩ B) = A ∩ (A ∩ B)^c
          = A ∩ (A^c ∪ B^c)        De Morgan's law
          = (A ∩ A^c) ∪ (A ∩ B^c)  distributive law
          = ∅ ∪ (A ∩ B^c)
          = A\B
(b) A∆B = A^c∆B^c: For any two sets A and B,

A\B = A ∩ B^c
    = B^c ∩ A          commutative law
    = B^c ∩ (A^c)^c
    = B^c\A^c

So

A∆B = (A\B) ∪ (B\A) = (B^c\A^c) ∪ (A^c\B^c) = A^c∆B^c
(c) A ∪ B = A ∪ (B\A):

A ∪ B = A ∪ ((B ∩ A) ∪ (B ∩ A^c))   by part (d)
      = (A ∪ (B ∩ A)) ∪ (B ∩ A^c)   associative law
      = A ∪ (B ∩ A^c)               since B ∩ A ⊂ A
      = A ∪ (B\A)
(d) B = (B ∩ A) ∪ (B ∩ Ac):
(B ∩ A) ∪ (B ∩ A^c) = B ∩ (A ∪ A^c)   distributive law
                    = B ∩ S
                    = B
3. (a) A or B or both:
P (A ∪B) = P (A) + P (B)− P (A ∩B)
(b) A or B but not both:
A∆B = (A ∪ B)\(A ∩ B) = (A ∩ B^c) ∪ (B ∩ A^c)

and the sets A ∩ B^c and B ∩ A^c are disjoint. Since A ∩ B and A ∩ B^c are disjoint and their union is A,
P (A) = P (A ∩B) + P (A ∩Bc)
and therefore P(A ∩ B^c) = P(A) − P(A ∩ B). Similarly P(B ∩ A^c) = P(B) − P(A ∩ B), and therefore
P (A∆B) = P (A) + P (B)− 2P (A ∩B)
(c) At least one of A or B: A ∪ B.
(d) At most one of A or B = not both = (A ∩ B)^c:
P ((A ∩B)c) = 1− P (A ∩B)
4. The event A ∩ B ∩ C is the event that the birth results in identical twins who are female. The proportion of births satisfying this description is

P(A ∩ B ∩ C) = (1/90) × (1/3) × (1/2) = 1/540 ≈ 0.001852.
5. If A and B are disjoint then
P (A ∪B) = P (A) + P (B) ≤ 1,
but
P(A) + P(B) = P(A) + 1 − P(B^c) = 1/3 + 1 − 1/4 = 13/12 > 1,

so A and B cannot be disjoint if P(A) = 1/3 and P(B^c) = 1/4.
Another way to see that this is not possible: if A and B are disjoint then A ⊂ B^c and therefore P(A) ≤ P(B^c), but here P(A) = 1/3 > 1/4 = P(B^c).
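The arithmetic can be double-checked with exact rational arithmetic; this is a supplementary check in Python rather than the R used elsewhere in these solutions:

```python
from fractions import Fraction

# P(A) = 1/3 and P(B^c) = 1/4, so P(B) = 3/4.
pA = Fraction(1, 3)
pB = 1 - Fraction(1, 4)

# If A and B were disjoint, P(A) + P(B) would have to be at most 1.
total = pA + pB
print(total)        # 13/12
print(total > 1)    # True, so A and B cannot be disjoint
```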
Assignment 2
Due on Wednesday, September 9, 2015.
1. Problem 1.14 from the textbook
2. Suppose n balls are placed at random in n cells; cells can contain more than one ball.
(a) Show that the probability that exactly one cell remains empty is
n(n − 1) C(n, 2) (n − 2)!/n^n = C(n, 2) n!/n^n
(b) The R function defined as sim1 estimates this probability by simulation; use it to check the answer in part (a).
You can show it algebraically using the binomial theorem

(a + b)^n = ∑_{k=0}^{n} C(n, k) a^k b^{n−k},

taking a = b = 1.
You can also use one of several counting arguments. One such argument is inductive and starts with the number of subsets of the empty set, which is x_0 = 1. Now let x_n be the number of subsets of a set of n items A_n = {a_1, a_2, . . . , a_n}. A subset of A_{n+1} = {a_1, a_2, . . . , a_{n+1}} is either a subset of A_n or the union of a subset of A_n with {a_{n+1}}. There are x_n subsets of each type, so the number of subsets of A_{n+1} is x_{n+1} = 2x_n, and hence x_n = 2^n.
Another counting argument uses the fact that each subset of a set A_n = {a_1, . . . , a_n} corresponds to an ordered list (x_1, . . . , x_n) of n 0's and 1's, with

x_i = 1 if a_i is in the subset,
x_i = 0 if a_i is not in the subset.
For example, if A4 = {a1, a2, a3, a4}, then
(1, 0, 0, 1)↔ {a1, a4}
There are 2n such lists.
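The subset count can be confirmed by brute force for a small n (a supplementary Python check; the set {0, . . . , n−1} stands in for A_n):

```python
from itertools import combinations

# Count all subsets of an n-element set, by size, and compare with 2**n.
n = 6
items = list(range(n))
num_subsets = sum(1 for k in range(n + 1) for _ in combinations(items, k))
print(num_subsets)  # 64 = 2**6
```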
2. (a) n balls are assigned at random to n cells. S has n^n elements since multiple balls per cell are allowed.

Assume equally likely outcomes. Equivalently, assume balls are assigned independently.
Exactly one cell empty means:

• one cell is empty
• one cell contains two balls
• the other n − 2 cells contain one ball each

Choices:

• n for the empty cell
• n − 1 for the cell with two balls
• C(n, 2) for the balls to use for the two-ball cell
• (n − 2)! arrangements of the remaining balls in the remaining cells

So the number of ways to get exactly one empty cell is
n(n − 1) C(n, 2) (n − 2)! = C(n, 2) n!

The probability of this arrangement is

C(n, 2) n!/n^n
(b) We can define a function sim1empty that estimates the probability of exactly one empty cell by simulation using N simulation replicates; its estimates agree closely with the formula in part (a).
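For small n the probability in part (a) can also be verified exactly by enumerating all n^n equally likely assignments; this Python enumeration is a supplement to, not a reconstruction of, the R simulation:

```python
from itertools import product
from math import comb, factorial

n = 5
# Enumerate all n^n equally likely assignments of n balls to n cells and
# count those that leave exactly one cell empty (n - 1 distinct cells used).
count = sum(1 for cells in product(range(n), repeat=n)
            if n - len(set(cells)) == 1)

formula = comb(n, 2) * factorial(n)   # C(n,2) * n! favorable outcomes
print(count, formula)                 # 1200 1200
print(count / n ** n)                 # 0.384
```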
> f <- function(n) sum(choose(n, 0:n)^2)
> f(20)
[1] 137846528820
> choose(40, 20)
[1] 137846528820
The identity

∑_{k=0}^{n} C(n, k)^2 = C(2n, n)

can be verified using induction, but a counting argument is simpler: Consider a box containing n red and n blue balls and consider selecting a sample of n balls. There are C(2n, n) such samples. A sample has to contain some number k of red balls and n − k blue balls. The number of samples with k red and n − k blue balls is

C(n, k) C(n, n − k) = C(n, k)^2.

So the total number of samples of size n from a set of size 2n satisfies

C(2n, n) = ∑_{k=0}^{n} C(n, k) C(n, n − k) = ∑_{k=0}^{n} C(n, k)^2.
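Both the identity and the R output above can be confirmed with integer arithmetic (a supplementary Python check; math.comb plays the role of R's choose):

```python
from math import comb

# Vandermonde-type identity: sum_k C(n,k)^2 == C(2n,n).
for n in (5, 10, 20):
    assert sum(comb(n, k) ** 2 for k in range(n + 1)) == comb(2 * n, n)

print(comb(40, 20))  # 137846528820, matching the R output above
```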
Another approach to counting the number of outcomes in which the players have the same number of heads: We have 2n slots that are to be filled in with a head or a tail, the first n corresponding to player A and the second n to player B. Choose n of these slots; this can be done in C(2n, n) ways. Some number, say k, of the chosen slots will be in the first half. Make these heads, and the remaining n − k slots in the first half tails. For the n − k chosen slots in the second half, make those tails and the remaining k heads. Then the result contains k heads in the first half and k heads in the second half. Every assignment of heads and tails with the same number of heads in each half corresponds in this way to a unique selection of n out of 2n slots, so there are the same number of such assignments as there are ways to choose n slots out of 2n, C(2n, n).
5. There are 6^k possible outcomes for the k rolls. Outcomes with the m-th 6 on roll k must have a 6 on roll k (one choice) and m − 1 sixes in the first k − 1 rolls; there are C(k − 1, m − 1) ways to choose the positions for these sixes, and, given these positions, there are 5^{k−m} ways to choose the results for the remaining rolls. So the probability of the m-th 6 on roll k is

C(k − 1, m − 1) 5^{k−m}/6^k = C(k − 1, m − 1) (1/6)^m (1 − 1/6)^{k−m}.
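The two forms of the answer can be checked against each other with exact rational arithmetic, and the probabilities for a fixed m can be checked to sum to one over k (a supplementary Python check):

```python
from fractions import Fraction
from math import comb

# P(m-th six on roll k): counting form versus negative-binomial form.
def lhs(k, m):
    return Fraction(comb(k - 1, m - 1) * 5 ** (k - m), 6 ** k)

def rhs(k, m):
    return comb(k - 1, m - 1) * Fraction(1, 6) ** m * Fraction(5, 6) ** (k - m)

assert all(lhs(k, m) == rhs(k, m) for m in range(1, 5) for k in range(m, 30))

# For fixed m the probabilities sum to one over k = m, m+1, ...
partial = sum(lhs(k, 2) for k in range(2, 400))
print(float(partial))  # very close to 1
```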
Assignment 3
Due on Monday, September 14, 2015.
1. Problem 1.12 (b) from the textbook
2. Problem 1.24 from the textbook
3. Problem 1.34 from the textbook
4. Problem 1.36 from the textbook
5. An urn contains 11 balls numbered 0, 1, . . . , 10. A ball is selected at random. Suppose the number on the selected ball is k. A second urn is filled with k red balls and 10 − k blue balls. Five balls are selected at random with replacement from the second urn.

(a) Find the probability that the sample from the second urn consists of three red and two blue balls.

(b) Given that the sample from the second urn consists of three red and two blue balls, find the conditional probability that the ball selected from the first urn had the number k = 6.
Solutions
1. Let A_1, A_2, . . . be pairwise disjoint. For each i, let

B_i = ⋃_{j=i}^∞ A_j.

Then for each n,

⋃_{i=1}^∞ A_i = A_1 ∪ · · · ∪ A_n ∪ B_{n+1}.

Since A_1, A_2, . . . , A_n, B_{n+1} are pairwise disjoint,

P(⋃_{i=1}^∞ A_i) = ∑_{i=1}^n P(A_i) + P(B_{n+1})

for every n by finite additivity. But

B_1 ⊃ B_2 ⊃ · · ·

and ⋂ B_i = ∅. So by continuity, P(B_{n+1}) ↓ 0. So

∑_{i=1}^n P(A_i) = P(⋃_{i=1}^∞ A_i) − P(B_{n+1}) → P(⋃_{i=1}^∞ A_i)

and thus

∑_{i=1}^∞ P(A_i) = P(⋃_{i=1}^∞ A_i).

An alternative approach is to show, using finite additivity and the continuity axiom as stated, that for any events B_1, B_2, . . . satisfying B_1 ⊂ B_2 ⊂ B_3 ⊂ · · · the identity

P(⋃_{i=1}^∞ B_i) = lim_{n→∞} P(B_n)

holds. Then let

B_i = ⋃_{j=1}^i A_j.

These B_i satisfy B_1 ⊂ B_2 ⊂ B_3 ⊂ · · · and

⋃_{i=1}^∞ A_i = ⋃_{i=1}^∞ B_i.

By finite additivity

P(B_i) = ∑_{j=1}^i P(A_j),

and by the result just stated

∑_{j=1}^i P(A_j) = P(B_i) → P(⋃_{i=1}^∞ B_i) = P(⋃_{i=1}^∞ A_i).
2. Let E_i be the event that the first head appears on toss i and let A be the event that player A wins. Then

A = E_1 ∪ E_3 ∪ E_5 ∪ · · ·

The E_i are pairwise disjoint, and P(E_i) = (1 − p)^{i−1} p, so

P(A) = p + (1 − p)^2 p + (1 − p)^4 p + · · ·
     = p ∑_{k=1}^∞ (1 − p)^{2(k−1)}
     = p/(1 − (1 − p)^2)
     = p/(1 − (1 − 2p + p^2))
     = p/(2p − p^2)
     = 1/(2 − p)
So if p = 1/2 then P(A) = 2/3, and for all p

P(A) = 1/(2 − p) ≥ 1/2.
An alternative approach to deriving the formula for P(A) is to condition on the results of the first two tosses. For the first toss,

P(A) = p P(A|E_1) + (1 − p) P(A|E_1^c)
     = p + (1 − p) P(A|E_1^c).

For the second toss,

P(A|E_1^c) = p P(A|E_2 ∩ E_1^c) + (1 − p) P(A|E_2^c ∩ E_1^c)
           = (1 − p) P(A|E_2^c ∩ E_1^c).

Since the tosses are independent, the game starts over if the first two tosses are tails, i.e.

P(A|E_2^c ∩ E_1^c) = P(A).

So we have an equation in P(A):

P(A) = p + (1 − p)^2 P(A),

and the solution is

P(A) = 1/(2 − p) ≥ 1/2.
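The closed form can be checked numerically by truncating the series defining P(A) (a supplementary Python check, with the arbitrary choice p = 0.3):

```python
# P(A) is the probability that the first head appears on an odd-numbered toss.
p = 0.3
prob_A = sum((1 - p) ** (i - 1) * p for i in range(1, 2000, 2))
print(prob_A)  # close to 1/(2 - p) = 0.5882...
```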
3.

Litter I:  B, B, G        P(B|I) = 2/3
Litter II: B, B, B, G, G  P(B|II) = 3/5

P(I) = P(II) = 1/2 (choose litter at random).

(a) P(B) = P(B|I)P(I) + P(B|II)P(II) = (2/3)(1/2) + (3/5)(1/2) = 1/3 + 3/10 = 19/30

(b) P(I|B) = P(I ∩ B)/P(B) = ((2/3)(1/2))/(19/30) = (1/3)/(19/30) = 10/19
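The fractions can be verified exactly (a supplementary Python check):

```python
from fractions import Fraction

pB_I, pB_II = Fraction(2, 3), Fraction(3, 5)
pI = pII = Fraction(1, 2)

pB = pB_I * pI + pB_II * pII     # law of total probability
pI_given_B = pB_I * pI / pB      # Bayes' rule

print(pB)          # 19/30
print(pI_given_B)  # 10/19
```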
4. The probabilities of no hits and exactly one hit are:

P(not hit) = (4/5)^{10} ≈ 0.1074
P(hit once) = 10 × (1/5) × (4/5)^9 ≈ 0.2684
Therefore

P(at least twice) = 1 − P(not hit) − P(hit once)
                  = 1 − (4/5)^{10} − 10 × (1/5) × (4/5)^9
                  ≈ 0.6242

and

P(at least twice | at least once) = P(at least twice)/P(at least once)
                                  = (1 − (4/5)^{10} − 10 × (1/5) × (4/5)^9)/(1 − (4/5)^{10})
                                  ≈ 0.6993.

5. (a) Given that ball k is chosen from the first urn, the probability of choosing
three red and two blue balls from the second when sampling with replacement is the binomial probability

P(three red | ball k) = C(5, 3) (k/10)^3 (1 − k/10)^2.
The probability of choosing ball k from the first urn and three red balls from the second is therefore

P(three red and ball k) = P(three red | ball k) P(ball k)
                        = C(5, 3) (k/10)^3 (1 − k/10)^2 × (1/11),

and the unconditional probability of choosing three red balls from the second urn is

P(three red) = ∑_{k=0}^{10} P(three red and ball k)
             = ∑_{k=0}^{10} C(5, 3) (k/10)^3 (1 − k/10)^2 × (1/11)
             ≈ 0.1515
This can be computed in R as
> sum(dbinom(3, 5, (0 : 10) / 10) / 11)
[1] 0.1515
(b) The conditional probability that the chosen ball from the first urn was numbered k = 6, given that three red balls were chosen from the second,
is

P(ball k = 6 | three red) = P(three red and ball k = 6)/P(three red)
                          = [C(5, 3) (6/10)^3 (1 − 6/10)^2 × (1/11)]/P(three red)
                          ≈ 0.03142/0.1515
                          ≈ 0.2074
This can be computed as
> (dbinom(3, 5, 6 / 10) / 11) / (sum(dbinom(3, 5, (0 : 10) / 10)) / 11)
[1] 0.2073807
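The same computation can be done in Python, mirroring the dbinom calls above with math.comb (a supplementary check):

```python
from math import comb

# P(three red | ball k) * P(ball k), as in the dbinom computation above.
def joint(k):
    return comb(5, 3) * (k / 10) ** 3 * (1 - k / 10) ** 2 / 11

p_three_red = sum(joint(k) for k in range(11))
posterior_6 = joint(6) / p_three_red

print(round(p_three_red, 4))  # 0.1515
print(round(posterior_6, 4))  # 0.2074
```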
The conditional probabilities for all k can be computed and graphed in the same way.
Assignment 4
Due on Monday, September 21, 2015.
1. A coin has probability p of coming up heads and 1 − p of tails, with 0 < p < 1. An experiment is conducted with the following steps:
1. Flip the coin.
2. Flip the coin a second time.
3. If both flips land on heads or both land on tails return to step 1.
4. Otherwise let the result of the experiment be the result of the last flip at step 2.
Assume flips are independent.
(a) The R function sim1 simulates one run of this experiment; use it to estimate the probability that the result is heads for several values of p.
Solutions
1. (a) One possible approach:
> sapply(seq(0.2, 0.9, by = 0.2),
function(p) mean(replicate(10000, sim1(p))))
[1] 0.4913 0.4965 0.5034 0.4991
This suggests that the probability of heads may be 0.5 for any p.
(b) Let A be the event that the process returns a head, and let B be the event that the process ends after the first two flips. Then
P (A) = P (A ∩B) + P (A|Bc)P (Bc).
Now A ∩ B is the event that the first toss is a tail and the second toss is a head, so P(A ∩ B) = (1 − p)p. B is the event that either the first toss is a head and the second a tail, or the first is a tail and the second is a head; so P(B) = 2p(1 − p) and P(B^c) = 1 − 2p(1 − p). If the process does not end with the first two tosses then it starts over again independently, so P(A|B^c) = P(A). Therefore P(A) satisfies
P (A) = p(1− p) + P (A)(1− 2p(1− p))
and thus
P(A) = p(1 − p)/(2p(1 − p)) = 1/2,

as the simulation in part (a) suggests. The requirement that p > 0 and p < 1 ensures that the denominator is positive and that the process is guaranteed to end.
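The simulation in part (a) can be reproduced in Python (a sketch of the same experiment with an arbitrary p and a fixed seed; the R listing for sim1 is in the assignment):

```python
import random

random.seed(0)

def trial(p):
    # Flip pairs of coins until the two flips disagree; return the second flip.
    while True:
        a = random.random() < p
        b = random.random() < p
        if a != b:
            return b

p = 0.2
est = sum(trial(p) for _ in range(20000)) / 20000
print(est)  # close to 0.5 regardless of p
```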
2. (i) P_X(A) = P(X ∈ A) ≥ 0 since P is a probability.

(ii) P_X(R) = P(X ∈ R) = P(S) = 1 since P is a probability.
(iii) For Borel sets A_1, A_2, . . . that are pairwise disjoint,

P_X(⋃ A_i) = P(X ∈ ⋃ A_i) = P(⋃ {X ∈ A_i}) = ∑ P(X ∈ A_i) = ∑ P_X(A_i),

since the events B_i = {X ∈ A_i} are pairwise disjoint.
3. All functions in (a)–(d) are continuous and therefore right continuous. We therefore only need to check that they are nondecreasing and have the right limits at ±∞.
(a) For all x ∈ R

(d/dx)(1/2 + (1/π) tan⁻¹(x)) = (1/π) · 1/(1 + x^2) > 0

so F(x) is increasing, and

lim_{x→−∞} [1/2 + (1/π) tan⁻¹(x)] = 1/2 + (1/π)(−π/2) = 0
lim_{x→∞} [1/2 + (1/π) tan⁻¹(x)] = 1/2 + (1/π)(π/2) = 1.

So F(x) is a CDF.
(b) For all x ∈ R

(d/dx)(1 + e^{−x})^{−1} = e^{−x}/(1 + e^{−x})^2 > 0

so F(x) is increasing, and

lim_{x→−∞} (1 + e^{−x})^{−1} = (1 + ∞)^{−1} = 0
lim_{x→∞} (1 + e^{−x})^{−1} = (1 + 0)^{−1} = 1.

So F(x) is a CDF.
(c) For all x ∈ R

(d/dx) e^{−e^{−x}} = e^{−x} e^{−e^{−x}} > 0

so F(x) is increasing, and

lim_{x→−∞} e^{−e^{−x}} = e^{−∞} = 0
lim_{x→∞} e^{−e^{−x}} = e^0 = 1.

So F(x) is a CDF.
(d) F(x) = 0 for x ≤ 0. For x > 0

(d/dx)(1 − e^{−x}) = e^{−x} > 0

so F(x) is nondecreasing, and

lim_{x→∞} (1 − e^{−x}) = 1 − 0 = 1.

So F(x) is a CDF.
(e) The function is continuous everywhere except possibly at the origin, and at the origin it is right continuous because of the placement of the equality sign in the definition. The function is increasing for y < 0 and for y > 0 by part (b). The function value at the origin is

F(0) = ε + (1 − ε)/2 > (1 − ε)/2 = F(0−),

so F is increasing everywhere. Using the limit results from part (b),

lim_{y→−∞} F(y) = (1 − ε) × 0 = 0
lim_{y→∞} F(y) = ε + (1 − ε) × 1 = 1.

So F(y) is a CDF.
4. The set of possible values is X = {0, 1, 2, 3, 4}, and the PMF is given by
f_X(x) = C(5, x) C(25, 4 − x)/C(30, 4).

A table of the probabilities is

x   0       1       2       3       4
p   0.4616  0.4196  0.1095  0.0091  0.0002
and a plot of the CDF is
[Figure: "CDF for Number of Defectives", a step plot of the CDF with x running from −1 to 5 and the vertical axis from 0.0 to 1.0.]
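The table can be reproduced from the PMF with integer arithmetic (a supplementary Python check of the rounded values above):

```python
from math import comb

# Hypergeometric PMF for the number of defectives in a sample of 4
# from 30 items of which 5 are defective.
pmf = [comb(5, x) * comb(25, 4 - x) / comb(30, 4) for x in range(5)]
print([round(p, 4) for p in pmf])
# [0.4616, 0.4196, 0.1095, 0.0091, 0.0002]
```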
The R code used to create the table and plot the CDF is available at

http://www.stat.uiowa.edu/~luke/classes/193/1-51.R
5. The function

g(x) = f(x)/(1 − F(x_0)) for x ≥ x_0, and g(x) = 0 for x < x_0,

represents the conditional density of X given that X ≥ x_0. Since f(x) ≥ 0 we also have g(x) ≥ 0. Furthermore,

∫_{−∞}^∞ g(x) dx = ∫_{x_0}^∞ g(x) dx = (∫_{x_0}^∞ f(x) dx)/(1 − F(x_0))
                = P(X > x_0)/(1 − F(x_0)) = (1 − F(x_0))/(1 − F(x_0)) = 1.

So g(x) is a PDF.
Assignment 5
Due on Monday, September 28, 2015.
1. Let X be a non-negative, integer-valued random variable with probability mass function p_n = P(X = n) for n = 0, 1, . . . . The probability generating function of X is defined as

G(t) = ∑_{n=0}^∞ t^n p_n

for |t| ≤ 1.
(a) Show that pn can be recovered from the value of the n-th derivative of G(t)at t = 0. (The zero-th derivative of G(t) is G(t).)
(b) Suppose X is the number of heads in n independent flips of a biased coin with probability of heads equal to p. X has a binomial distribution. Find the probability generating function of X.
(c) Suppose Y is the number of independent tosses of a biased coin with probability p of heads needed until the first head is obtained. Y has a geometric distribution. Find the probability generating function of Y.
2. Problem 2.2 from the textbook
3. Problem 2.6 from the textbook
4. Problem 2.8 from the textbook
Solutions
1. (a) The derivatives are

G′(t) = ∑_{n=1}^∞ n t^{n−1} p_n
G″(t) = ∑_{n=2}^∞ n(n − 1) t^{n−2} p_n
...
G^{(k)}(t) = ∑_{n=k}^∞ (n!/(n − k)!) t^{n−k} p_n.
At t = 0 all terms except the first are zero, so

G(0) = p_0
G′(0) = p_1
G″(0) = 2p_2
...
G^{(k)}(0) = k! p_k.

So p_k = G^{(k)}(0)/k!. This is the reason G is called the probability generating function.
(b) For the binomial distribution

G(t) = ∑_{k=0}^n t^k C(n, k) p^k (1 − p)^{n−k} = ∑_{k=0}^n C(n, k) (tp)^k (1 − p)^{n−k} = (tp + 1 − p)^n

by the binomial theorem.
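The closed form can be checked against the defining sum with exact rational arithmetic (a supplementary Python check, for an arbitrary n, p, and a few values of t):

```python
from fractions import Fraction
from math import comb

# Binomial PGF: sum_k t^k C(n,k) p^k (1-p)^(n-k) == (t*p + 1 - p)^n.
n, p = 7, Fraction(2, 5)
for t in (Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3, 2)):
    series = sum(t ** k * comb(n, k) * p ** k * (1 - p) ** (n - k)
                 for k in range(n + 1))
    assert series == (t * p + 1 - p) ** n
print("ok")
```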
(c) For the geometric distribution

G(t) = ∑_{n=1}^∞ t^n p(1 − p)^{n−1} = tp ∑_{n=1}^∞ [t(1 − p)]^{n−1} = tp/(1 − t(1 − p)).
2. The change of variables formula for smooth monotone transformations can be applied in all three cases.
(a) Y = [0, 1], g^{−1}(y) = √y, and for y ∈ [0, 1]

f_Y(y) = 1/(2√y)
(b) Y = (0, ∞), g^{−1}(y) = e^{−y}, and for y > 0

f_Y(y) = ((n + m + 1)!/(n! m!)) e^{−yn}(1 − e^{−y})^m |−e^{−y}| = ((n + m + 1)!/(n! m!)) e^{−y(n+1)}(1 − e^{−y})^m
(c) Y = (1, ∞), g^{−1}(y) = log y, and for y > 1

f_Y(y) = (1/σ^2)(log y / y) e^{−((log y)/σ)^2/2}
3. All three fit into the framework of Theorem 2.1.8.
(a) Let A_1 = (−∞, 0) and A_2 = (0, ∞). On A_1, g_1(x) = |x|^3 = −x^3, and on A_2, g_2(x) = |x|^3 = x^3. The range is Y = (0, ∞). So for y > 0

f_Y(y) = (1/2) e^{−y^{1/3}} (1/3)|−y^{−2/3}| + (1/2) e^{−y^{1/3}} (1/3) y^{−2/3} = (1/3) y^{−2/3} e^{−y^{1/3}}
(b) Let A_1 = (−1, 0) and A_2 = (0, 1). The range of Y is Y = (0, 1). Then g_1(x) = 1 − x^2, g_2(x) = 1 − x^2, g_1^{−1}(y) = −√(1 − y), and g_2^{−1}(y) = √(1 − y). So for y ∈ (0, 1)

f_Y(y) = (3/8)(1 − √(1 − y))^2 (1/2)(1/√(1 − y)) + (3/8)(1 + √(1 − y))^2 (1/2)(1/√(1 − y))
       = (3/8)(1 − y)^{−1/2} + (3/8)(1 − y)^{1/2}
(c) Y, A_1, A_2, and g_1 are as in the previous part; g_2(x) = 1 − x and g_2^{−1}(y) = 1 − y. So for y ∈ (0, 1)

f_Y(y) = (3/8)(1 − √(1 − y))^2 (1/2)(1/√(1 − y)) + (3/8)(1 + 1 − y)^2
       = (3/16)(1 − √(1 − y))^2 (1 − y)^{−1/2} + (3/8)(2 − y)^2
4. F−1(y) = inf{x : F (x) ≥ y}
(a)

F(x) = 0 for x < 0, and F(x) = 1 − e^{−x} for x ≥ 0.

F^{−1}(y) = −∞ for y = 0, and F^{−1}(y) = −log(1 − y) for 0 < y ≤ 1.

(b)

F(x) =
    (1/2) e^x           x < 0
    1/2                 0 ≤ x < 1
    1 − (1/2) e^{1−x}   x ≥ 1

F^{−1}(y) =
    log(2y)             0 ≤ y ≤ 1/2
    1 − log(2(1 − y))   1/2 < y ≤ 1

(c)

F(x) =
    (1/4) e^x           x < 0
    1 − (1/4) e^{−x}    x ≥ 0

F^{−1}(y) =
    log(4y)             0 ≤ y < 1/4
    0                   1/4 ≤ y < 3/4
    −log(4(1 − y))      3/4 ≤ y ≤ 1
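The inverse in part (c) can be spot-checked against the CDF (a supplementary Python check; note that F(F^{−1}(y)) ≥ y, with equality off the flat piece of F):

```python
import math

# CDF and quantile function from part (c).
def F(x):
    return 0.25 * math.exp(x) if x < 0 else 1 - 0.25 * math.exp(-x)

def Finv(y):
    if y < 0.25:
        return math.log(4 * y)
    if y < 0.75:
        return 0.0
    return -math.log(4 * (1 - y))

# F(Finv(y)) >= y everywhere; equality holds away from the jump at 0.
for y in (0.1, 0.25, 0.5, 0.75, 0.9):
    assert F(Finv(y)) >= y - 1e-12
assert abs(F(Finv(0.1)) - 0.1) < 1e-12
```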
Assignment 6
Due on Monday, October 5, 2015.
1. Problem 2.11 from the textbook
2. Problem 2.13 from the textbook
3. Let X be a non-negative random variable with CDF F . Show that
E[X] = ∫_0^∞ (1 − F(t)) dt.
Hint: Argue that you can write X = ∫_0^∞ 1{t < X} dt and take the expected value of both sides.
Using the PDF of Y = X^2:

E[X^2] = E[Y] = ∫_0^∞ y (1/√(2π)) (1/√y) e^{−y/2} dy
       = (1/√(2π)) ∫_0^∞ √y e^{−y/2} dy
       = (2^{3/2}/√(2π)) ∫_0^∞ z^{3/2−1} e^{−z} dz     (substituting z = y/2)
       = (2^{3/2}/√(2π)) Γ(3/2) = (2/√π) Γ(3/2)
       = (2/√π) Γ(1/2)(1/2)
       = Γ(1/2)/√π = 1
(b) The density of Y = |X| is

f_Y(y) = f_X(−y) + f_X(y) = 2 e^{−y^2/2} (1/√(2π)) = √(2/π) e^{−y^2/2}

for y ≥ 0 and f_Y(y) = 0 for y < 0. The mean is

E[Y] = E[|X|] = 2 ∫_0^∞ y e^{−y^2/2} (1/√(2π)) dy
     = 2 [−(1/√(2π)) e^{−y^2/2}]_0^∞
     = 2/√(2π) = √(2/π).

The second noncentral moment is

E[Y^2] = E[X^2] = 1,

so the variance is

Var(Y) = 1 − 2/π.
2. The possible values of X are X = {1, 2, 3, . . . }. The probability mass function of X is

f_X(x) = P(X = x)
       = P(first x are H, followed by a T) + P(first x are T, followed by an H)
       = p^x(1 − p) + (1 − p)^x p
for x ∈ X. So the mean is

E[X] = ∑_{x=1}^∞ x (p^x(1 − p) + (1 − p)^x p)
     = p [∑_{x=1}^∞ x p^{x−1}(1 − p)] + (1 − p) [∑_{x=1}^∞ x (1 − p)^{x−1} p]

The sums in square brackets are the means of geometric random variables with success probabilities 1 − p and p, respectively, so

E[X] = p/(1 − p) + (1 − p)/p = (p^2 + (1 − p)^2)/(p(1 − p))
3. For a non-negative random variable X we can write

X = ∫_0^X 1 dt = ∫_0^∞ 1{t < X} dt.

Taking expected values on both sides and interchanging expectation and integration,

E[X] = ∫_0^∞ E[1{t < X}] dt = ∫_0^∞ P(X > t) dt = ∫_0^∞ (1 − F(t)) dt.

4. Using the result of the previous problem,

E[X] = ∫_0^∞ P(X > t) dt = a ∫_0^∞ e^{−λt} dt + (1 − a) ∫_0^∞ e^{−µt} dt
     = (a/λ) ∫_0^∞ λe^{−λt} dt + ((1 − a)/µ) ∫_0^∞ µe^{−µt} dt
     = a/λ + (1 − a)/µ
5. The n-th moment of this density is

E[X^n] = ∫_1^∞ x^n (α/x^{α+1}) dx = ∫_1^∞ α x^{n−α−1} dx
       = [α/(n − α)] x^{n−α} |_1^∞   for α ≠ n,  or  α log x |_1^∞   for α = n
       = ∞ for α ≤ n, and α/(α − n) for α > n.
So

E[X] = ∞ for α ≤ 1, and E[X] = α/(α − 1) for α > 1,

and

E[X^2] = ∞ for α ≤ 2, and E[X^2] = α/(α − 2) for α > 2.

The variance of X is therefore infinite if α ≤ 2, and is

Var(X) = E[X^2] − E[X]^2 = α/(α − 2) − (α/(α − 1))^2 = α/((α − 2)(α − 1)^2)

for α > 2.
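The algebraic simplification of the variance can be verified exactly for a few values of α (a supplementary Python check):

```python
from fractions import Fraction

# For alpha > 2: Var(X) = a/(a-2) - (a/(a-1))^2 == a/((a-2)(a-1)^2).
for a in (3, 4, 5, 10):
    var = Fraction(a, a - 2) - Fraction(a, a - 1) ** 2
    assert var == Fraction(a, (a - 2) * (a - 1) ** 2)
print("ok")
```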
Assignment 7
Due on Monday, October 12, 2015.
1. Problem 2.17 from the textbook
2. Problem 2.24 from the textbook
3. Problem 2.32 from the textbook
4. Problem 2.33 from the textbook
5. Problem 2.38 from the textbook
6. Problem 2.40 from the textbook
Solutions
1. (a) Over the range x ∈ [0, 1] the CDF

F(x) = ∫_0^x 3y^2 dy = x^3

is strictly increasing, so there is a unique median m that solves

F(m) = m^3 = 1/2.

The solution is m = (1/2)^{1/3} ≈ 0.7937.

(b) The density (a Cauchy density) is symmetric around the origin, so

P(X ≤ 0) = P(X ≥ 0) = 1/2

and therefore m = 0 is a median. Since the density is positive the CDF is strictly increasing and the median is unique.
2. (a) f(x) = a x^{a−1}, 0 < x < 1, a > 0.

E[X] = ∫_0^1 a x^a dx = a/(a + 1)

E[X^2] = ∫_0^1 a x^{a+1} dx = a/(a + 2)

Var(X) = a/(a + 2) − (a/(a + 1))^2 = a/((a + 2)(a + 1)^2)
(b) f(x) = 1/n, x = 1, 2, . . . , n.

E[X] = ∑_{i=1}^n i/n = (1/n) · n(n + 1)/2 = (n + 1)/2

E[X^2] = ∑_{i=1}^n i^2/n = (1/n) · n(n + 1)(2n + 1)/6 = (n + 1)(2n + 1)/6

Var(X) = (n + 1)(2n + 1)/6 − ((n + 1)/2)^2
       = (n + 1)[(2n + 1)/6 − (n + 1)/4]
       = (n + 1)(4n + 2 − 3n − 3)/12
       = (n + 1)(n − 1)/12 = (n^2 − 1)/12
(c) f(x) = (3/2)(x − 1)^2, 0 < x < 2.

E[X] = ∫_0^2 x (3/2)(x − 1)^2 dx = 1

Var(X) = E[(X − 1)^2] = ∫_0^2 (3/2)(x − 1)^4 dx
       = (3/2) (x − 1)^5/5 |_0^2
       = (3/2) × (2/5) = 3/5
3. The first derivative of S(t) is

S′(t) = (d/dt) log M_X(t) = M′_X(t)/M_X(t).

So

S′(t)|_{t=0} = M′_X(0)/M_X(0) = E[X]/1 = E[X].

By the quotient rule the second derivative of S(t) is

S″(t) = (M_X(t) M″_X(t) − M′_X(t)^2)/M_X(t)^2,

and therefore

S″(t)|_{t=0} = (M_X(0) M″_X(0) − M′_X(0)^2)/M_X(0)^2 = E[X^2] − E[X]^2 = Var(X).
4. (a) Done in class.
(b) Variation on a geometric.

M(t) = ∑_{x=0}^∞ e^{tx} p(1 − p)^x = p/(1 − e^t(1 − p)) for t < −log(1 − p), and ∞ otherwise.

M′(t) = p(1 − p)e^t/(1 − e^t(1 − p))^2

M″(t) = [(1 − e^t(1 − p))^2 p(1 − p)e^t + 2(1 − e^t(1 − p)) e^t p(1 − p)^2 e^t]/(1 − e^t(1 − p))^4

E[X] = M′(0) = (1 − p)/p

E[X^2] = M″(0) = (p^3(1 − p) + 2p^2(1 − p)^2)/p^4 = (p(1 − p) + 2(1 − p)^2)/p^2

Var(X) = E[X^2] − E[X]^2 = (p(1 − p) + (1 − p)^2)/p^2 = (1 − p)/p^2
(c)

M(t) = ∫ e^{tx} (1/(√(2π)σ)) e^{−(x−µ)^2/(2σ^2)} dx
     = ∫ (1/(√(2π)σ)) exp{−x^2/(2σ^2) + xµ/σ^2 − µ^2/(2σ^2) + tx} dx
     = ∫ (1/(√(2π)σ)) exp{−x^2/(2σ^2) + (x/σ^2)(µ + σ^2 t) − µ^2/(2σ^2)} dx
     = exp{−µ^2/(2σ^2) + (µ^2 + 2µσ^2 t + σ^4 t^2)/(2σ^2)}     (completing the square in x)
     = exp{µt + (1/2)σ^2 t^2}

K(t) = log M(t) = µt + (1/2)σ^2 t^2
K′(t) = µ + σ^2 t
K″(t) = σ^2
E[X] = K′(0) = µ
Var(X) = K″(0) = σ^2
5. (a) From 2.30(d),

M_X(t) = (p/(1 − e^t(1 − p)))^r for t < −log(1 − p), and ∞ otherwise.
(b)

M_Y(t) = E[e^{tY}] = E[e^{2ptX}] = M_X(2pt)
       = (p/(1 − e^{2pt}(1 − p)))^r for 2pt < −log(1 − p), and ∞ otherwise.

Now by L'Hospital's rule

lim_{p→0} (−log(1 − p))/(2p) = lim_{p→0} (1/(1 − p))/2 = 1/2

and

lim_{p→0} p/(1 − e^{2pt}(1 − p)) = 1/lim_{p→0} (−2t e^{2pt}(1 − p) + e^{2pt}) = 1/(1 − 2t).

So for t < 1/2,

M_Y(t) → (1/(1 − 2t))^r = (1/(1 − 2t))^{2r/2}.

This is the MGF of a χ^2_{2r} distribution.
6. The result holds only for x = 0, . . . , n − 1. For x = 0 the left hand side is (1 − p)^n and the right hand side is

n ∫_0^{1−p} t^{n−1} dt = t^n |_{t=0}^{t=1−p} = (1 − p)^n.

So the claim holds for x = 0. Suppose the claim is true for y = 0, . . . , x − 1 and x < n. Integration by parts produces
(n − x) C(n, x) ∫_0^{1−p} t^{n−x−1}(1 − t)^x dt
  = C(n, x) ∫_0^{1−p} ((n − x) t^{n−x−1})(1 − t)^x dt
  = C(n, x) [t^{n−x}(1 − t)^x |_{t=0}^{t=1−p} + ∫_0^{1−p} x t^{n−x}(1 − t)^{x−1} dt]
  = C(n, x) p^x(1 − p)^{n−x} + x C(n, x) ∫_0^{1−p} t^{n−(x−1)−1}(1 − t)^{x−1} dt
  = C(n, x) p^x(1 − p)^{n−x} + (n − (x − 1)) C(n, x − 1) ∫_0^{1−p} t^{n−(x−1)−1}(1 − t)^{x−1} dt
By the induction hypothesis the second term is

(n − (x − 1)) C(n, x − 1) ∫_0^{1−p} t^{n−(x−1)−1}(1 − t)^{x−1} dt = ∑_{k=0}^{x−1} C(n, k) p^k(1 − p)^{n−k}.

Thus the result holds for all x = 0, . . . , n − 1. Several other approaches are possible.
• Differentiating both sides produces a telescoping series on the left hand side.
• Both sides are polynomials of degree n in p. The polynomials are equal if and only if the coefficients of the powers of p are equal, and these can be calculated by differentiating multiple times and evaluating the derivatives at zero.
• The simplest approach uses a property of the distribution of order statistics that we will learn about in Chapter 5: If U_1, . . . , U_n are independent standard uniforms and N_p is the number of these uniforms that are less than or equal to p, then N_p is Binomial(n, p), and

P(N_p ≤ x) = P(N_p < x + 1) = P(U_{(x+1)} > p)

where U_{(k)} is the k-th order statistic of the sample. Now

P(U_{(x+1)} > p) = P(1 − U_{(x+1)} < 1 − p) = P(V_{(n−x)} < 1 − p)

where V_{(k)} is the k-th order statistic of the sample 1 − U_1, . . . , 1 − U_n. The result now follows from the fact that the k-th uniform order statistic V_{(k)} for a sample of size n has a Beta(k, n − k + 1) distribution.
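The identity itself can be verified exactly for particular n, x, and p by integrating the polynomial term by term (a supplementary Python check):

```python
from fractions import Fraction
from math import comb

# Check sum_{k<=x} C(n,k) p^k (1-p)^(n-k)
#   == (n-x) C(n,x) * integral_0^{1-p} t^(n-x-1) (1-t)^x dt  exactly.
n, x = 5, 2
p = Fraction(3, 10)

lhs = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

# Expand (1-t)^x = sum_j C(x,j)(-t)^j and integrate each power of t exactly.
u = 1 - p
integral = sum(comb(x, j) * (-1) ** j * u ** (n - x + j) / (n - x + j)
               for j in range(x + 1))
rhs = (n - x) * comb(n, x) * integral

assert lhs == rhs
print(float(lhs))  # 0.83692
```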
Assignment 8
Due on Monday, October 19, 2015.
1. Problem 3.7 from the textbook
2. Problem 3.12 from the textbook
3. Problem 3.25 from the textbook
4. Problem 3.26 from the textbook
5. Problem 3.28 from the textbook
6. Problem 3.30 from the textbook
Solutions
1. P (X ≥ 2) = 0.99 means
P (X = 0) + P (X = 1) = e−λ + λe−λ = 0.01
The solution is around 6.6. This can be determined graphically or numerically, for example using the R function uniroot.
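A bisection search, playing the role of uniroot, confirms the root (a supplementary Python sketch):

```python
import math

# Solve exp(-lam) * (1 + lam) = 0.01; g is strictly decreasing for lam > 0.
def g(lam):
    return math.exp(-lam) * (1 + lam) - 0.01

lo, hi = 1.0, 20.0
for _ in range(60):
    mid = (lo + hi) / 2
    if g(mid) > 0:
        lo = mid
    else:
        hi = mid

print(round(lo, 2))  # 6.64
```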
2. X is Binomial(n, p) and Y is negative binomial(r, p) (zero based: Y counts the number of failures).

F_X(r − 1) = P(r − 1 or fewer successes in n trials)
           = 1 − P(r or more successes in n trials)
           = 1 − P(r-th success on or before n-th trial)
           = 1 − P(number of failures before r-th success ≤ n − r)
           = 1 − F_Y(n − r)
3.

h_T(t) = lim_{δ↓0} P(t ≤ T < t + δ | T > t)/δ
       = lim_{δ↓0} (1/δ) · (F(t + δ) − F(t))/(1 − F(t))
       = F′(t)/(1 − F(t)) = f(t)/(1 − F(t))
and

−(d/dt) log(1 − F(t)) = f(t)/(1 − F(t)).

The quantity

H_T(t) = −log(1 − F_T(t)) = ∫_0^t h_T(u) du

is called the cumulative hazard function, and

F_T(t) = 1 − exp{−H_T(t)}.
4. (a) For an Exponential(β) distribution

f_T(t) = (1/β) e^{−t/β}

h_T(t) = [(1/β) e^{−t/β}]/e^{−t/β} = 1/β

(b) For a Weibull distribution,

f_T(t) = (γ/β) t^{γ−1} e^{−t^γ/β}
F_T(t) = P(X^{1/γ} ≤ t) = P(X ≤ t^γ) = 1 − e^{−t^γ/β}
h_T(t) = (γ/β) t^{γ−1}

(c) For the logistic distribution,

F_T(t) = 1/(1 + e^{−(t−µ)/β})
f_T(t) = [1/(1 + e^{−(t−µ)/β})^2] (1/β) e^{−(t−µ)/β} = (1/β) F_T(t)(1 − F_T(t)).

So h_T(t) = (1/β) F_T(t).
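The constant hazard of the exponential in part (a) can be confirmed numerically (a supplementary Python check with the arbitrary choice β = 2):

```python
import math

beta = 2.0

def f(t):
    return math.exp(-t / beta) / beta

def F(t):
    return 1 - math.exp(-t / beta)

# Hazard f(t)/(1 - F(t)) should be 1/beta at every t.
hazards = [f(t) / (1 - F(t)) for t in (0.5, 1.0, 3.0, 7.0)]
print(hazards)  # each value equals 0.5 = 1/beta
```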
5. (a) The normal family can be written as

f(x|µ, σ) = (1/(√(2π)σ)) exp{−(x − µ)^2/(2σ^2)}
          = [(1/(√(2π)σ)) e^{−µ^2/(2σ^2)}] exp{(−x^2)(1/(2σ^2)) + x(µ/σ^2)} · 1,

which is in exponential family form with c(θ) = (1/(√(2π)σ)) e^{−µ^2/(2σ^2)}, t_1(x) = −x^2, w_1(θ) = 1/(2σ^2), t_2(x) = x, w_2(θ) = µ/σ^2, and h(x) = 1.
(b) The Gamma family with both parameters unknown can be written as

f(x|α, β) = (1/(Γ(α)β^α)) x^{α−1} e^{−x/β} 1_{(0,∞)}(x)
          = (1/(Γ(α)β^α)) exp{(α − 1) log x + (−x)(1/β)} 1_{(0,∞)}(x),

with c(θ) = 1/(Γ(α)β^α), w_1(θ) = α − 1, t_1(x) = log x, t_2(x) = −x, w_2(θ) = 1/β, and h(x) = 1_{(0,∞)}(x). If α is known then the first term in the exponent becomes part of h(x); if β is known the second term in the exponent becomes part of h(x).
(c) The Beta family with both parameters unknown can be written as

f(x|α, β) = (Γ(α + β)/(Γ(α)Γ(β))) x^{α−1}(1 − x)^{β−1} 1_{[0,1]}(x)
          = (Γ(α + β)/(Γ(α)Γ(β))) exp{(α − 1) log x + (β − 1) log(1 − x)} 1_{[0,1]}(x),

with c(θ) = Γ(α + β)/(Γ(α)Γ(β)), w_1(θ) = α − 1, t_1(x) = log x, w_2(θ) = β − 1, t_2(x) = log(1 − x), and h(x) = 1_{[0,1]}(x). Again, if either α or β is known the corresponding term in the exponent becomes part of h(x).
(d) The Poisson family can be written as

f(x|λ) = (λ^x/x!) e^{−λ} = (1/x!) e^{−λ} exp{x log λ},

with h(x) = 1/x!, c(λ) = e^{−λ}, t(x) = x, and w(λ) = log λ.
(e) The negative binomial family with r known can be written as

f(x|r, p) = C(r + x − 1, x) p^r(1 − p)^x
          = C(r + x − 1, x) p^r exp{x log(1 − p)},

with h(x) = C(r + x − 1, x), c(p) = p^r, t(x) = x, and w(p) = log(1 − p).
6. (a) For the binomial, w(p) = log(p/(1 − p)), c(p) = (1 − p)^n, and t(x) = x. The variance Var(t(X)) = Var(X) satisfies

(w′(p))^2 Var(X) = −(d^2/dp^2) log c(p) − w″(p) E[X].

Now

w′(p) = 1/p + 1/(1 − p) = 1/(p(1 − p))
w″(p) = −1/p^2 + 1/(1 − p)^2
(d^2/dp^2) log c(p) = −n/(1 − p)^2.

So

(1/(p(1 − p)))^2 Var(X) = n/(1 − p)^2 − (−1/p^2 + 1/(1 − p)^2) np
                        = n (1/(1 − p) + 1/p)
                        = n/(p(1 − p)),

and thus Var(X) = np(1 − p).

(b) For the Beta distribution t_1(x) = log x and t_2(x) = log(1 − x). The function f(x) = x cannot be expressed as a linear combination of t_1(x) and t_2(x), so the identities in Theorem 3.4.2 cannot be used to find the mean and variance of X.

If X ∼ Poisson(λ) then t(x) = x, w(λ) = log λ, and c(λ) = e^{−λ}. So

w′(λ) = 1/λ,   w″(λ) = −1/λ^2
(∂/∂λ) log c(λ) = −1,   (∂^2/∂λ^2) log c(λ) = 0.

So Theorem 3.4.2 produces the equations

E[X/λ] = 1
Var(X/λ) = E[X/λ^2]

with solutions E[X] = λ and Var(X) = λ.
Assignment 9
Due on Monday, October 26, 2015.
1. Problem 4.1 from the textbook
2. Problem 4.4 from the textbook
3. Problem 4.5 from the textbook
4. Problem 4.10 from the textbook
5. Let X_1, X_2, and V be independent random variables with

E[X_1] = µ,  E[X_2] = µ,  E[V] = 0
Var(X_1) = σ^2,  Var(X_2) = σ^2,  Var(V) = τ^2.

Let Y_1 = X_1 + V and Y_2 = X_2 + V.

(a) Find the means and variances of Y_1 and Y_2.
(b) Find Cov(Y1, Y2) and Cov(Y1, V ).
6. In the generalized birthday problem discussed in Week 2 an urn contains m balls and a sample of size n is drawn from the urn with replacement. Let X be the number of balls that do not appear in the sample. Find the mean and variance of X. [Hint: Express X as a sum of suitable Bernoulli random variables.]
Solutions
1. The joint density is
f(x, y) = 1/4 for −1 ≤ x, y ≤ 1, and 0 otherwise.
(a) Since the unit circle is contained in the supporting square,

P(X^2 + Y^2 < 1) = (area of circle)/(total area of square) = π/4
(b) The line y = 2x splits the support into two parts of equal area, so P(2X − Y > 0) = 1/2. Alternatively,

P(Y < 2X) = ∫_{−1}^1 ∫_{y/2}^1 (1/4) dx dy = ∫_{−1}^1 (1/4)(1 − y/2) dy
          = [y/4 − y^2/16]_{−1}^1 = 1/4 + 1/4 = 1/2
(c) All points in the interior of the square satisfy |x + y| < 2, so

P(|X + Y| < 2) = 1.
2. (a) The integral of the density is

1 = ∫_0^1 ∫_0^2 C(x + 2y) dx dy = ∫_0^1 C(2 + 4y) dy = C(2 + 2) = 4C.

So the normalizing constant is C = 1/4.
(b) The marginal density of X is

f_X(x) = [∫_0^1 (1/4)(x + 2y) dy] 1_{[0,2]}(x) = (1/4)(x + 1) 1_{[0,2]}(x)
(c) The joint CDF is

F(x, y) = ∫_0^y ∫_0^x (1/4)(u + 2v) du dv
        = ∫_0^y (1/4)(x^2/2 + 2vx) dv
        = (1/4)(x^2 y/2 + y^2 x)
        = (1/8) x^2 y + (1/4) y^2 x

for 0 < x < 2 and 0 < y < 1.
(d) Since Z depends only on X this is a one-dimensional transformation. The transformation is smooth and monotone, with

Z = 9/(X + 1)^2
X = 3/√Z − 1
|dx/dz| = (3/2) z^{−3/2},

so

f_Z(z) = (1/4)(3/√z)(3/2) z^{−3/2} = (9/8) z^{−2}

for z ∈ [1, 9].
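As a quick check that f_Z is a density (a supplementary Python check): ∫_1^9 (9/8) z^{−2} dz = (9/8)(1 − 1/9) = 1.

```python
from fractions import Fraction

# Antiderivative of (9/8) z^(-2) is -(9/8)/z; evaluate from 1 to 9.
total = Fraction(9, 8) * (1 - Fraction(1, 9))
print(total)  # 1
```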
3. (a)

P(X > √Y) = P(Y < X^2) = ∫_0^1 ∫_0^{x^2} (x + y) dy dx
          = ∫_0^1 (x^3 + x^4/2) dx = 1/4 + 1/10 = 7/20 = 0.35
(b)

P(X^2 < Y < X) = ∫_0^1 ∫_{x^2}^x 2x dy dx
              = ∫_0^1 2x(x − x^2) dx = ∫_0^1 (2x^2 − 2x^3) dx
              = 2/3 − 2/4 = 1/6
4. (a) The marginal probabilities f_X(2) and f_Y(3) are non-zero and therefore f_X(2) f_Y(3) is non-zero, but the joint probability f_{X,Y}(2, 3) = 0. So the joint PMF is not the product of the marginals and thus X, Y are dependent.
(b) The marginals are

x        1    2    3
f_X(x)   1/4  1/2  1/4

and

y        2    3    4
f_Y(y)   1/3  1/3  1/3

The joint probability table

         x = 1   x = 2   x = 3
y = 2    1/12    1/6     1/12
y = 3    1/12    1/6     1/12
y = 4    1/12    1/6     1/12

obtained as g_{X,Y}(x, y) = f_X(x) f_Y(y) has the same marginals, and for it X and Y are independent.
5. (a) The means are

E[Y_i] = E[X_i + V] = E[X_i] + E[V] = µ.

Since the X_i are independent of V the variances are

Var(Y_i) = Var(X_i + V) = Var(X_i) + Var(V) = σ^2 + τ^2.
(b) The covariance of Y1 and Y2 is
Cov(Y1, Y2) = Cov(X1 + V,X2 + V )
= Cov(X1, X2 + V ) + Cov(V,X2 + V )
= Cov(X1, X2) + Cov(X1, V ) + Cov(V,X2) + Cov(V, V )
= 0 + 0 + 0 + Var(V ) = τ 2.
The covariance of Yi and V is
Cov(Yi, V ) = Cov(Xi + V, V )
= Cov(Xi, V ) + Cov(V, V )
= 0 + Var(V ) = τ 2.
6. Let Y_i = 1 if ball i is not in the sample and Y_i = 0 otherwise. Then X = Y_1 + · · · + Y_m. The Y_i are Bernoulli random variables with success probability

    p_m = ((m − 1)/m)^n.

So the mean of X is

    E[X] = m p_m = m((m − 1)/m)^n.
The Y_i are correlated, so to calculate Var(X) we need their covariances. Now for i ≠ j

    E[Y_i Y_j] = E[Y_1 Y_2] = P(balls 1 and 2 are not in the sample) = ((m − 2)/m)^n.

So

    Cov(Y_i, Y_j) = ((m − 2)/m)^n − p_m² = ((m − 2)/m)^n − ((m − 1)/m)^{2n}.
The variance of X is therefore

    Var(X) = Σ Var(Y_i) + ΣΣ_{i≠j} Cov(Y_i, Y_j)
           = m p_m(1 − p_m) + m(m − 1)( ((m − 2)/m)^n − p_m² ).
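These formulas can be checked by simulation: draw n balls with replacement from m and count how many of the m are never selected. The values of m and n below are illustrative:

```python
import random

# Monte Carlo check of E[X] and Var(X) for X = number of balls never selected
# in n draws with replacement from m balls (m, n are illustrative values).
random.seed(1)
m, n, reps = 10, 12, 100_000
counts = []
for _ in range(reps):
    seen = [False] * m
    for _ in range(n):
        seen[random.randrange(m)] = True
    counts.append(m - sum(seen))
mean_hat = sum(counts) / reps
var_hat = sum((c - mean_hat) ** 2 for c in counts) / reps

pm = ((m - 1) / m) ** n
mean_exact = m * pm
var_exact = m * pm * (1 - pm) + m * (m - 1) * (((m - 2) / m) ** n - pm ** 2)
```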
Assignment 10
Due on Monday, November 2, 2015.
1. Problem 4.15 from the textbook
2. Problem 4.16 (a) and (c) from the textbook (these geometrics count failures!)
3. Problem 4.17 from the textbook
4. Problem 4.21 from the textbook
5. Problem 4.27 from the textbook
6. Problem 4.36 from the textbook
Solutions
1. X, Y are independent,

    X ∼ Poisson(θ)   M_X(t) = exp{θ(e^t − 1)}
    Y ∼ Poisson(λ)   M_Y(t) = exp{λ(e^t − 1)}

so

    M_{X+Y}(t) = exp{(θ + λ)(e^t − 1)}

and thus X + Y is Poisson(θ + λ). Now
    f_{X|X+Y}(x|z) = f_X(x)f_Y(z − x) / f_{X+Y}(z)
                   = [ (θ^x/x!) e^{−θ} · (λ^{z−x}/(z − x)!) e^{−λ} ] / [ ((θ + λ)^z/z!) e^{−(θ+λ)} ]
                   = (z!/(x!(z − x)!)) (θ/(θ + λ))^x (1 − θ/(θ + λ))^{z−x}

for 0 ≤ x ≤ z. So X | X + Y = z is Binomial(z, θ/(θ + λ)).
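A quick numerical confirmation that this conditional PMF matches the Binomial(z, θ/(θ + λ)) PMF, with illustrative values of θ, λ, and z:

```python
import math

# Verify f_{X|X+Y}(x|z) = Binomial(z, theta/(theta+lam)) PMF numerically
# (theta, lam, z are illustrative values).
theta, lam, z = 2.0, 3.0, 7
p = theta / (theta + lam)
errs = []
for x in range(z + 1):
    joint = (theta ** x / math.factorial(x) * math.exp(-theta)
             * lam ** (z - x) / math.factorial(z - x) * math.exp(-lam))
    marg = (theta + lam) ** z / math.factorial(z) * math.exp(-(theta + lam))
    binom = math.comb(z, x) * p ** x * (1 - p) ** (z - x)
    errs.append(abs(joint / marg - binom))
max_err = max(errs)
```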
2. X, Y are geometric, starting at zero (counting failures), with common PMF

    f(x) = p(1 − p)^x

for x = 0, 1, 2, . . ., and are independent.
(a)

    U = min(X, Y)   range: 0, 1, . . .
    V = X − Y       range: all integers
The joint PMF of U, V is

    f_{U,V}(u, v) = P(min(X, Y) = u, X − Y = v)
                  = { P(Y = u, X = u + v)   if v ≥ 0
                    { P(X = u, Y = u − v)   if v < 0
                  = { p²(1 − p)^u (1 − p)^{u+v}   v ≥ 0
                    { p²(1 − p)^u (1 − p)^{u−v}   v < 0
                  = [ p²(1 − p)^{2u} ] [ (1 − p)^{|v|} ]

So U, V are independent.
(b) Z takes on all possible rational values in [0, 1]. Let q be a rational in [0, 1] and write q = m/n where m and n have no common factors. Then for m > 0

    P(Z = q) = P(Z = m/n)
             = P(X = mk, Y = (n − m)k for some k ≥ 1)
             = Σ_{k=1}^∞ p²(1 − p)^{mk}(1 − p)^{(n−m)k} = Σ_{k=1}^∞ p²(1 − p)^{nk}
             = p²(1 − p)^n / (1 − (1 − p)^n).

For m = 0, P(Z = 0) = P(X = 0) = p.
(c) Let Z = X + Y. For x = 0, 1, . . . and z = x, x + 1, . . . the joint PMF of X and Z is

    f(x, z) = P(X = x, Y = z − x) = p(1 − p)^x p(1 − p)^{z−x} = p²(1 − p)^z.

For all other (x, z) pairs f(x, z) = 0.
3. (a) For y = 1, 2, . . .,

    f_Y(y) = P(y − 1 < X < y) = e^{−(y−1)} − e^{−y} = e^{−(y−1)}(1 − e^{−1}).

So Y is geometric(p = 1 − e^{−1}).

(b)

    P(X − 4 > x | Y ≥ 5) = P(X − 4 > x | X ≥ 4)
                         = { 1                    x ≤ 0
                           { e^{−(x+4)}/e^{−4}    x > 0
                         = { 1        x ≤ 0
                           { e^{−x}   x > 0

This is an exponential distribution. For any t, X − t | X ≥ t is Exponential(1).
4. R² ∼ χ²₂ = Gamma(1, 2) = Exponential(2) and θ ∼ Uniform(0, 2π). Let

    X = √R² cos θ
    Y = √R² sin θ

with A = (0, ∞) × (0, 2π) and B = ℝ². The joint density of R², θ is

    f_{R²,θ}(a, b) = (1/2)e^{−a/2} · 1/(2π)

for (a, b) ∈ A. The inverse transformation is

    R² = X² + Y²
    θ = { cos⁻¹(X/√(X² + Y²))        Y > 0
        { 2π − cos⁻¹(X/√(X² + Y²))   Y < 0

This is messy to differentiate; instead, compute

    J⁻¹ = det ( (1/(2√R²)) cos θ   −√R² sin θ )
              ( (1/(2√R²)) sin θ    √R² cos θ )
        = (1/2)cos²θ + (1/2)sin²θ = 1/2.

So J = 2, and

    f_{X,Y}(x, y) = (1/(2π)) e^{−(x²+y²)/2}.

Thus X, Y are independent standard normal variables.
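This is the inverse of the Box-Muller construction, and a seeded simulation confirms that X comes out approximately standard normal (loose tolerances, illustrative sample size):

```python
import math
import random

# Simulate R^2 ~ Exponential(mean 2) and theta ~ Uniform(0, 2*pi) and check the
# first two moments of X = sqrt(R^2) * cos(theta) against N(0, 1).
random.seed(2)
N = 200_000
xs = []
for _ in range(N):
    r2 = random.expovariate(0.5)            # Exponential with mean 2 = chi^2_2
    th = random.uniform(0.0, 2 * math.pi)
    xs.append(math.sqrt(r2) * math.cos(th))
mean_x = sum(xs) / N
var_x = sum(x * x for x in xs) / N - mean_x ** 2
```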
5. Approach from class: Let Z₁, Z₂ be independent standard normals and let

    X = μ + σZ₁
    Y = γ + σZ₂.

Then

    U = X + Y = μ + γ + σZ₁ + σZ₂
    V = X − Y = μ − γ + σZ₁ − σZ₂.

So, with B the coefficient matrix of (Z₁, Z₂) in (U, V), the covariance matrix is

    C = BBᵀ = ( 2σ²   0  )
              (  0   2σ² )
and

    f_{U,V}(u, v) = (1/(2π · 2σ²)) exp{ −(u − (μ + γ))²/(4σ²) − (v − (μ − γ))²/(4σ²) }
                  = f_U(u) f_V(v)

where U ∼ N(μ + γ, 2σ²), V ∼ N(μ − γ, 2σ²), and U, V are independent.
6. (a) See (b).

(b) Suppose the P_i are independent random variables with values in the unit interval and common mean μ. Since the P_i are independent and each X_i depends only on P_i, the X_i are marginally independent as well. Each X_i takes on only the values 0 and 1, so the marginal distributions of the X_i are Bernoulli with success probability

    P(X_i = 1) = E[P(X_i = 1|P_i)] = E[P_i] = μ.

So the X_i are independent Bernoulli(μ) random variables and therefore Y = Σ_{i=1}^n X_i is Binomial(n, μ). If the P_i have a Beta(α, β) distribution then μ = α/(α + β) and therefore

    E[Y] = nμ = nα/(α + β)
    Var(Y) = nμ(1 − μ) = nαβ/(α + β)².
(c) For each i = 1, . . . , k

    E[X_i] = E[E[X_i|P_i]] = E[n_i P_i] = n_i E[P_i] = n_i α/(α + β)

    Var(X_i) = E[Var(X_i|P_i)] + Var(E[X_i|P_i])
             = E[n_i P_i(1 − P_i)] + Var(n_i P_i)
             = n_i E[P_i(1 − P_i)] + n_i² Var(P_i)
             = n_i ∫₀¹ (Γ(α + β)/(Γ(α)Γ(β))) p^{α+1−1}(1 − p)^{β+1−1} dp + n_i² αβ/((α + β)²(α + β + 1))
             = n_i (Γ(α + β)/(Γ(α)Γ(β))) (Γ(α + 1)Γ(β + 1)/Γ(α + β + 2)) + n_i² αβ/((α + β)²(α + β + 1))
             = n_i αβ/((α + β)(α + β + 1)) + n_i² αβ/((α + β)²(α + β + 1))
             = ( n_i αβ/((α + β)(α + β + 1)) ) (1 + n_i/(α + β))
             = n_i αβ(α + β + n_i)/((α + β)²(α + β + 1))
Again the X_i are marginally independent, so

    E[Y] = Σ E[X_i] = (α/(α + β)) Σ_{i=1}^k n_i

    Var(Y) = Σ Var(X_i) = Σ_{i=1}^k n_i αβ(α + β + n_i)/((α + β)²(α + β + 1))
The marginal distribution of X_i is called a beta-binomial distribution. The density of P_i is

    f_P(p) = (Γ(α + β)/(Γ(α)Γ(β))) p^{α−1}(1 − p)^{β−1}

for 0 < p < 1. So the PMF of X_i is

    P(X_i = x) = E[P(X_i = x|P_i)] = E[ (n_i choose x) P_i^x (1 − P_i)^{n_i−x} ]
               = ∫₀¹ (n_i choose x) p^x(1 − p)^{n_i−x} (Γ(α + β)/(Γ(α)Γ(β))) p^{α−1}(1 − p)^{β−1} dp
               = (n_i choose x) (Γ(α + β)/(Γ(α)Γ(β))) (Γ(α + x)Γ(β + n_i − x)/Γ(α + β + n_i))
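The beta-binomial PMF can be checked numerically: it should sum to 1 and have mean n_i α/(α + β). A sketch with illustrative parameter values:

```python
import math

# Check that the beta-binomial PMF sums to 1 and has mean n*alpha/(alpha+beta)
# (alpha, beta, n are illustrative values).
alpha, beta, n = 2.0, 5.0, 10

def pmf(x):
    return (math.comb(n, x)
            * math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
            * math.gamma(alpha + x) * math.gamma(beta + n - x)
            / math.gamma(alpha + beta + n))

total = sum(pmf(x) for x in range(n + 1))
mean = sum(x * pmf(x) for x in range(n + 1))
```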
Assignment 11
Due on Monday, November 9, 2015.
1. Problem 4.28 (a) and (b) from the textbook
2. Problem 4.30 from the textbook
3. Problem 4.39 from the textbook
4. Problem 4.40 from the textbook
Solutions
1. (a)

    U = X/(X + Y)   A = ℝ²
    V = X + Y       B = ℝ²

    X = UV
    Y = V − UV = (1 − U)V

So

    |J(u, v)| = | det (  v      u   ) | = |v(1 − u) + uv| = |v|.
                      ( −v    1 − u )

Thus

    f_{U,V}(u, v) = f_{X,Y}(uv, (1 − u)v)|v| = (1/(2π)) e^{−u²v²/2 − (1−u)²v²/2} |v|

and

    f_U(u) = ∫ f_{U,V}(u, v) dv
           = ∫_{−∞}^∞ (1/(2π)) e^{−u²v²/2 − (1−u)²v²/2} |v| dv
           = 2 ∫₀^∞ (1/(2π)) exp{ −(v²/2)(1 + 2u² − 2u) } v dv
           = 1/(π(1 + 2u² − 2u)) = 1/(π(1/2 + 2(u − 1/2)²)) = 2/(π(1 + 4(u − 1/2)²)).

This is a Cauchy(1/2, 1/2) density.
(b)

    U = X/|Y|   X = U|V|
    V = Y       Y = V

with A = B = ℝ². So

    |J(u, v)| = | det ( |v|   ±u ) | = |v|
                      (  0     1 )

and

    f_{U,V}(u, v) = (1/(2π)) |v| exp{ −u²v²/2 − v²/2 }

    f_U(u) = 1/(π(1 + u²)),

the standard Cauchy density.
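Integrating the joint density over v numerically confirms the Cauchy marginal; a minimal quadrature sketch:

```python
import math

# Numerically integrate f_{U,V}(u, v) = (1/(2*pi)) |v| exp(-u^2 v^2/2 - v^2/2)
# over v and compare with the Cauchy density 1/(pi (1 + u^2)).
def marginal(u, n=200_000, vmax=20.0):
    h = 2 * vmax / n
    s = 0.0
    for k in range(n):
        v = -vmax + (k + 0.5) * h
        s += abs(v) * math.exp(-0.5 * (1 + u * u) * v * v)
    return s * h / (2 * math.pi)

max_err = max(abs(marginal(u) - 1 / (math.pi * (1 + u * u)))
              for u in (0.0, 0.7, 1.5))
```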
2. (a) The mean of Y is

    E[Y] = E[E[Y|X]] = E[X] = 1/2.

The variance is

    Var(Y) = E[Var(Y|X)] + Var(E[Y|X]) = E[X²] + Var(X) = 1/3 + 1/12 = 5/12.

The covariance is

    Cov(X, Y) = E[(Y − μ_Y)(X − μ_X)] = E[E[Y − μ_Y|X](X − μ_X)]
              = E[(X − μ_X)²] = Var(X) = 1/12.
(b) The conditional distribution of Z = Y/X, given X = x, is N(1, 1). Since this conditional distribution does not depend on x, Z and X are independent.
3. For each j, X_j counts the number of the m independent trials that fall in category j. It therefore has a Binomial(m, p_j) distribution.

Let Y = m − X_i − X_j. Then by a similar argument the joint marginal distribution
of (X_i, X_j, Y) is Multinomial(m, p_i, p_j, 1 − p_i − p_j). So

    P(X_i = x_i | X_j = x_j) = P(X_i = x_i, X_j = x_j) / P(X_j = x_j)
        = P(X_i = x_i, X_j = x_j, Y = m − x_i − x_j) / P(X_j = x_j)
        = [ (m!/(x_i! x_j! (m − x_i − x_j)!)) p_i^{x_i} p_j^{x_j} (1 − p_i − p_j)^{m−x_i−x_j} ]
          / [ (m!/(x_j! (m − x_j)!)) p_j^{x_j} (1 − p_j)^{m−x_j} ]
        = ((m − x_j)!/(x_i! (m − x_i − x_j)!)) p_i^{x_i} (1 − p_i − p_j)^{m−x_i−x_j} / (1 − p_j)^{m−x_j}
        = (m − x_j choose x_i) (p_i/(1 − p_j))^{x_i} (1 − p_i/(1 − p_j))^{m−x_j−x_i}

for x_i = 0, . . . , m − x_j. This is the PMF of a Binomial(m − x_j, p_i/(1 − p_j)) distribution.
Using these results,

    E[X_i X_j] = E[X_j E[X_i|X_j]] = E[ X_j (m − X_j) p_i/(1 − p_j) ]
               = (m² p_j − E[X_j²]) p_i/(1 − p_j)
               = (m² p_j − Var(X_j) − E[X_j]²) p_i/(1 − p_j)
               = (m² p_j − m p_j(1 − p_j) − m² p_j²) p_i/(1 − p_j)
               = (m² p_j(1 − p_j) − m p_j(1 − p_j)) p_i/(1 − p_j)
               = (m² − m) p_i p_j

and therefore

    Cov(X_i, X_j) = E[X_i X_j] − E[X_i]E[X_j] = (m² − m) p_i p_j − m² p_i p_j = −m p_i p_j.
An alternative approach for deriving the covariance is to use indicator functions of whether the k-th trial falls in category i.
4. (a)
(b) The marginal density of X is

    f_X(x) = ∫₀^{1−x} C x^{a−1} y^{b−1} (1 − x − y)^{c−1} dy
           = C x^{a−1} (1 − x)^{b+c−1} ∫₀¹ u^{b−1}(1 − u)^{c−1} du
           = C x^{a−1} (1 − x)^{b+c−1} Γ(b)Γ(c)/Γ(b + c)
for 0 < x < 1, using the change of variables u = y/(1 − x). This is proportional to a Beta(a, b + c) density, and

    1 = ∫₀¹ f_X(x) dx = C (Γ(b)Γ(c)/Γ(b + c)) (Γ(a)Γ(b + c)/Γ(a + b + c)) = C Γ(a)Γ(b)Γ(c)/Γ(a + b + c).

So

    C = Γ(a + b + c)/(Γ(a)Γ(b)Γ(c)).

Because of the symmetric roles of x and y, the marginal distribution of Y is Beta(b, a + c).
(c) The conditional distribution of Y | X = x has density

    f_{Y|X}(y|x) = f_{X,Y}(x, y)/f_X(x)
                 ∝ x^{a−1} y^{b−1} (1 − x − y)^{c−1} / ( x^{a−1} (1 − x)^{b+c−1} )
                 = (y/(1 − x))^{b−1} (1 − y/(1 − x))^{c−1} (1/(1 − x))

for 0 < y < 1 − x. The conditional density of U = Y/(1 − X) given X = x is therefore

    f_{U|X}(u|x) ∝ u^{b−1}(1 − u)^{c−1}

for 0 < u < 1, which is a Beta(b, c) density. As this does not depend on x, U and X are independent.
(d) The expected product is

    E[XY] = E[X E[Y|X]] = (b/(b + c)) E[X(1 − X)]
          = (b/(b + c)) (Γ(a + b + c)/(Γ(a)Γ(b + c))) ∫₀¹ x^a (1 − x)^{b+c} dx
          = (b/(b + c)) (Γ(a + b + c)/(Γ(a)Γ(b + c))) (Γ(a + 1)Γ(b + c + 1)/Γ(a + b + c + 2))
          = (b/(b + c)) ( a(b + c)/((a + b + c)(a + b + c + 1)) )
          = ab/((a + b + c)(a + b + c + 1)).
The covariance is therefore

    Cov(X, Y) = E[XY] − E[X]E[Y]
              = ab/((a + b + c)(a + b + c + 1)) − ab/(a + b + c)²
              = −ab/((a + b + c)²(a + b + c + 1)).
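The covariance formula can be checked by direct quadrature of the Dirichlet density over the simplex; the parameter values below are illustrative:

```python
import math

# Quadrature check of Cov(X, Y) = -ab/((a+b+c)^2 (a+b+c+1)) for the density
# C x^(a-1) y^(b-1) (1-x-y)^(c-1) on the simplex (a, b, c are illustrative).
a, b, c = 2.0, 3.0, 4.0
C = math.gamma(a + b + c) / (math.gamma(a) * math.gamma(b) * math.gamma(c))

n = 400
h = 1.0 / n
m00 = m10 = m01 = m11 = 0.0
for i in range(n):
    x = (i + 0.5) * h
    for j in range(n):
        y = (j + 0.5) * h
        if x + y < 1:
            w = C * x ** (a - 1) * y ** (b - 1) * (1 - x - y) ** (c - 1) * h * h
            m00 += w            # total mass, should be ~1
            m10 += w * x        # E[X]
            m01 += w * y        # E[Y]
            m11 += w * x * y    # E[XY]
cov_num = m11 - m10 * m01
s = a + b + c
cov_exact = -a * b / (s * s * (s + 1))
```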
Assignment 12
Due on Monday, November 16, 2015.
1. Problem 4.47 from the textbook
2. Problem 4.55 from the textbook
3. Problem 5.2 from the textbook
4. Problem 5.8 from the textbook. You can simplify calculations somewhat by arguing that you can assume without loss of generality that θ₁ = E[X_i] = 0.
5. Problem 5.15 from the textbook.
6. Let U1, . . . , Un be a random sample from the Uniform[0, 1] distribution withorder statistics U(1) ≤ · · · ≤ U(n), and let R = U(n) − U(1) be the sample range.Find the marginal density of R.
Solutions
1. (a) For z < 0

    P(Z ≤ z) = P(X ≤ z and Y < 0) + P(−X ≤ z and Y > 0).

Since X, Y are independent, continuous, and have distributions symmetric about the origin,

    P(X ≤ z and Y < 0) = P(X ≤ z)P(Y < 0) = Φ(z) · (1/2)

and

    P(−X ≤ z and Y > 0) = P(−X ≤ z)P(Y > 0) = Φ(z) · (1/2)

where Φ(z) is the CDF of the standard normal distribution. So

    P(Z ≤ z) = Φ(z)/2 + Φ(z)/2 = Φ(z).

By symmetry, for z > 0

    P(Z ≥ z) = P(Z ≤ −z) = Φ(−z) = 1 − Φ(z)

and therefore the CDF of Z is equal to Φ for all z, and Z has a standard normal distribution.
(b) If Y > 0 then Z = |X| and Z > 0. Similarly, if Y < 0 then Z = −|X| and Z < 0. So Z and Y have the same sign, and therefore the joint distribution of Z, Y assigns zero probability to the second and fourth quadrants. Since X, Y are jointly continuous and the bivariate normal distribution assigns positive probability to all open sets, this means Z, Y cannot be jointly normal.
2. Let L be the system lifetime and let X₁, X₂, X₃ be the component lifetimes. Then

    P(L ≤ x) = P(all three components fail by x)
             = P(X₁ ≤ x, X₂ ≤ x, X₃ ≤ x) = P(X₁ ≤ x)P(X₂ ≤ x)P(X₃ ≤ x)
             = P(X₁ ≤ x)³ = (1 − e^{−x/λ})³

for x > 0.
3. (a) Condition on X₁ = x:

    P(Y > y | X₁ = x) = { 1         if y ≤ 0
                        { F(x)^y    if y ≥ 1

i.e. Y | X₁ = x is geometric with p = 1 − F(x). So for y ≥ 1

    P(Y > y) = E[F(X₁)^y] = ∫₀¹ u^y du = 1/(y + 1)

since F(X₁) is uniform on [0, 1] by the probability integral transform. So for y = 1, 2, . . . ,

    P(Y = y) = 1/y − 1/(y + 1) = 1/(y(y + 1)).
Alternative argument: For y = 1, 2, . . . ,

    P(Y > y) = P(X₁ > max{X₂, . . . , X_{y+1}})
             = P(argmax(X₁, . . . , X_{y+1}) = 1)
             = 1/(y + 1)

by symmetry.
(b) Using ⌊y⌋ to denote the largest integer less than or equal to y, we have P(Y > y) = 1 − F_Y(y) = 1/(⌊y⌋ + 1) for all y ≥ 0. So

    E[Y] = ∫₀^∞ (1 − F_Y(y)) dy = ∫₀^∞ 1/(⌊y⌋ + 1) dy ≥ ∫₀^∞ 1/(y + 1) dy = ∞.
4. (a)

    Σ(X_i − X̄)² = Σ X_i² − (1/n)(Σ_i X_i)(Σ_j X_j)
                 = (1/(2n)) ( 2 Σ_i Σ_j X_i² − 2 Σ_i Σ_j X_i X_j )
                 = (1/(2n)) ( Σ_i Σ_j X_i² − 2 Σ_i Σ_j X_i X_j + Σ_i Σ_j X_j² )
                 = (1/(2n)) Σ_i Σ_j (X_i − X_j)²
(b) Assume, without loss of generality, that E[X_i] = θ₁ = 0. Then

    E[S²] = σ² = θ₂

and

    E[S⁴] = (1/(4n²(n − 1)²)) Σ_i Σ_j Σ_k Σ_ℓ E[(X_i − X_j)²(X_k − X_ℓ)²].

If i = j or k = ℓ, then E[(X_i − X_j)²(X_k − X_ℓ)²] = 0. If all of i, j, k, ℓ are different, then

    E[(X_i − X_j)²(X_k − X_ℓ)²] = E[(X₁ − X₂)²]² = (2θ₂)² = 4θ₂².

If {i, j} ∩ {k, ℓ} = {i}, say k = i, then

    E[(X_i − X_j)²(X_i − X_ℓ)²] = E[(X_i² − 2X_iX_j + X_j²)(X_i² − 2X_iX_ℓ + X_ℓ²)]
        = E[X_i⁴ − 2X_i³X_j + X_i²X_j² − 2X_i³X_ℓ + 4X_i²X_jX_ℓ − 2X_j²X_iX_ℓ
            + X_i²X_ℓ² − 2X_iX_jX_ℓ² + X_j²X_ℓ²]
        = θ₄ + 3θ₂².
If {i, j} = {k, ℓ}, then

    E[(X_i − X_j)²(X_k − X_ℓ)²] = E[(X_i − X_j)⁴]
        = E[X_i⁴ − 4X_i³X_j + 6X_i²X_j² − 4X_iX_j³ + X_j⁴]
        = 2θ₄ + 6θ₂².

So

    E[S⁴] = (1/(4n²(n − 1)²)) [ n(n − 1)(n − 2)(n − 3) 4θ₂²
                                + 4n(n − 1)(n − 2)(θ₄ + 3θ₂²)
                                + 2n(n − 1)(2θ₄ + 6θ₂²) ]
          = (1/(4n(n − 1))) [ 4(n − 2)(n − 3)θ₂² + 4(n − 2)(θ₄ + 3θ₂²) + 4(θ₄ + 3θ₂²) ]
          = (1/(n(n − 1))) [ (n − 1)θ₄ + ((n − 2)(n − 3) + 3(n − 2) + 3)θ₂² ]
          = (1/(n(n − 1))) [ (n − 1)θ₄ + (n² − 2n + 3)θ₂² ].

So

    Var(S²) = E[S⁴] − θ₂²
            = (1/(n(n − 1))) [ (n − 1)θ₄ + (n² − 2n + 3 − n² + n)θ₂² ]
            = (1/(n(n − 1))) [ (n − 1)θ₄ − (n − 3)θ₂² ]
            = (1/n) [ θ₄ − ((n − 3)/(n − 1)) θ₂² ].
(c) Still assume θ₁ = 0. Then

    E[X̄S²] = (1/(2n²(n − 1))) Σ_i Σ_j Σ_k E[(X_i − X_j)²X_k]
            = (1/(2n²(n − 1))) · 2n(n − 1) E[(X₁ − X₂)²X₁]
            = (1/n) E[X₁³ − 2X₁²X₂ + X₁X₂²]
            = (1/n) E[X₁³] = θ₃/n.

So X̄ and S² are uncorrelated if and only if θ₃ = 0.
5. (a) For the mean,

    X̄_{n+1} = (1/(n + 1)) Σ_{i=1}^{n+1} X_i
             = (1/(n + 1)) Σ_{i=1}^n X_i + (1/(n + 1)) X_{n+1}
             = (n/(n + 1)) X̄_n + (1/(n + 1)) X_{n+1}.
(b) For the variance,

    n S²_{n+1} = Σ_{i=1}^{n+1} (X_i − X̄_{n+1})²
               = Σ_{i=1}^{n+1} (X_i − X̄_n)² − (n + 1)(X̄_{n+1} − X̄_n)²
               = Σ_{i=1}^n (X_i − X̄_n)² + (X_{n+1} − X̄_n)² − (n + 1)(X̄_{n+1} − X̄_n)²
               = (n − 1)S²_n + (X_{n+1} − X̄_n)² − (n + 1)(X̄_{n+1} − X̄_n)².

From the result for sample means

    X̄_{n+1} − X̄_n = (1/(n + 1))(X_{n+1} − X̄_n)

and therefore

    (X_{n+1} − X̄_n)² − (n + 1)(X̄_{n+1} − X̄_n)² = (X_{n+1} − X̄_n)² − (1/(n + 1))(X_{n+1} − X̄_n)²
                                                 = (n/(n + 1))(X_{n+1} − X̄_n)²,

which completes the proof.
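The two update formulas give an online algorithm for the sample mean and variance; the sketch below checks them against direct recomputation on a small illustrative data set:

```python
# Check the one-pass updates
#   xbar_{n+1} = (n/(n+1)) xbar_n + x_{n+1}/(n+1)
#   n S^2_{n+1} = (n-1) S^2_n + (n/(n+1)) (x_{n+1} - xbar_n)^2
# against direct recomputation (illustrative data).
xs = [2.0, 5.0, 1.0, 7.0, 4.0, 9.0]

def direct(data):
    k = len(data)
    m = sum(data) / k
    return m, sum((x - m) ** 2 for x in data) / (k - 1)

m, s2 = direct(xs[:2])
n = 2
ok = True
for x in xs[2:]:
    m_new = (n * m + x) / (n + 1)
    s2_new = ((n - 1) * s2 + (n / (n + 1)) * (x - m) ** 2) / n
    n += 1
    m, s2 = m_new, s2_new
    dm, ds2 = direct(xs[:n])
    ok = ok and abs(m - dm) < 1e-12 and abs(s2 - ds2) < 1e-12
```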
6. To simplify notation let X = U_(n) and Y = U_(1). From the general form of the joint density of two order statistics, the joint density of X and Y is

    f_{XY}(x, y) = { n(n − 1)(x − y)^{n−2}   for 0 < y < x < 1
                   { 0                       otherwise.

Let R = X − Y and V = Y. This is a one-to-one transformation with inverse

    x = r + v
    y = v,
Jacobian determinant

    J(r, v) = det ( 1  1 ) = 1,
                  ( 0  1 )

and range

    B = {(r, v) : 0 < v < r + v < 1} = {(r, v) : 0 < r < 1 and 0 < v < 1 − r}.

The joint density of R and V is therefore

    f_{RV}(r, v) = f_{XY}(r + v, v) = { n(n − 1)r^{n−2}   if 0 < r < 1 and 0 < v < 1 − r
                                      { 0                 otherwise,

and the marginal density of R is

    f_R(r) = ∫₀^{1−r} n(n − 1)r^{n−2} dv = n(n − 1)r^{n−2}(1 − r)

for 0 < r < 1 and zero otherwise. This is a Beta(n − 1, 2) density.
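A quick numerical check that f_R is a proper density with the Beta(n − 1, 2) mean (n − 1)/(n + 1), for an illustrative n:

```python
# Check that f_R(r) = n(n-1) r^(n-2) (1-r) integrates to 1 and has the
# Beta(n-1, 2) mean (n-1)/(n+1) (n is an illustrative sample size).
n = 5
steps = 100_000
h = 1.0 / steps
total = mean = 0.0
for k in range(steps):
    r = (k + 0.5) * h
    fr = n * (n - 1) * r ** (n - 2) * (1 - r)
    total += fr * h
    mean += r * fr * h
```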
Assignment 13
Due on Monday, November 30, 2015.
1. Problem 5.24 from the textbook
2. Problem 5.32 from the textbook
3. Problem 5.40 from the textbook
4. Let X have a Gamma(α, 1) distribution and let Y = (X − α)/√α.
(a) Find the density, mean, and variance of Y .
(b) Plot the density fα(y) of Y for α = 2, 10, 100.
(c) Show that for every y the density fα(y) converges to the standard normaldensity at y as α tends to infinity. Hints:
i. Use Stirling’s approximation for the gamma function, which can bewritten as
Γ(α) = e−ααα−1/2√
2π(1 +O(α−1)
)as α→∞.
ii. Work with log densities and use the fact that
log(1 + x) = x− x2
2+O(x3)
as x→ 0.
Solutions
1. Assume, without loss of generality, that θ = 1. Then for 0 < u < v < 1

    f_{X_(1),X_(n)}(u, v) = (n!/(0!(n − 2)!0!)) f(u)f(v) F(u)⁰ (F(v) − F(u))^{n−2} (1 − F(v))⁰
                          = n(n − 1) f(u)f(v) (F(v) − F(u))^{n−2}
                          = n(n − 1)(v − u)^{n−2}.
Let

    Y = X_(1)/X_(n)
    Z = X_(n).

The range of (Y, Z) is B = [0, 1] × [0, 1], and the inverse transformation is

    X_(1) = YZ
    X_(n) = Z.

The Jacobian determinant is

    J(y, z) = det ( z  y ) = z.
                  ( 0  1 )

So for 0 < z < 1 and 0 < y < 1

    f_{Y,Z}(y, z) = n(n − 1)(z − yz)^{n−2} z = n z^{n−1} × (n − 1)(1 − y)^{n−2}.

Thus Y and Z are independent.
This proof first removes θ from consideration since it is just a scale parameter. An alternative approach is to note that X_(n) is minimal sufficient, X_(1)/X_(n) is ancillary, and use Basu's theorem from Chapter 6.
2. (a) Suppose f is continuous at a and X_n →^P a, with a a constant. Fix ε > 0. Then there exists a δ > 0 such that

    |f(x) − f(a)| < ε

whenever |x − a| < δ. So

    P(|f(X_n) − f(a)| < ε) ≥ P(|X_n − a| < δ) → 1.

So f(X_n) →^P f(a). The result follows by taking f(x) = √x or f(x) = 1/x and a > 0.

(b) f(x) = σ/√x is continuous at x = σ² if σ > 0.
3. (a) For any t and any ε > 0, if X_n > t and |X_n − X| < ε, then X > t − ε. So X ≤ t − ε implies that either X_n ≤ t or |X_n − X| ≥ ε, i.e.

    {X ≤ t − ε} ⊂ {X_n ≤ t} ∪ {|X_n − X| ≥ ε}.

So

    P(X ≤ t − ε) ≤ P(X_n ≤ t) + P(|X_n − X| ≥ ε)

or

    P(X ≤ t − ε) − P(|X_n − X| ≥ ε) ≤ P(X_n ≤ t).

(b) Similarly (reversing the roles of X and X_n and replacing t by t + ε),

    P(X_n ≤ t) ≤ P(X ≤ t + ε) + P(|X_n − X| ≥ ε).
(c) Suppose the CDF of X is continuous at t. From the previous two parts, since X_n → X in probability we have

    P(X ≤ t − ε) ≤ lim inf_{n→∞} P(X_n ≤ t) ≤ lim sup_{n→∞} P(X_n ≤ t) ≤ P(X ≤ t + ε)

for any ε > 0. Since t is a continuity point of the distribution of X, lim_{ε↓0} P(X ≤ t − ε) = lim_{ε↓0} P(X ≤ t + ε) = P(X ≤ t), and therefore

    P(X ≤ t) ≤ lim inf_{n→∞} P(X_n ≤ t) ≤ lim sup_{n→∞} P(X_n ≤ t) ≤ P(X ≤ t).

So lim_{n→∞} P(X_n ≤ t) exists and is equal to P(X ≤ t). Thus X_n → X in distribution.
4. (a) The mean and variance of X are E[X] = α and Var(X) = α, so

    E[Y] = (E[X] − α)/√α = 0
    Var(Y) = Var(X)/α = 1.

The inverse transformation is x = √α y + α with derivative dx/dy = √α, so the density of Y is

    f_Y(y) = √α f_X(√α y + α) = (√α/Γ(α)) (√α y + α)^{α−1} e^{−√α y − α}.
(b) The plots can be produced by evaluating f_α(y) on a grid of y values for α = 2, 10, 100 and drawing the three curves, for example in R.
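A sketch of the same computation in Python, working on the log scale to avoid overflow (the resulting values could be passed to any plotting library). Comparing α = 100 with the standard normal density previews part (c); the agreement is only approximate at this α:

```python
import math

# Evaluate f_alpha(y) = (sqrt(alpha)/Gamma(alpha)) (sqrt(alpha) y + alpha)^(alpha-1)
# * exp(-sqrt(alpha) y - alpha) on the log scale and compare alpha = 100
# with the standard normal density.
def log_f(alpha, y):
    x = math.sqrt(alpha) * y + alpha
    return 0.5 * math.log(alpha) - math.lgamma(alpha) + (alpha - 1) * math.log(x) - x

max_err = max(abs(math.exp(log_f(100, y))
                  - math.exp(-y * y / 2) / math.sqrt(2 * math.pi))
              for y in (-1.0, 0.0, 1.0))
```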
(c) By Stirling's approximation, √α α^{α−1} e^{−α}/Γ(α) = (1/√(2π))(1 + O(α^{−1})), so writing (√α y + α)^{α−1} = α^{α−1}(1 + y/√α)^{α−1}, the logarithm of the remainder of the density is

    (α − 1) log(y/√α + 1) − √α y = (α − 1)[ y/√α − y²/(2α) + O(α^{−3/2}) ] − √α y
                                 = −y/√α − ((α − 1)/α)(y²/2) + O(α^{−1/2})
                                 → −y²/2

as α → ∞. So f_Y(y) converges pointwise to a standard normal density.
Assignment 14
Due on Monday, December 7, 2015.
1. Let Xn have a χ2n distribution.
(a) Find an approximating normal distribution for Xn.
(b) Find an approximating normal distribution for Yn =√Xn.
(c) Find an approximating normal distribution for Zn = logXn.
(d) How good are the approximations in parts (a), (b), and (c) for n =5, 10, 20, 100?
2. The height H and radius R of a cylinder are measured with error; the measurements are independent, normally distributed, and

    μ_H = 75 cm  σ_H = 2 cm  μ_R = 10 cm  σ_R = 1 cm.

The estimated volume of the cylinder is V = πR²H. Find a normal approximation to the distribution of V.
3. Let X₁, . . . , X_n be a random sample from an exponential distribution with mean θ.

(a) Find a normal approximation to the distribution of the sample mean X̄_n.

(b) Let Y_n = g(X̄_n) where g is differentiable. Find a normal approximation to the distribution of Y_n.

(c) Can you find a function g such that the variance of the normal approximation in (b) does not depend on θ?
4. Let X₁, . . . , X_n be a random sample from a Poisson distribution with mean λ > 0. Let X̄_n be the sample average and let U_n = √n(X̄_n − λ)/√X̄_n. Find the limiting distribution of U_n as n tends to infinity.
Solutions
1. (a) A χ²_n random variable X_n has the same distribution as Σ_{i=1}^n U_i with the U_i i.i.d. χ²₁ random variables. So the central limit theorem gives

    X_n ∼ AN(n, 2n)

as n → ∞.
(b) Let V_n = √(X_n/n) = f(X_n/n). Now X_n/n →^P 1, X_n/n ∼ AN(1, 2/n), f(1) = 1, and f′(1) = 1/2. So

    V_n ∼ AN(1, (1/4)(2/n)) = AN(1, 1/(2n))

and thus

    Y_n = √n V_n ∼ AN(√n, 1/2).
(c) Let W_n = log(X_n/n) = f(X_n/n). Now X_n/n →^P 1, X_n/n ∼ AN(1, 2/n), f(1) = 0, and f′(1) = 1. So

    W_n ∼ AN(0, 2/n)

and thus

    Z_n = W_n + log n ∼ AN(log n, 2/n).
(d) The exact CDFs and PDFs of Y_n and Z_n are

    F_{Y_n}(y) = F_{X_n}(y²)    f_{Y_n}(y) = f_{X_n}(y²) 2y
    F_{Z_n}(z) = F_{X_n}(e^z)   f_{Z_n}(z) = f_{X_n}(e^z) e^z.

You can look at graphs of the densities or CDFs, or at quantile plots, or at numerical measures of the discrepancies between the exact and approximate CDFs.
2. V = f(R, H) with f(r, h) = πr²h. The gradient of f is

    ∇f(r, h) = (2πrh, πr²).

So

    V ≈ f(10, 75) + (∂f/∂r)(10, 75)(R − 10) + (∂f/∂h)(10, 75)(H − 75)
      = π · 7500 + π · 1500 · (R − 10) + π · 100 · (H − 75)
      ∼ N(π · 7500, (π · 1500 · 1)² + (π · 100 · 2)²)
      ≈ N(23561.94, (4754.09)²).
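The numerical values follow directly from the gradient:

```python
import math

# Reproduce the delta-method mean and standard deviation for V = pi R^2 H
# at (mu_R, mu_H) = (10, 75) with sigma_R = 1, sigma_H = 2.
mean_V = math.pi * 10 ** 2 * 75
dV_dr = 2 * math.pi * 10 * 75    # partial derivative in r at (10, 75)
dV_dh = math.pi * 10 ** 2        # partial derivative in h at (10, 75)
sd_V = math.sqrt((dV_dr * 1) ** 2 + (dV_dh * 2) ** 2)
```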
3. (a) The variance of the exponential distribution with mean θ is θ². So by the CLT, X̄_n ∼ AN(θ, θ²/n).

(b) By the delta method, Y_n = g(X̄_n) ∼ AN(g(θ), g′(θ)²θ²/n).

(c) The approximate variance is constant in θ if g′(θ) = 1/θ, i.e. g(θ) = log θ. This is an example of a variance stabilizing transformation.
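A seeded simulation illustrating the stabilization: the sampling variance of log X̄_n stays close to 1/n across several values of θ (sample sizes and θ values below are illustrative):

```python
import math
import random

# Simulation check that Var(log Xbar_n) is roughly 1/n for any theta
# (theta values and sample sizes are illustrative).
random.seed(3)
n, reps = 50, 10_000
vars_by_theta = []
for theta in (0.5, 2.0, 8.0):
    logs = []
    for _ in range(reps):
        xbar = sum(random.expovariate(1 / theta) for _ in range(n)) / n
        logs.append(math.log(xbar))
    m = sum(logs) / reps
    vars_by_theta.append(sum((v - m) ** 2 for v in logs) / reps)
# each entry should be near 1/n = 0.02, regardless of theta
```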
4. By the weak law of large numbers X̄_n converges in probability to E[X₁] = λ. By the continuous mapping theorem T_n = √X̄_n converges in probability to √λ. Using the strong law of large numbers and basic continuity shows that convergence also holds almost surely.
Since Var(X₁) = λ, the central limit theorem implies that X̄_n ∼ AN(λ, λ/n). Since the square root function f(x) = √x is differentiable at positive x, the delta method implies that

    T_n ∼ AN(f(λ), f′(λ)²λ/n) = AN(√λ, (1/(2√λ))² λ/n) = AN(√λ, 1/(4n)).

Finally, writing U_n = [√n(X̄_n − λ)/√λ] · √(λ/X̄_n), the first factor converges in distribution to N(0, 1) by the CLT and the second factor converges in probability to 1, so by Slutsky's theorem U_n converges in distribution to N(0, 1).