
neypears2.pdf


More on the Neyman-Pearson Lemma

This handout explores some additional applications of the Neyman-Pearson Lemma, and

introduces a related result based on monotone likelihood ratios known as the Karlin-Rubin

Theorem.

Applying the Neyman-Pearson Lemma: Let lr10(x) = f1(x)/f0(x) be the ratio of the probability density functions under H1: θ = θ1 and H0: θ = θ0, where both are simple hypotheses. If k is the solution to:

Pr(X ∈ C) = Pr[lr10(X) > k | H0] = α,

then C = {x : lr10(x) > k} is the size-α critical region of the uniformly most powerful (UMP) test for H0 vs. Ha by the Neyman-Pearson Lemma.

• The difficult part is finding the value of k for which this is true.

Example: Suppose X1, . . . , Xn are a random sample with common density:

fXi(xi|θ) = exp{a(θ)b(xi) + c(θ) + d(xi)},

and we wish to test the hypotheses H0: θ = θ0 vs. Ha: θ = θ1. The ratio of densities is:

lr10(x) = ∏_{i=1}^n [ exp{a(θ1)b(xi) + c(θ1) + d(xi)} / exp{a(θ0)b(xi) + c(θ0) + d(xi)} ]

        = ∏_{i=1}^n exp{[a(θ1) − a(θ0)] b(xi) + c(θ1) − c(θ0)}.

• This ratio is large (as a function of x) if and only if

∑_{i=1}^n [a(θ1) − a(θ0)] b(xi) = [a(θ1) − a(θ0)] ∑_{i=1}^n b(xi)

is large (i.e., if it's greater than some k2).

• Without loss of generality, take a(θ1) − a(θ0) > 0. Then this is equivalent to saying we need T = ∑_{i=1}^n b(Xi) ≥ k2, where the positive factor a(θ1) − a(θ0) has been absorbed into the constant k2.

• In some cases, the distribution of T is known and finding k2 is straightforward.

Example: Suppose X1, . . . , Xn are iid exp(θ) so that:

fXi(xi|θ) = (1/θ) exp{−xi/θ} = exp{−(1/θ)xi − log θ},   xi > 0, θ > 0.

Consider T = ∑ b(Xi) = ∑ Xi. What is the distribution of T? (T ∼ Gamma(n, θ), so 2T/θ has a χ² distribution with 2n degrees of freedom.)

So the UMP test of H0: θ = θ0 vs. H1: θ = θ1 (θ1 > θ0) rejects H0 if T = ∑ Xi > k2, where k2 can be found from a χ²-distribution.


• In other cases, the distribution of T cannot be found analytically and a normal approximation can be employed:

(T − nE[b(Xi)|θ0]) / √(n Var[b(Xi)|θ0])  is approximately N(0, 1).

Then k2 ≈ nE[b(Xi)|θ0] + zα √(n Var[b(Xi)|θ0]) (where α = Pr(Z ≥ zα)).

• In most other cases, Monte Carlo simulation can be used to find k2.
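For the exponential example above, b(Xi) = Xi, so E[b(Xi)|θ0] = θ0 and Var[b(Xi)|θ0] = θ0². A minimal sketch of the normal approximation in Python (the values n = 20, θ0 = 2, and α = 0.05 are illustrative choices, not from the handout):

```python
from math import sqrt
from statistics import NormalDist

# Illustrative values (not from the handout)
n, theta0, alpha = 20, 2.0, 0.05

# For exp(theta): E[X_i | theta0] = theta0, Var[X_i | theta0] = theta0^2
mean_b = theta0
var_b = theta0 ** 2

# z_alpha satisfies alpha = Pr(Z >= z_alpha)
z_alpha = NormalDist().inv_cdf(1 - alpha)

# k2 ~= n*E[b(X_i)|theta0] + z_alpha * sqrt(n * Var[b(X_i)|theta0])
k2 = n * mean_b + z_alpha * sqrt(n * var_b)
print(round(z_alpha, 4), round(k2, 2))
```

Because the sum of exponentials is right-skewed at moderate n, this approximate cutoff sits slightly below the exact χ²-based value.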
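A Monte Carlo sketch for the same illustrative exponential setup (again assuming n = 20, θ0 = 2, α = 0.05): simulate T = ∑ Xi many times under H0 and use the empirical (1 − α) quantile as k2.

```python
import random

random.seed(1)
n, theta0, alpha = 20, 2.0, 0.05   # illustrative values
reps = 50_000

# Simulate T = sum of n iid exp(theta0) draws under H0
# (random.expovariate takes the rate 1/theta0, since theta0 is the mean)
sims = sorted(sum(random.expovariate(1 / theta0) for _ in range(n))
              for _ in range(reps))

# Empirical (1 - alpha) quantile of T estimates k2
k2_mc = sims[int((1 - alpha) * reps)]
print(round(k2_mc, 2))
```

With this seed the estimate lands close to the exact χ² cutoff of about 55.8 for these assumed values.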

A Randomized Critical Function (for cases where lr10 is not continuous under H0): Suppose X ∼ Poisson(θ) and we test:

H0: θ = 1 vs. H1: θ = θ1, θ1 > 1,

based on a single observation from X. The critical ratio is:

lr10(x) = [e^{−θ1} θ1^x / x!] / [e^{−θ0} θ0^x / x!] = [e^{−θ1} θ1^x / x!] / [e^{−1} / x!] = e^{−θ1+1} θ1^x.

Since θ1 > 1, lr10(x) is an increasing function of x. Hence, we look for a test function of the form:

ϕ(x) = 1 if x > k,
       c if x = k,
       0 if x < k.

Step 1: Pick α, the desired size of the test.

Step 2: Pick k such that Pr(X > k − 1|H0) ≥ α ≥ Pr(X > k|H0),

OR equivalently such that Pr(X ≤ k − 1|H0) ≤ 1 − α ≤ Pr(X ≤ k|H0).

Suppose α = 0.05. Then computing for a Poisson distribution gives the following table:

 k    Pr(X ≤ k|H0)
<0       0
 0     0.3679
 1     0.7358
 2     0.9197
 3     0.9810

The test function is:

ϕ(x) = 1 if x > 3,
       c if x = 3,
       0 if x < 3,

where c is chosen so that α = E[ϕ(X)|H0]. Here, this gives:

0.05 = 1 · Pr(X > 3|H0) + c · Pr(X = 3|H0)
     = (1 − 0.9810) + c(0.9810 − 0.9197)  ⇒  c ≈ 0.5057.

So, if x = 3 is observed, we reject H0 with probability c ≈ 0.5057, roughly a coin flip between rejecting and accepting.
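The construction above translates directly into a short computation; this is a sketch of the two steps, not code from the handout (the function names are illustrative):

```python
from math import exp, factorial

def poisson_pmf(x, theta):
    # Poisson probability mass function
    return exp(-theta) * theta ** x / factorial(x)

def randomized_test(theta0=1.0, alpha=0.05):
    """Find k and c so that E[phi(X)|H0] = alpha for X ~ Poisson(theta0)."""
    # Step 2: smallest k with Pr(X > k | H0) <= alpha
    k, cdf = 0, poisson_pmf(0, theta0)
    while 1 - cdf > alpha:
        k += 1
        cdf += poisson_pmf(k, theta0)
    # Solve alpha = Pr(X > k|H0) + c * Pr(X = k|H0) for c
    c = (alpha - (1 - cdf)) / poisson_pmf(k, theta0)
    return k, c

k, c = randomized_test()
print(k, round(c, 4))
```

For θ0 = 1 and α = 0.05 this recovers k = 3 and c ≈ 0.506, matching the table.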

Distributions with Monotone Likelihood Ratios (MLR): The Neyman-Pearson Lemma addresses hypotheses of the form H0: θ = θ0 vs. Ha: θ = θ1. We want to tackle composite hypotheses of the form H0: θ = θ0 vs. Ha: θ < θ0 or Ha: θ > θ0.


MLR Families: Let F = {fX(x|θ) : θ ∈ Ω ⊂ ℜ}. If ∃ a real-valued function t(x) such that ∀ θ, θ′ ∈ Ω with θ < θ′,

fX(x|θ′) / fX(x|θ)

is a monotone nondecreasing function of t(x) (i.e., for every θ < θ′, ∃ a monotone nondecreasing function g(t, θ, θ′) such that fX(x|θ′) = g(t(x), θ, θ′) fX(x|θ)), then F is an MLR family.

Example: Suppose F is an exponential family, so that fX(x|θ) = exp{a(θ)b(x) + c(θ) + d(x)}. Let θ < θ′ and consider the ratio:

fX(x|θ′) / fX(x|θ) = exp{[a(θ′) − a(θ)] b(x) + c(θ′) − c(θ)}.

If a(θ) is monotone increasing (respectively decreasing), then this is an MLR family in b(x) (respectively −b(x)).
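As a quick numeric illustration using the exp(θ) density from the earlier example (the θ values and grid of x values are illustrative choices): a(θ) = −1/θ is increasing in θ, so the density ratio for θ′ > θ should be increasing in b(x) = x.

```python
from math import exp

def f(x, theta):
    # exp(theta) density, where theta is the mean
    return exp(-x / theta) / theta

theta, theta_prime = 1.0, 2.0          # theta < theta_prime (illustrative)
xs = [0.5 * i for i in range(1, 11)]   # grid of x values
ratios = [f(x, theta_prime) / f(x, theta) for x in xs]

# MLR in b(x) = x: the ratio must be nondecreasing along the grid
assert all(a < b for a, b in zip(ratios, ratios[1:]))
print("density ratio is increasing in x")
```

Here the ratio works out to 0.5·e^{x/2}, which is strictly increasing, consistent with the exponential-family result above.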

Example: Consider a hypergeometric random variable X. What is the scenario for this situation?

• N total items, from which we take a sample of size n at random without replacement.

• Let D = the number of defectives among the N items and define X = the number of defectives in the sample of size n. The probability mass function of X is:

fX(x|D) = (D choose x)(N − D choose n − x) / (N choose n),   x = max{0, n − (N − D)}, . . . , min{n, D}.

Let D < D′, so that D′ = D + j for j = 1, 2, . . ..

• To show fX(x|D) is MLR in x, we must show that fX(x|D′)/fX(x|D) is monotone increasing as a function of x. Consider D′ = D + 1 first:

fX(x|D + 1) / fX(x|D) = [(D + 1 choose x)(N − D − 1 choose n − x) / (N choose n)] / [(D choose x)(N − D choose n − x) / (N choose n)]

= (D + 1)(N − D − n + x) / [(N − D)(D + 1 − x)].

Since the numerator increases as a function of x and the denominator decreases as a function of x, the ratio increases as a function of x.

• In the general case:

f(x|D′) / f(x|D) = f(x|D + j) / f(x|D) = [f(x|D + j) / f(x|D + j − 1)] · [f(x|D + j − 1) / f(x|D + j − 2)] · · · [f(x|D + 1) / f(x|D)],

where each ratio is increasing as a function of x, so that the family of distributions {fX(x|D) : D = 0, 1, . . . , N} is MLR in x.
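The D′ = D + 1 calculation can be checked numerically with Python's `math.comb` (the values N = 20, n = 5, D = 8 are illustrative):

```python
from math import comb

def hyper_pmf(x, N, n, D):
    # Hypergeometric pmf: x defectives in a sample of size n, without replacement
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

N, n, D = 20, 5, 8            # illustrative values
xs = range(0, n + 1)          # support of X for these N, n, D

# Ratio f(x|D+1)/f(x|D), computed directly from the pmf
ratios = [hyper_pmf(x, N, n, D + 1) / hyper_pmf(x, N, n, D) for x in xs]

# Closed form from the handout: (D+1)(N-D-n+x) / ((N-D)(D+1-x))
closed = [(D + 1) * (N - D - n + x) / ((N - D) * (D + 1 - x)) for x in xs]

assert all(abs(r - c) < 1e-12 for r, c in zip(ratios, closed))
assert all(a < b for a, b in zip(ratios, ratios[1:]))   # increasing in x
print("hypergeometric ratio matches closed form and increases in x")
```

The direct pmf ratios agree with the closed-form expression and are strictly increasing across the support, as the MLR argument requires.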


Karlin-Rubin Theorem: Let F = {fX(x|θ) : θ ∈ Ω ⊂ ℜ} be an MLR family in t(x). Then, for testing the hypotheses H0: θ = θ0 vs. H1: θ > θ0, ∃ a uniformly most powerful (UMP) test with test function given by:

ϕ(x) = 1 if t(x) > c,
       γ if t(x) = c,
       0 if t(x) < c,

where c & γ are chosen so that E[ϕ(X)|H0] = α.

• Note: There is an analogous result for H1: θ < θ0, where the directions of the inequalities in the test function are simply reversed.

• The power of this theorem is that we need only show a family of distributions is MLR

in a statistic t(x) and we immediately get the form of the UMP test for a composite

one-sided alternative hypothesis.

Proof: Fix θ1 > θ0. By the Neyman-Pearson Lemma, the MP test of H0: θ = θ0 vs. H1: θ = θ1 rejects H0 when fX(x|θ1)/fX(x|θ0) > k, or equivalently (since fX(x|θ) is MLR in t(x)), when t(x) > c, where c & γ are constants chosen such that:

α = E[ϕ(X)|H0]  ⇒  α = Pr[t(X) > c|H0] + γ Pr[t(X) = c|H0].

Since the Neyman-Pearson MP test at θ1 > θ0 does not depend on θ1, this test is most powerful for every θ > θ0, and thus is uniformly most powerful (UMP).
