PART I

Approximation, Bounds, and Inequalities

©2001 CRC Press LLC


1

Nonuniform Bounds in Probability Approximations Using Stein’s Method

Louis H. Y. Chen
National University of Singapore, Republic of Singapore

ABSTRACT Most of the work on Stein’s method deals with uniform error bounds. In this paper, we discuss non-uniform error bounds using Stein’s method in Poisson, binomial, and normal approximations.

Keywords and phrases Stein’s method, non-uniform bounds, probability approximations, Poisson approximation, binomial approximation, normal approximation, concentration inequality approach, binary expansion of a random integer

1.1 Introduction

In 1972 Stein introduced a method of normal approximation which does not depend on Fourier analysis but involves solving a differential equation. Although his method was for normal approximation, his ideas are applicable to other probability approximations. The method also works better than the Fourier analytic method for dependent random variables, particularly if the dependence is local or of a combinatorial nature. Since the publication of this seminal work of Stein, numerous papers have been written and Stein’s ideas applied in many different contexts of probability approximation. Most notable among these works are those in normal approximation, Poisson approximation, Poisson process approximation, compound Poisson approximation and binomial approximation. An account of Stein’s method and a brief history of its developments can be found in Chen (1998).

In this paper we discuss another aspect of the application of Stein’s method, not in terms of the approximating distribution but in terms of the nature of the error bound. Most of the papers on Stein’s method deal with uniform error bounds. We show that Stein’s method can also be applied to obtain non-uniform error bounds, and of the best possible order. Roughly speaking, a uniform bound is a bound on a metric between two distributions, whereas a non-uniform bound on the discrepancy between two distributions, $\mathcal{L}(W)$ and $\mathcal{L}(Z)$, is a bound on $|Eh(W) - Eh(Z)|$ that depends on $h$, for every $h$ in a separating class. We will consider non-uniform bounds in three different contexts: Poisson, binomial, and normal. In the exposition below, we will focus more on ideas than on technical details.

1.2 Poisson Approximation

Poisson approximation using Stein’s method was first investigated by Chen (1975a). Since then many developments have taken place and Poisson approximation has been applied to such diverse fields as random graphs, molecular biology, computer science, probabilistic number theory, extreme value theory, spatial statistics, and reliability theory, where many problems can be phrased in terms of dependent events. See for example Arratia, Goldstein, and Gordon (1990), Barbour, Holst, and Janson (1992) and Chen (1993). All these results of Poisson approximation concern error bounds on the total variation distance between the distribution of a sum of dependent indicator random variables and a Poisson distribution. These bounds are therefore uniform bounds.

The possibility of nonuniform bounds in Poisson approximation using Stein’s method was first mentioned in Chen (1975b). For independent indicator random variables, nonuniform bounds were first obtained for small and moderate $\lambda$ by Chen and Choi (1992) and for unrestricted $\lambda$ with improved results by Barbour, Chen, and Choi (1995). To explain the ideas behind obtaining nonuniform bounds, we first illustrate how a uniform bound is obtained in the context of independent indicator random variables.

Let $X_1, \ldots, X_n$ be independent indicator random variables with $P(X_i = 1) = 1 - P(X_i = 0) = p_i$, $i = 1, \ldots, n$. Define $W = \sum_{i=1}^n X_i$, $W^{(i)} = W - X_i$, $\lambda = \sum_{i=1}^n p_i$ and $Z$ to be a Poisson random variable with mean $\lambda$. Let $f_h$ be the solution (which is unique except at 0) of the Stein equation

$$\lambda f(w+1) - w f(w) = h(w) - Eh(Z)$$

where $h$ is a bounded real-valued function defined on $\mathbb{Z}_+ = \{0, 1, 2, \ldots\}$. Then we have


$$\begin{aligned}
Eh(W) - Eh(Z) &= E\{h(W) - Eh(Z)\} \\
&= E\{\lambda f_h(W+1) - W f_h(W)\} \\
&= \sum_{i=1}^n p_i^2\, E\,\Delta f_h(W^{(i)} + 1)
\end{aligned} \tag{1.2.1}$$

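where $\Delta f(w) = f(w+1) - f(w)$. A result of Barbour and Eagleson (1983) states that $\|\Delta f_h\|_\infty \le 2(1 \wedge \lambda^{-1})\|h\|_\infty$. Applying this result, we obtain

$$\begin{aligned}
d_{TV}(\mathcal{L}(W), \mathcal{L}(Z)) &= \sup_A |P(W \in A) - P(Z \in A)| \\
&= \frac{1}{2} \sup_{|h|=1} |Eh(W) - Eh(Z)| \\
&\le (1 \wedge \lambda^{-1}) \sum_{i=1}^n p_i^2
\end{aligned} \tag{1.2.2}$$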

where $d_{TV}$ denotes the total variation distance. It is known that the absolute constant 1 is best possible and the factor $(1 \wedge \lambda^{-1})$ has the correct order for both small and large values of $\lambda$. The significance of the factor $(1 \wedge \lambda^{-1})$ is explained in Chapter 1 of Barbour, Holst, and Janson (1992).

To obtain a nonuniform bound, we let

$$A_i(r) = \frac{P(W^{(i)} = r)}{P(Z = r)}.$$

Then (1.2.1) can be rewritten as

$$Eh(W) - Eh(Z) = \sum_{i=1}^n p_i^2\, E\,A_i(Z)\,\Delta f_h(Z+1) \tag{1.2.3}$$

where $h$ is no longer assumed to be bounded.

Let $C^* = \sup_{1 \le i \le n} \sup_{r \ge 0} A_i(r)$. Then

$$|Eh(W) - Eh(Z)| \le C^* \Big( \sum_{i=1}^n p_i^2 \Big) E|\Delta f_h(Z+1)|.$$
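The quantities appearing so far can be computed exactly for small examples. The following is a minimal sketch, not taken from the cited papers: the function names and the probabilities `ps` are illustrative assumptions. It builds the law of $W$ by convolution, evaluates the total variation distance to the Poisson law, compares it with the uniform bound (1.2.2), and computes $C^*$ as the largest ratio $A_i(r)$.

```python
import math

def poisson_binomial_pmf(ps):
    """Exact pmf of a sum of independent indicators, built by convolution."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for r, mass in enumerate(pmf):
            nxt[r] += mass * (1.0 - p)
            nxt[r + 1] += mass * p
        pmf = nxt
    return pmf

def poisson_pmf(lam, r):
    """P(Z = r) for Z ~ Poisson(lam), computed on the log scale."""
    return math.exp(-lam + r * math.log(lam) - math.lgamma(r + 1))

def tv_and_bound(ps):
    """Return (d_TV(L(W), L(Z)), right-hand side of (1.2.2))."""
    lam = sum(ps)
    w = poisson_binomial_pmf(ps)
    diff = sum(abs(w[r] - poisson_pmf(lam, r)) for r in range(len(w)))
    # W has no mass above n, so P(Z > n) is added to the sum of differences.
    tail = max(0.0, 1.0 - sum(poisson_pmf(lam, r) for r in range(len(w))))
    d_tv = 0.5 * (diff + tail)
    bound = min(1.0, 1.0 / lam) * sum(p * p for p in ps)
    return d_tv, bound

def c_star(ps):
    """C* = max over i and r of P(W^(i) = r) / P(Z = r)."""
    lam = sum(ps)
    best = 0.0
    for i in range(len(ps)):
        wi = poisson_binomial_pmf(ps[:i] + ps[i + 1:])
        best = max(best, max(wi[r] / poisson_pmf(lam, r) for r in range(len(wi))))
    return best

# Hypothetical success probabilities, all at most 1/2.
ps = [0.05, 0.1, 0.2, 0.02, 0.3, 0.15, 0.08, 0.25]
d_tv, bound = tv_and_bound(ps)
cs = c_star(ps)
print(d_tv, bound, cs)
```

For probabilities of this size the computed $C^*$ can be compared directly with the constant $4e^{13/12}\sqrt{\pi}$ quoted from Barbour, Chen, and Choi (1995).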

What remains to be done is to calculate or bound $C^*$ and $E|\Delta f_h(Z+1)|$. In Barbour, Chen, and Choi (1995), it is shown that for $\max_{1 \le i \le n} p_i \le 1/2$, $C^* \le 4e^{13/12}\sqrt{\pi}$, and the following theorem was proved.


THEOREM 1.2.1 [Theorem 3.1 in Barbour, Chen, and Choi (1995)]

Let $h$ be a real-valued function defined on $\mathbb{Z}_+$ such that $EZ^2|h(Z)| < \infty$. We have

$$|Eh(W) - Eh(Z)| \le C^* \sum_{i=1}^n p_i^2\, \big[ 4(1 \wedge \lambda^{-1}) E|h(Z+1)| + E|h(Z+2)| - 2E|h(Z+1)| + E|h(Z)| \big] / 2. \tag{1.2.4}$$

If $|h| = 1$, then we have

$$d_{TV}(\mathcal{L}(W), \mathcal{L}(Z)) = \frac{1}{2} \sup_{|h|=1} |Eh(W) - Eh(Z)| \le C^* (1 \wedge \lambda^{-1}) \sum_{i=1}^n p_i^2$$

where the upper bound has the same order as that of (1.2.2), but it has a larger absolute constant. However, the bound in (1.2.4) allows a very wide choice of possible functions $h$, and therefore contains more information than the total variation distance bound in (1.2.2).

By iterating (1.2.1), we obtain

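$$\begin{aligned}
Eh(W) - Eh(Z) &= \sum_{i=1}^n p_i^2\, E\,\Delta f_h(Z+1) + \text{second order terms} \\
&= -\frac{1}{2} \sum_{i=1}^n p_i^2\, E\,\Delta^2 h(Z) + \text{second order terms}
\end{aligned}$$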

where $E\Delta f_h(Z+1) = -(1/2)E\Delta^2 h(Z)$ (see, for example, Chen and Choi (1992), p. 1871).

In Barbour, Chen, and Choi (1995), a more refined result (Theorem 3.2) was obtained by bounding the second order error terms in the same way the first order error terms were bounded. From this theorem, a large deviation result (Theorem 4.2) was proved which produces the following corollary.
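Before the corollary, the identity $E\Delta f_h(Z+1) = -(1/2)E\Delta^2 h(Z)$ can be checked by hand in a simple case. The choice $h(w) = w^2$ below is purely illustrative and is not taken from the cited papers.

```latex
% Worked example with h(w) = w^2 (an illustrative choice).
\begin{align*}
Eh(Z) &= \lambda + \lambda^2, \qquad f_h(w) = -(w + \lambda) \quad (w \ge 1),\\
\lambda f_h(w+1) - w f_h(w) &= -\lambda(w + 1 + \lambda) + w(w + \lambda)
  = w^2 - \lambda - \lambda^2 = h(w) - Eh(Z),\\
\Delta f_h(w) &= -1, \qquad \Delta^2 h(w) = (w+2)^2 - 2(w+1)^2 + w^2 = 2,\\
E\Delta f_h(Z+1) &= -1 = -\tfrac{1}{2}\, E\Delta^2 h(Z).
\end{align*}
```

In this case the first order term equals $-\sum_{i=1}^n p_i^2$, which agrees exactly with the direct computation $EW^2 - EZ^2 = (\lambda - \sum_{i=1}^n p_i^2 + \lambda^2) - (\lambda + \lambda^2)$, so the second order terms vanish for this particular $h$.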

COROLLARY 1.2.2

Let $z = \lambda + \xi\sqrt{\lambda}$. Suppose $\max_{1 \le i \le n} p_i \to 0$ and $\xi = o\big( [\lambda / \sum_{i=1}^n p_i^2]^{1/2} \big)$ as $n \to \infty$. Then, as $n$, $z$ and $\xi \to \infty$,

$$\frac{P(W \ge z)}{P(Z \ge z)} - 1 \sim -\frac{\xi^2}{2\lambda} \sum_{i=1}^n p_i^2 .$$
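To get a feel for the size of this tail correction, the ratio can be computed exactly in the simplest case of equal probabilities, where $W$ is binomial. The sketch below is illustrative only: the parameter values and function names are assumptions, and the comparison uses the first order term $-(\xi^2/(2\lambda))\sum_{i=1}^n p_i^2$, which for $p_i = p$ equals $-\xi^2 p/2$.

```python
import math

def binom_pmf(n, p, r):
    """P(W = r) for W ~ Binomial(n, p)."""
    return math.comb(n, r) * p**r * (1.0 - p)**(n - r)

def poisson_pmf(lam, r):
    """P(Z = r) for Z ~ Poisson(lam), computed on the log scale."""
    return math.exp(-lam + r * math.log(lam) - math.lgamma(r + 1))

def tail_ratio_minus_one(n, p, z):
    """P(W >= z) / P(Z >= z) - 1 with W ~ Binomial(n, p), Z ~ Poisson(n p)."""
    lam = n * p
    upper_w = sum(binom_pmf(n, p, r) for r in range(z, n + 1))
    # The Poisson tail is truncated far beyond z; the neglected mass is negligible here.
    upper_z = sum(poisson_pmf(lam, r) for r in range(z, z + 400))
    return upper_w / upper_z - 1.0

# Hypothetical parameters: lambda = 10 and z about 3.2 standard deviations above it.
n, p, z = 1000, 0.01, 20
lam = n * p
xi = (z - lam) / math.sqrt(lam)
exact = tail_ratio_minus_one(n, p, z)
first_order = -(xi**2 / (2.0 * lam)) * n * p**2
print(exact, first_order)
```

The two numbers are not expected to coincide, since the corollary is an asymptotic statement, but they should be of the same sign and similar magnitude.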


The following asymptotic result was also deduced.

THEOREM 1.2.3

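Let $N$ be a standard normal random variable. Let $h$ be a nonnegative function defined on $\mathbb{R}$ which is continuous almost everywhere and not identically zero. Suppose that $\big\{ \big( \tfrac{Z - \lambda}{\sqrt{\lambda}} \big)^4 h\big( \tfrac{Z - \lambda}{\sqrt{\lambda}} \big) : \lambda \ge 1 \big\}$ is uniformly integrable. Then as $\lambda \to \infty$ such that $\max_{1 \le i \le n} p_i \to 0$,

$$\sum_{r=0}^{\infty} h\Big( \frac{r - \lambda}{\sqrt{\lambda}} \Big) |P(W = r) - P(Z = r)| \sim \frac{1}{2\lambda} \Big( \sum_{i=1}^n p_i^2 \Big) E|N^2 - 1| h(N).$$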

By letting $h \equiv 1$, $E|N^2 - 1|h(N) = E|N^2 - 1| = 2\sqrt{2/(\pi e)}$, and Theorem 1.2.3 yields a result of Barbour and Hall (1984a, p. 477) and Theorem 1.2 of Deheuvels and Pfeifer (1986).

Nonuniform bounds in compound Poisson approximation on a group for small and moderate $\lambda$ were first obtained by Chen (1975b) and later generalized and refined by Chen and Roos (1995). In these papers, the techniques were inspired by Stein’s method. The first paper on compound Poisson approximation using Stein’s method directly was by Barbour, Chen, and Loh (1992).
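The value $E|N^2 - 1| = 2\sqrt{2/(\pi e)}$ used for $h \equiv 1$ can be derived in a few lines. The derivation below is a standard calculation added for convenience, with $\varphi$ denoting the standard normal density (a symbol not used elsewhere in this chapter).

```latex
\begin{align*}
E|N^2 - 1| &= 2\,E\big[(N^2 - 1)\,I(|N| > 1)\big]
  && \text{since } E(N^2 - 1) = 0,\\
&= 4\int_1^\infty (x^2 - 1)\,\varphi(x)\,dx
  = 4\big[-x\varphi(x)\big]_1^\infty = 4\varphi(1)
  && \text{since } \tfrac{d}{dx}\{-x\varphi(x)\} = (x^2 - 1)\varphi(x),\\
&= \frac{4}{\sqrt{2\pi e}} = 2\sqrt{\frac{2}{\pi e}} \approx 0.968.
\end{align*}
```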

1.3 Binomial Approximation: Binary Expansion of a Random Integer

In his monograph, Stein (1986) considered the following problem. Let $n$ be a natural number and let $X$ denote a random variable uniformly distributed over the set $\{0, 1, \ldots, n-1\}$. Let $W$ denote the number of ones in the binary expansion of $X$ and let $Z$ be a binomial random variable with parameters $(k, 1/2)$, where $k$ is the unique integer such that $2^{k-1} < n \le 2^k$. If $n = 2^k$, then $W$ has the same distribution as $Z$; otherwise it is a sum of dependent indicator random variables.

By using the solution of the Stein equation

$$(k - x) f(x) - x f(x-1) = h(x) - Eh(Z) \tag{1.3.1}$$

where $h = I_{\{r\}}$ and $r = 0, 1, \ldots, k$, Stein (1986) proved that

$$\sup_{0 \le r \le k} |P(W = r) - P(Z = r)| \le 4/k.$$
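For small $n$ the law of $W$ can be enumerated directly, which makes the setting concrete. The sketch below is an illustration under our own naming (the helper functions are not from Stein's monograph): it tabulates the number of ones in the binary expansion of each integer below $n$, then reports the largest pointwise discrepancy from the binomial$(k, 1/2)$ law together with the total variation distance.

```python
import math

def ones_count_pmf(n, k):
    """pmf of W, the number of ones in the binary expansion of X ~ Uniform{0, ..., n-1}."""
    counts = [0] * (k + 1)
    for x in range(n):
        counts[bin(x).count("1")] += 1
    return [c / n for c in counts]

def binomial_half_pmf(k):
    """pmf of Z ~ Binomial(k, 1/2)."""
    return [math.comb(k, r) / 2**k for r in range(k + 1)]

def compare(n):
    """Return (k, sup_r |P(W=r) - P(Z=r)|, d_TV) for a given n >= 2."""
    k = (n - 1).bit_length()  # the unique k with 2^(k-1) < n <= 2^k
    w, z = ones_count_pmf(n, k), binomial_half_pmf(k)
    sup_point = max(abs(a - b) for a, b in zip(w, z))
    d_tv = 0.5 * sum(abs(a - b) for a, b in zip(w, z))
    return k, sup_point, d_tv

k_full, sup_full, tv_full = compare(1024)  # n = 2^k: W and Z have the same law
k, sup_point, d_tv = compare(1000)         # n strictly between 2^(k-1) and 2^k
print(k, sup_point, 4 / k, d_tv)
```

With $n = 2^k$ both discrepancies are zero, while for other $n$ the pointwise discrepancy can be compared with the bound $4/k$.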

Diaconis (1977), jointly with Stein, proved a normal approximation result for $W$ with an error bound of order $1/\sqrt{k}$. A combination of this result with the normal approximation to the binomial distribution shows that $\sup_{0 \le r \le k} |P(W \le r) - P(Z \le r)|$ is of the order of $1/\sqrt{k}$.

Loh (1992) obtained a bound on the solution of a multivariate version of (1.3.1) using the probabilistic approach of Barbour (1988). Using this result of Loh and arguments in Stein (1986), we can obtain a bound of order $1/\sqrt{k}$ on the total variation distance between $\mathcal{L}(W)$ and $\mathcal{L}(Z)$.

In an unpublished work of Chen and Soon (1994), which was based on the Ph.D. dissertation of the latter, the method of obtaining nonuniform bounds in Poisson approximation was applied to the approximation of $\mathcal{L}(W)$ by $\mathcal{L}(Z)$. Apart from proving other results, this work shows that the total variation distance between $\mathcal{L}(W)$ and $\mathcal{L}(Z)$ is, in many instances, of much smaller order than $1/\sqrt{k}$.

Let $X = \sum_{i=1}^k X_i 2^{k-i}$ be the binary expansion of $X$, so that $W = \sum_{i=1}^k X_i$. In Stein (1986, pp. 44–45), it is shown that

$$Eh(W) - Eh(Z) = E\,Q f_h(W) \tag{1.3.2}$$

where $Q = |\{ j : X_j = 0 \text{ and } X + 2^{k-j} \ge n \}|$ and $f_h$ is the solution of (1.3.1) with $h$ being a real-valued function defined on $\{0, 1, \ldots, k\}$. Define

$$\psi(r) = E[Q \mid W = r] \quad \text{and} \quad A(r) = \frac{P(W = r)}{P(Z = r)}.$$

Then (1.3.2) can be written as

$$Eh(W) - Eh(Z) = E\,\psi(Z) A(Z) f_h(Z). \tag{1.3.3}$$

Let $l_k$ be the number of consecutive 1s, starting from the beginning, in the binary expansion of $n - 1$. The relationship between $n - 1$ and $l_k$ is given by

$$n - 1 = \sum_{i=1}^{l_k} 2^{k-i} + m$$

where $0 \le m < 2^{k - l_k - 1}$. It is shown in Chen and Soon (1994) that for $0 \le r \le k - 1$, $l_k/k \le A(r) \le 2$. By obtaining upper and lower bounds on the right hand side of (1.3.3), the following theorem was proved.
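The quantities $l_k$, $m$ and $A(r)$ are easy to tabulate for a specific $n$. The following sketch uses hypothetical helper names and the example $n = 1000$, $k = 10$ (where $n - 1 = 999$ is `1111100111` in binary); it is meant only to illustrate the definitions and the reported range of $A(r)$.

```python
import math

def leading_ones(n, k):
    """l_k: number of consecutive 1s at the start of the k-bit expansion of n - 1."""
    bits = format(n - 1, "0{}b".format(k))
    return len(bits) - len(bits.lstrip("1"))

def ratio_A(n, k):
    """A(r) = P(W = r) / P(Z = r) for r = 0, ..., k, by direct enumeration."""
    counts = [0] * (k + 1)
    for x in range(n):
        counts[bin(x).count("1")] += 1
    return [(counts[r] / n) / (math.comb(k, r) / 2**k) for r in range(k + 1)]

n, k = 1000, 10
l_k = leading_ones(n, k)
# Remainder m in the decomposition n - 1 = sum_{i=1}^{l_k} 2^(k-i) + m.
m = (n - 1) - sum(2**(k - i) for i in range(1, l_k + 1))
A = ratio_A(n, k)
print(l_k, m, min(A[:k]), max(A[:k]))
```

Here the minimum of $A(r)$ over $0 \le r \le k - 1$ is attained at $r = k - 1$ and sits just above $l_k/k$, which suggests the lower bound cannot be improved much in general.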

THEOREM 1.3.1

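Assume that $2^{k-1} < n < 2^k$.

(i) If $\lim_{k \to \infty} l_k/\sqrt{k} = \infty$, then

$$d_{TV}(\mathcal{L}(W), \mathcal{L}(Z)) \asymp 2^{-l_k}.$$

(ii) If $\limsup_{k \to \infty} l_k/\sqrt{k} < \infty$, then

$$d_{TV}(\mathcal{L}(W), \mathcal{L}(Z)) \asymp 2^{-l_k}\, \frac{l_k}{\sqrt{k}}$$

where $x_k \asymp y_k$ means that there exist positive constants $a < b$ such that $a \le x_k/y_k \le b$ for sufficiently large $k$.

From this theorem it follows that

$$d_{TV}(\mathcal{L}(W), \mathcal{L}(Z)) \asymp \frac{1}{\sqrt{k}}$$

if and only if

$$0 < \liminf_{k \to \infty} l_k \le \limsup_{k \to \infty} l_k < \infty.$$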

The following theorems were also proved.

THEOREM 1.3.2

$$|Eh(W) - Eh(Z)| \le \frac{13}{\sqrt{k}}\, E\left\{ \left| \frac{Z - [k/2] - 1}{\sqrt{k/4}} \right| \big( |h(Z)| + |h(Z+1)| + 2|Eh(Z)| \big) \right\}.$$

THEOREM 1.3.3

Let $a = [k/2] + b_k$ where $b_k/\sqrt{k} \to \infty$ and $b_k/k \to 0$ as $k \to \infty$. If $l_k = l$ for all sufficiently large $k$, then

$$\frac{P(W \ge a)}{P(Z \ge a)} - 1 \sim -2\psi\Big( \Big[ \frac{k}{2} \Big] \Big) \frac{b_k}{k}$$

as $k \to \infty$, where $l\,\big( 1/2 - (l-1)/[2(k-l+1)] \big)^{l+1} < \psi([k/2]) \le 3$.

Theorem 1.3.3 is in fact a corollary of a more general large deviation theorem.

1.4 Normal Approximation

Let $X_1, \ldots, X_n$ be independent random variables with $EX_i = 0$, $\mathrm{var}(X_i) = \sigma_i^2$, $E|X_i|^3 = \gamma_i < \infty$ and $\sum_{i=1}^n \sigma_i^2 = 1$. Let $F$ be the distribution function of $\sum_{i=1}^n X_i$ and let $\Phi$ be the standard normal distribution function. The Berry-Esseen Theorem states that

$$\sup_{-\infty < x < \infty} |F(x) - \Phi(x)| \le C \sum_{i=1}^n \gamma_i$$


where $C$ is an absolute constant. The smallest value of $C$, obtained so far by Van Beek (1972) (without using computers), is 0.7975.

If $X_1, \ldots, X_n$ are independent and identically distributed, then

$$\sup_{-\infty < x < \infty} |F(x) - \Phi(x)| \le C n \gamma$$

where $\gamma = \gamma_i$ for $i = 1, \ldots, n$. Nonuniform bounds were first obtained by Esseen (1945) who proved that for the i.i.d. case

$$|F(x) - \Phi(x)| \le \frac{\lambda \log n}{\sqrt{n}\,(1 + x^2)}$$

and

$$|F(x) - \Phi(x)| \le \frac{\lambda \log(2 + |x|)}{\sqrt{n}\,(1 + x^2)}$$

where $\lambda$ depends on $n^{3/2}\gamma$. Nagaev (1965) improved the upper bounds to $C n \gamma/(1 + |x|^3)$, also for the i.i.d. case. This was generalized by Bikelis (1966) who proved that, for independent and not necessarily identically distributed random variables,

$$|F(x) - \Phi(x)| \le \frac{C \sum_{i=1}^n \gamma_i}{1 + |x|^3}$$

where $C$ is an absolute constant. Paditz (1977) calculated $C$ to be 114.7 and Michel (1981) reduced it to 30.54 for the i.i.d. case. All the above proofs used the Fourier analytic method. Chen and Shao (2000) used Stein’s method to prove the following more general result:

$$|F(x) - \Phi(x)| \le C \sum_{i=1}^n \left\{ \frac{E X_i^2 I(|X_i| > 1 + |x|)}{(1 + |x|)^2} + \frac{E|X_i|^3 I(|X_i| \le 1 + |x|)}{(1 + |x|)^3} \right\}$$

where the existence of third moments is no longer assumed. Their proof is based on truncation and the concentration inequality approach. The concentration inequality approach was originally used by Stein for the i.i.d. case (see Ho and Chen (1978)). It was extended by Chen (1986) to dependent and non-identically distributed random variables with arbitrary index set. A proof of the Berry-Esseen Theorem for independent and non-identically distributed random variables using the concentration inequality approach is given in Section 2 of Chen (1998).
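The difference between the uniform and nonuniform statements can be seen numerically in a case where $F$ is known exactly. The sketch below is an illustration of our own, not part of the cited proofs: it takes $X_i = \varepsilon_i/\sqrt{n}$ with fair random signs $\varepsilon_i$ (so $\gamma_i = n^{-3/2}$), evaluates $|F - \Phi|$ at the atoms of the sum, and compares the results with the constants 0.7975 and 30.54 quoted above.

```python
import math

def std_normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def rademacher_sum_errors(n):
    """Pairs (x, error) at the atoms x of S = (eps_1 + ... + eps_n) / sqrt(n).

    The error is the larger of |F(x-) - Phi(x)| and |F(x) - Phi(x)|.
    """
    out = []
    cdf = 0.0
    for j in range(n + 1):
        x = (2 * j - n) / math.sqrt(n)
        left = cdf                      # F(x-), the left limit at the atom
        cdf += math.comb(n, j) / 2**n   # F(x)
        phi = std_normal_cdf(x)
        out.append((x, max(abs(left - phi), abs(cdf - phi))))
    return out

n = 100
sum_gamma = n * (1.0 / math.sqrt(n))**3   # sum of gamma_i = n^(-1/2)
errs = rademacher_sum_errors(n)
uniform = max(e for _, e in errs)
nonuniform = max((1.0 + abs(x)**3) * e for x, e in errs)
print(uniform, 0.7975 * sum_gamma)
print(nonuniform, 30.54 * sum_gamma)
```

The weighted quantity `nonuniform` stays bounded even though the weight $1 + |x|^3$ grows, which is the content of a nonuniform bound: the error decays in $|x|$ at least as fast as $|x|^{-3}$.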

The concentration inequality approach is not the only approach for obtaining Berry-Esseen bounds using Stein’s method. Another approach based on inductive arguments has been used by Barbour and Hall (1984b), Bolthausen (1984) and Stroock (1993).

We would like to mention in passing that Stein’s method has also been applied to obtain bounds on the total variation distances between the standard normal distribution and distributions satisfying certain variational inequalities. See Utev (1989) and Cacoullos, Papathanasiou, and Utev (1994).


1.5 Conclusion

We would like to conclude by saying that there is much more to be done in the direction of nonuniform bounds, particularly for dependent random variables both in Poisson approximation and normal approximation. The large deviation results referred to in the above sections are actually those of moderate deviation. A related question therefore is how Stein’s method can be applied to obtain results which cover both moderate and really large deviations.

Acknowledgement This work is partially supported by grant RP3982719 at the National University of Singapore. I would like to thank K. P. Choi and Qi-Man Shao for their help in preparing the manuscript and for their helpful comments.

References

1. Arratia, R., Goldstein, L., and Gordon, L. (1990). Poisson approximation and the Chen-Stein method. Statistical Science 5, 403–434.

2. Barbour, A. D. (1988). Stein’s method and Poisson process convergence. Journal of Applied Probability 25 (A), 175–184.

3. Barbour, A. D., Chen, L. H. Y., and Choi, K. P. (1995). Poisson approximation for unbounded functions, I: independent summands. Statistica Sinica 5, 749–766.

4. Barbour, A. D., Chen, L. H. Y., and Loh, W. L. (1992). Compound Poisson approximation for nonnegative random variables via Stein’s method. Annals of Probability 20, 1843–1866.

5. Barbour, A. D. and Eagleson, G. (1983). Poisson approximation for some statistics based on exchangeable trials. Advances in Applied Probability 15, 585–600.

6. Barbour, A. D. and Hall, P. (1984a). On the rate of Poisson convergence. Mathematical Proceedings of the Cambridge Philosophical Society 95, 473–480.

7. Barbour, A. D. and Hall, P. (1984b). Stein’s method and the Berry-Esseen theorem. Australian Journal of Statistics 26, 8–15.

8. Barbour, A. D., Holst, L., and Janson, S. (1992). Poisson Approximation. Oxford Studies in Probability 2, Clarendon Press, Oxford.

9. Bikelis, A. (1966). Estimates of the remainder in the central limit theorem. Litovsk. Mat. Sb. 6(3), 323–346 (in Russian).

10. Bolthausen, E. (1984). An estimate of the remainder in a combinatorial central limit theorem. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 66, 379–386.

11. Cacoullos, T., Papathanasiou, V., and Utev, S. A. (1994). Variational inequalities with examples and an application to the central limit theorem. Annals of Probability 22, 1607–1618.

12. Chen, L. H. Y. (1975a). Poisson approximation for dependent trials. Annals of Probability 3, 534–545.

13. Chen, L. H. Y. (1975b). An approximation theorem for convolutions of probability measures. Annals of Probability 3, 992–999.

14. Chen, L. H. Y. (1986). The rate of convergence in a central limit theorem for dependent random variables with arbitrary index set. IMA Preprint Series #243, University of Minnesota.

15. Chen, L. H. Y. (1993). Extending the Poisson approximation. Science 262, 379–380.

16. Chen, L. H. Y. (1998). Stein’s method: some perspectives with applications. Probability Towards 2000 (Eds., L. Accardi and C. Heyde), pp. 97–122. Lecture Notes in Statistics No. 128. Springer Verlag.

17. Chen, L. H. Y. and Choi, K. P. (1992). Some asymptotic and large deviation results in Poisson approximation. Annals of Probability 20, 1867–1876.

18. Chen, L. H. Y. and Roos, M. (1995). Compound Poisson approximation for unbounded functions on a group, with application to large deviations. Probability Theory and Related Fields 103, 515–528.

19. Chen, L. H. Y. and Shao, Q. M. (2000). A non-uniform Berry-Esseen bound via Stein’s method. Preprint.

20. Chen, L. H. Y. and Soon, S. Y. T. (1994). On the number of ones in the binary expansion of a random integer. Unpublished manuscript.

21. Deheuvels, P. and Pfeifer, D. (1986). A semigroup approach to Poisson approximation. Annals of Probability 14, 663–676.

22. Diaconis, P. (1977). The distribution of leading digits and uniform distribution mod 1. Annals of Probability 5, 72–81.

23. Esseen, C.-G. (1945). Fourier analysis of distribution functions: a mathematical study of the Laplace-Gaussian law. Acta Mathematica 77, 1–125.


24. Ho, S. T. and Chen, L. H. Y. (1978). An Lp bound for the remainder in a combinatorial central limit theorem. Annals of Probability 6, 231–249.

25. Loh, W. L. (1992). Stein’s method and multinomial approximation. Annals of Applied Probability 2, 536–554.

26. Michel, R. (1981). On the constant in the non-uniform version of the Berry-Esseen Theorem. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 55, 109–117.

27. Nagaev, S. V. (1965). Some limit theorems for large deviations. Theory of Probability and its Applications 10, 214–235.

28. Paditz, L. (1977). Über die Annäherung der Verteilungsfunktionen von Summen unabhängiger Zufallsgrößen gegen unbegrenzt teilbare Verteilungsfunktionen unter besonderer Beachtung der Verteilungsfunktion der standardisierten Normalverteilung. Dissertation A, TU Dresden.

29. Soon, S. Y. T. (1993). Some Problems in Binomial and Compound Poisson Approximations. Ph.D. dissertation, National University of Singapore.

30. Stein, C. (1972). A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability 2, 583–602, University of California Press, Berkeley, California.

31. Stein, C. (1986). Approximate Computation of Expectations. Lecture Notes 7, Institute of Mathematical Statistics, Hayward, California.

32. Stroock, D. W. (1993). Probability Theory: An Analytic View. Cambridge University Press, Cambridge, U.K.

33. Utev, S. A. (1989). Probability problems connected with a certain integrodifferential inequality. Siberian Mathematical Journal 30, 490–493.

34. Van Beek, P. (1972). An application of Fourier methods to the problem of sharpening the Berry-Esseen inequality. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 23, 187–196.
