Journal of Statistical Planning and Inference 38 (1994) 31-42
North-Holland
Weighted bootstrapping of U-statistics
Paul Janssen*
Department of Statistics, Limburgs Universitair Centrum, B-3590 Diepenbeek, Belgium
Received 19 November 1991; revised manuscript received 6 October 1992
Recommended by M.L. Puri
Abstract
We introduce a generalized bootstrap procedure for U-statistics. The idea is to reweight the terms of the
original U-statistic by stochastic weights. Specific choices of the weights lead to well-known resampling
schemes including Efron’s bootstrap and Bayesian bootstrap. We establish the asymptotic consistency of
the weighted bootstrap for U-statistics and studentized U-statistics. Our results extend the recent work by
Mason and Newton (1992) on weighted bootstrapping of means.
AMS Subject Classification. Primary 62E20, secondary 60F05.
Key words: Asymptotic bootstrap consistency; U-statistics and studentized U-statistics; weighted bootstrapping
1. Introduction
For U-statistics we introduce a generalized bootstrap procedure. The asymptotic
consistency of generalized bootstrapped (studentized) U-statistics will be established.
Our results extend recent work on generalized bootstrapped means by Mason and
Newton (1992).
Let $X_1, X_2, \ldots$ denote i.i.d. random variables with common distribution function $F$. For a given U-statistic with (for simplicity) a symmetric kernel $h$ of degree 2, i.e.
$$U_n = \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} h(X_i, X_j) \tag{1.1}$$
satisfying $E|h(X_1, X_2)| < \infty$, we define the corresponding generalized bootstrapped U-statistic as
$$U_{D_n} = \sum_{1 \le i \ne j \le n} D_{ni} D_{nj} h(X_i, X_j). \tag{1.2}$$
Correspondence to: P. Janssen, Department of Statistics, Limburgs Universitair Centrum, Universitaire Campus, B-3590 Diepenbeek, Belgium.
* Research partly supported by the NFWO (National Science Foundation), Belgium.
0378-3758/94/$07.00 © 1994 Elsevier Science B.V. All rights reserved. SSDI 0378-3758(92)00156-R
P. Janssen / Bootstrapping of U-statistics
where $D_n = (D_{n1}, \ldots, D_{nn})$ is a vector of random weights independent of the data $X_1, \ldots, X_n$. Typically, $D_{ni} \ge 0$ and $\sum_{i=1}^n D_{ni} = 1$; therefore, $U_{D_n}$ is a stochastically reweighted version of the original U-statistic $U_n$.
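To fix ideas, $U_n$ and $U_{D_n}$ can be computed directly from their definitions. The following sketch (the function names are ours, not from the paper) works for any symmetric kernel; note that with the uniform weights $D_{ni} = 1/n$ the reweighted statistic reduces to $\frac{n-1}{n} U_n$, since the diagonal terms $i = j$ are excluded.

```python
def u_stat(x, h):
    # U_n of (1.1): average of h over all pairs i < j
    n = len(x)
    s = sum(h(x[i], x[j]) for i in range(n) for j in range(i + 1, n))
    return s / (n * (n - 1) / 2)

def weighted_u_stat(x, d, h):
    # U_Dn of (1.2): sum over i != j of D_ni * D_nj * h(X_i, X_j)
    n = len(x)
    return sum(d[i] * d[j] * h(x[i], x[j])
               for i in range(n) for j in range(n) if i != j)
```

The identity $U_{D_n} = \frac{n-1}{n} U_n$ under uniform weights illustrates why the weighted and unweighted statistics are asymptotically interchangeable.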
Efron's bootstrapped U-statistics are of the form (1.2) with multinomial weights
$$(nD_{n1}, \ldots, nD_{nn}) \sim \mathrm{Mult}_n\!\left(n; \frac{1}{n}, \ldots, \frac{1}{n}\right).$$
This is resampling from the empirical distribution function
$$F_n(x) = n^{-1} \sum_{i=1}^n 1\{X_i \le x\}.$$
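A minimal way to draw these multinomial weights, using NumPy's random generator (the helper name is ours):

```python
import numpy as np

def efron_weights(n, rng):
    # (n*D_n1, ..., n*D_nn) ~ Mult_n(n; 1/n, ..., 1/n), so D_ni = M_i / n,
    # where M_i counts how often X_i appears in an Efron resample of size n
    m = rng.multinomial(n, np.full(n, 1.0 / n))
    return m / n
```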
Bayesian bootstrapped U-statistics are of the form (1.2), with $D_n$ the vector of one-step spacings, i.e.
$$D_{ni} = U_{i:n-1} - U_{i-1:n-1}, \quad i = 1, \ldots, n, \tag{1.3}$$
where $0 = U_{0:n-1} \le U_{1:n-1} \le \cdots \le U_{n-1:n-1} \le U_{n:n-1} = 1$ denote the order statistics corresponding to a sequence of independent uniform $(0,1)$ variables $U_1, \ldots, U_{n-1}$ independent of the $X_i$'s. Since the distributional behavior of $\Delta_{ni} = nD_{ni}$, with $D_{ni}$ as in (1.3), is the same as that of $(Z_1/\bar{Z}, \ldots, Z_n/\bar{Z})$, where $Z_1, \ldots, Z_n$ are independent, exponentially distributed random variables with mean 1 and where $\bar{Z} = n^{-1} \sum_{i=1}^n Z_i$, an equivalent way to define a Bayesian bootstrapped U-statistic is
$$U_{D_n} = \frac{1}{n^2} \sum_{1 \le i \ne j \le n} \Delta_{ni} \Delta_{nj} h(X_i, X_j) \quad \text{with} \quad \Delta_{ni} = \frac{Z_i}{\bar{Z}}. \tag{1.4}$$
Taking $h(x, y) = (x + y)/2$, (1.4) reduces to the Bayesian bootstrapped mean. See Rubin (1981) and Lo (1987) for this special case.
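Both descriptions of the Bayesian bootstrap weights are easy to simulate; in either form the vector $(D_{n1}, \ldots, D_{nn})$ is Dirichlet$(1, \ldots, 1)$ distributed. A sketch (helper names ours):

```python
import numpy as np

def bayes_weights_spacings(n, rng):
    # D_ni = U_{i:n-1} - U_{i-1:n-1} as in (1.3), with U_{0:n-1} = 0, U_{n:n-1} = 1
    u = np.sort(rng.uniform(size=n - 1))
    return np.diff(np.concatenate(([0.0], u, [1.0])))

def bayes_weights_exponential(n, rng):
    # equivalent representation: Delta_ni = Z_i / Zbar, i.e. D_ni = Z_i / sum(Z)
    z = rng.exponential(scale=1.0, size=n)
    return z / z.sum()
```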
The previous discussion suggests a general way to generate weights. Independent of the $X_i$'s, let $\xi_1, \xi_2, \ldots$ be any sequence of strictly positive i.i.d. random variables satisfying $E\xi_1^2 < \infty$ and let $\bar{\xi} = n^{-1} \sum_{i=1}^n \xi_i$. We then consider generalized bootstrapped U-statistics of form (1.4), with weights
$$\Delta_{ni} = \frac{\xi_i}{\bar{\xi}}. \tag{1.5}$$
To mention an example, take $\xi_1 \sim \mathrm{Gamma}(4, 1)$, which corresponds to weights suggested in Weng (1989) for the bootstrapped sample mean.
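Weights of form (1.5) only require a sampler for the positive i.i.d. $\xi_i$'s; the Gamma(4, 1) choice of Weng (1989) is one instance. A sketch (function name ours):

```python
import numpy as np

def xi_weights(n, rng, draw_xi=None):
    # Delta_ni = xi_i / xi_bar as in (1.5); note sum_i Delta_ni = n,
    # so the corresponding D_ni = Delta_ni / n sum to 1
    if draw_xi is None:
        draw_xi = lambda size: rng.gamma(shape=4.0, scale=1.0, size=size)
    xi = draw_xi(n)
    return xi / xi.mean()
```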
Our main theorem states that, given $X_1, \ldots, X_n$, the distribution function of $n^{1/2}(U_{D_n} - U_n)$, with $U_{D_n}$ defined by (1.4) and (1.5), provides a consistent estimator for the unknown distribution function of $n^{1/2}(U_n - \theta)$, where $\theta = Eh(X_1, X_2)$. The result is
established in Section 2. In Section 3 we will extend our main result to the case of
studentized U-statistics. Finally, we list some possible refinements and extensions in
Section 4.
At this stage we make some remarks. We first note that weights of the form (1.5) do
not include the multinomial weights, corresponding to Efron’s bootstrap. However,
our method of proof can easily be adapted for multinomial weights, resulting in an
alternative proof for the Bickel and Freedman (1981) result for U-statistics. The
advantage of restricting attention to weights of the form (1.5) is that we can present
appealing and direct proofs of the results in Sections 2 and 3 based on simple
probabilistic arguments. For a more detailed discussion we refer to Section 4.
2. Main result
As notation we use $P, E, \ldots$ to denote probability, expectation, $\ldots$ under $F$ and $P_{D_n}, E_{D_n}, \ldots$ to denote the conditional probability, expectation, $\ldots$ given $X_1, \ldots, X_n$. We further define, for an arbitrary distribution function $K$,
$$g(x, K) = \int h(x, y)\,dK(y) - \iint h(x, y)\,dK(x)\,dK(y).$$
Note that $g(x, F) = \int h(x, y)\,dF(y) - \theta$ and define
$$\sigma_g^2 = Eg^2(X_1, F).$$
Finally, define $E\xi_1 = \mu$, $\mathrm{Var}\,\xi_1 = \sigma^2$ and $c = \mu/\sigma$.
Theorem 1. If $\sigma_g^2 > 0$ and $\max\{Eh^2(X_1, X_2), Eh^2(X_1, X_1)\} < \infty$, then, with $P$-probability 1,
$$\sup_{x \in \mathbb{R}} \left| P_{D_n}\{n^{1/2} c\,(U_{D_n} - U_n) \le x\} - P\{n^{1/2}(U_n - \theta) \le x\} \right| \to 0, \quad n \to \infty. \tag{2.1}$$
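Theorem 1 can be illustrated numerically. The sketch below (our construction, not from the paper) takes $h(x, y) = xy$ with $X_i \sim N(1, 1)$, so that $\theta = 1$, $g(x, F) = x - 1$ and $\sigma_g^2 = 1$, and uses exponential $\xi_i$'s, for which $\mu = \sigma = 1$ and hence $c = 1$; the conditional standard deviation of $n^{1/2} c (U_{D_n} - U_n)$ should then be close to $2\sigma_g = 2$.

```python
import numpy as np

rng = np.random.default_rng(42)

def weighted_boot_u(delta, hm):
    # U_Dn of (1.4): n^-2 * sum_{i != j} Delta_i Delta_j h(X_i, X_j)
    n = len(delta)
    w = np.outer(delta, delta)
    np.fill_diagonal(w, 0.0)
    return (w * hm).sum() / n**2

n, B = 200, 500
x = rng.normal(loc=1.0, scale=1.0, size=n)
hm = np.outer(x, x)                          # kernel h(x, y) = x * y
u_n = (hm.sum() - np.trace(hm)) / (n * (n - 1))

z = rng.exponential(size=(B, n))             # xi's: mu = sigma = 1, so c = 1
root_n_diffs = [np.sqrt(n) * (weighted_boot_u(zb / zb.mean(), hm) - u_n)
                for zb in z]
boot_sd = np.std(root_n_diffs)               # should be near 2 * sigma_g = 2
```

In runs of this sketch `boot_sd` comes out near 2, in line with (2.1); the Monte Carlo and finite-sample error mean we make no claim about the exact value.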
To prove this result we use the following decomposition.
Lemma 1. If $\max\{E|h(X_1, X_2)|, E|h(X_1, X_1)|\} < \infty$, then
$$U_{D_n} - U_n = S_{D_n} + R_{D_n}^{(1)} - R_{D_n}^{(2)} - R_{D_n}^{(3)} + R_{D_n}^{(4)}, \tag{2.2}$$
where, with $\psi(x, y, F) = h(x, y) - \theta - g(x, F) - g(y, F)$,
$$S_{D_n} = \frac{2}{n} \sum_{i=1}^n (\Delta_{ni} - 1)\, g(X_i, F), \qquad R_{D_n}^{(1)} = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n (\Delta_{ni}\Delta_{nj} - 1)\, \psi(X_i, X_j, F),$$
$$R_{D_n}^{(2)} = \frac{1}{n^2} \sum_{i=1}^n \Delta_{ni}^2\, h(X_i, X_i), \qquad R_{D_n}^{(3)} = \frac{1}{n^2(n-1)} \sum_{i=1}^n \sum_{j=1}^n h(X_i, X_j), \qquad R_{D_n}^{(4)} = \frac{1}{n(n-1)} \sum_{i=1}^n h(X_i, X_i).$$

Proof. First note that
$$U_{D_n} = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \Delta_{ni}\Delta_{nj}\, h(X_i, X_j) - R_{D_n}^{(2)}, \qquad U_n = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n h(X_i, X_j) + R_{D_n}^{(3)} - R_{D_n}^{(4)}.$$
Now use the fact that $h(X_i, X_j) = \theta + g(X_i, F) + g(X_j, F) + \psi(X_i, X_j, F)$ and that $\sum_{i=1}^n \sum_{j=1}^n D_{ni} D_{nj} = 1$ to obtain (2.2). $\square$
Proof of Theorem 1. To prove the validity of (2.1) we use the decomposition (2.2) and show that for each $\varepsilon > 0$
$$P_{D_n}\{n^{1/2} |R_{D_n}^{(j)}| > \varepsilon\} \to 0 \quad \text{a.s.}\,[P], \tag{2.3}$$
with $j = 1, \ldots, 4$, and that the conditional distribution of $n^{1/2} S_{D_n}$ tends to the appropriate normal limit. To handle $n^{1/2} S_{D_n}$ we use the decomposition
$$n^{1/2} S_{D_n} = n^{1/2} Q_{D_n} + \frac{n^{1/2}(\mu - \bar{\xi})}{\bar{\xi}} \cdot \frac{2}{n\mu} \sum_{i=1}^n \xi_i\, g(X_i, F). \tag{2.4}$$
Note that $n^{1/2}(\mu - \bar{\xi})/\bar{\xi}$ is bounded in probability and that, with $E\xi_1^2 = \eta^2$,
$$E_{D_n}\left[\left(\frac{2}{n\mu} \sum_{i=1}^n \xi_i\, g(X_i, F)\right)^{\!2}\right] = \frac{4}{n^2\mu^2} \left(\eta^2 \sum_{i=1}^n g^2(X_i, F) + \mu^2 \sum_{i \ne j} g(X_i, F)\, g(X_j, F)\right). \tag{2.5}$$
The r.h.s. of (2.5) converges to zero a.s.$[P]$, since by the strong law we have $n^{-1} \sum_{i=1}^n g^2(X_i, F) \to \sigma_g^2$, a.s.$[P]$, and $n^{-2} \sum_{i \ne j} g(X_i, F)\, g(X_j, F) \to 0$, a.s.$[P]$. This implies that
$$n^{1/2} Q_{D_n} = \frac{2}{n^{1/2}\mu} \sum_{i=1}^n (\xi_i - \mu)\, g(X_i, F)$$
is the leading term in (2.4). It is therefore sufficient to prove that, with $P$-probability 1,
$$\sup_{x \in \mathbb{R}} \left| P_{D_n}\{n^{1/2} c\, Q_{D_n} \le x\} - \Phi\!\left(\frac{x}{2\sigma_g}\right) \right| \to 0, \quad n \to \infty. \tag{2.6}$$
In (2.6), $\Phi$ denotes the standard normal distribution.
To prove (2.6) we only have to check the Lindeberg condition for $n^{1/2} c\, Q_{D_n}$. With, for any $\tau > 0$,
$$D_n(\tau) = \frac{1}{n} \sum_{i=1}^n g^2(X_i, F)\, E\!\left[\frac{(\xi_1 - \mu)^2}{\sigma^2}\, 1\{|\xi_1 - \mu|\, |g(X_i, F)| > \tau \sigma n^{1/2}\}\right],$$
where the expectation is w.r.t. the $\xi_i$'s, the Lindeberg condition states that, for any $\tau > 0$,
$$D_n(\tau) \to 0 \quad \text{a.s.}\,[P]. \tag{2.7}$$
To obtain (2.7) note that $Eg^2(X_1, F) < \infty$ implies
$$\max_{1 \le i \le n} \frac{g^2(X_i, F)}{n} \to 0 \quad \text{a.s.}\,[P], \tag{2.8}$$
so that it holds a.s.$[P]$ that for $\varepsilon > 0$ and $n$ sufficiently large
$$D_n(\tau) \le \frac{1}{n} \sum_{i=1}^n g^2(X_i, F)\, E\!\left[\frac{(\xi_1 - \mu)^2}{\sigma^2}\, 1\{|\xi_1 - \mu| > \tau/\varepsilon\}\right]. \tag{2.9}$$
The r.h.s. of (2.9) converges by the strong law a.s.$[P]$ to
$$\sigma_g^2\, E\!\left[\frac{(\xi_1 - \mu)^2}{\sigma^2}\, 1\{|\xi_1 - \mu| > \tau/\varepsilon\}\right],$$
which in turn converges to zero as $\varepsilon$ tends to zero.
In the second part of the proof we show the validity of (2.3) for $j = 1, \ldots, 4$. For $j = 2$ and $j = 4$ this is trivial. For $j = 1$ note that
$$R_{D_n}^{(1)} = R_{D_n}^{(1a)} + R_{D_n}^{(1b)} + R_{D_n}^{(1c)},$$
with (use $\psi(X_i, X_j)$ as shorthand notation for $\psi(X_i, X_j, F)$)
$$R_{D_n}^{(1a)} = \frac{1}{n^2} \sum_{i \ne j} \left(\frac{\xi_i \xi_j}{\mu^2} - 1\right) \psi(X_i, X_j), \qquad R_{D_n}^{(1b)} = \frac{1}{n^2} \sum_{i=1}^n \left(\frac{\xi_i^2}{\mu^2} - 1\right) \psi(X_i, X_i),$$
$$R_{D_n}^{(1c)} = \frac{\mu^2 - \bar{\xi}^2}{\bar{\xi}^2 \mu^2} \cdot \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \xi_i \xi_j\, \psi(X_i, X_j).$$
To show that $n^{1/2} R_{D_n}^{(1a)}$ satisfies (2.3) note that
$$E_{D_n}\!\left[(n^{1/2} R_{D_n}^{(1a)})^2\right] = n^{-3}(T_{n1} + T_{n2}), \tag{2.10}$$
where
$$T_{n1} = E\!\left[\left(\frac{\xi_1 \xi_2}{\mu^2} - 1\right)^{\!2}\right] \sum_{i \ne j} \psi^2(X_i, X_j), \qquad T_{n2} = E\!\left[\left(\frac{\xi_1 \xi_2}{\mu^2} - 1\right)\!\left(\frac{\xi_1 \xi_3}{\mu^2} - 1\right)\right] \sum\nolimits^{*} \psi(X_i, X_j)\, \psi(X_k, X_l),$$
where $\sum^{*}$ denotes summation over all $i \ne j$ and $k \ne l$ satisfying $\#(\{i, j\} \cap \{k, l\}) = 1$. Also note that, since
$$E\!\left[\left(\frac{\xi_1 \xi_2}{\mu^2} - 1\right)\!\left(\frac{\xi_3 \xi_4}{\mu^2} - 1\right)\right] = 0,$$
the quadruple term in $E_{D_n}[(R_{D_n}^{(1a)})^2]$ disappears. Since $E[((\xi_1 \xi_2/\mu^2) - 1)^2] < \infty$ it easily follows by the SLLN that $n^{-3} T_{n1} \to 0$, a.s.$[P]$. Since
$$E\!\left[\left(\frac{\xi_1 \xi_2}{\mu^2} - 1\right)\!\left(\frac{\xi_1 \xi_3}{\mu^2} - 1\right)\right] < \infty$$
and since $E[\psi(X_1, X_2)\psi(X_1, X_3)] = 0$, it easily follows that $n^{-3} T_{n2} \to 0$, a.s.$[P]$.
The proof that $n^{1/2} R_{D_n}^{(1b)}$ satisfies (2.3) follows along the same lines (now using a first absolute moment, so that we do not need more than $E\xi_1^2 < \infty$). To handle $n^{1/2} R_{D_n}^{(1c)}$ note that this term can be factorized as
$$n^{1/2} R_{D_n}^{(1c)} = \frac{n^{1/2}(\mu^2 - \bar{\xi}^2)}{\bar{\xi}^2 \mu^2} \cdot \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \xi_i \xi_j\, \psi(X_i, X_j),$$
where $n^{1/2}(\mu^2 - \bar{\xi}^2)/\bar{\xi}^2 \mu^2$ is bounded in probability; furthermore,
$$\frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \xi_i \xi_j\, \psi(X_i, X_j) = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n (\xi_i \xi_j - \mu^2)\, \psi(X_i, X_j) + \frac{\mu^2}{n^2} \sum_{i=1}^n \sum_{j=1}^n \psi(X_i, X_j). \tag{2.11}$$
The first term in (2.11) equals $\mu^2 (R_{D_n}^{(1a)} + R_{D_n}^{(1b)})$ and therefore has the appropriate behavior. The second term in (2.11) converges a.s.$[P]$ to zero.
The proof of (2.3) for $j = 3$ is similar but simpler and is therefore omitted. $\square$
3. Studentized U-statistics
In this section we apply the generalized bootstrap method to approximate the
distribution function of standardized and studentized U-statistics. Let, with
P. .lanssen/Bootstrapping of U-statistics 31
4cri=4Eg2(X1, F), F”(x) denote the distribution function of the standardized U-
statistic, i.e.
i
u -0 F”(x)=P n”2-<x
20, I (3.1)
An empirical version of 40: is given by
which is the average of the random variables 4g2(Xi, F,), i= 1, . . . , n. Note that a.’ is
essentially equivalent to the jackknife estimator of 40; studied by Callaert and
Veraverbeke (1981) and it can be rewritten as
(3.2)
with
pni=k ,i; h(Xi, xj)
J-1
(3.3)
and
vnzf ,$ i h(Xi, Xj). (3.4) 1-1 j=l
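The estimator (3.2)–(3.4) is straightforward to compute. As a sanity check, for the kernel $h(x, y) = (x + y)/2$ one has $P_{ni} - V_n = (X_i - \bar{X})/2$, so $\sigma_n^2$ reduces to the (biased) sample variance $n^{-1} \sum_{i=1}^n (X_i - \bar{X})^2$. A sketch (function name ours):

```python
import numpy as np

def sigma_n_sq(x, h):
    # (3.2): 4/n * sum_i (P_ni - V_n)^2, with P_ni and V_n as in (3.3)-(3.4)
    x = np.asarray(x, dtype=float)
    hm = np.array([[h(a, b) for b in x] for a in x])
    p = hm.mean(axis=1)       # P_ni = n^-1 sum_j h(X_i, X_j)
    v = hm.mean()             # V_n  = n^-2 sum_{i,j} h(X_i, X_j)
    return 4.0 * np.mean((p - v) ** 2)
```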
The bootstrap estimator of (3.1) is given by
$$F_{D_n}(x) = P_{D_n}\left\{n^{1/2}\, c\, \frac{U_{D_n} - U_n}{\sigma_n} \le x\right\}.$$
Finally, we define the distribution function of the studentized U-statistic:
$$G_n(x) = P\left\{n^{1/2}\, \frac{U_n - \theta}{\sigma_n} \le x\right\},$$
and its bootstrap estimator
$$G_{D_n}(x) = P_{D_n}\left\{n^{1/2}\, c\, \frac{U_{D_n} - U_n}{\sigma_{D_n}} \le x\right\},$$
where $\sigma_{D_n}^2$, the bootstrap version of $\sigma_n^2$, is defined as
$$\sigma_{D_n}^2 = 4 \sum_{i=1}^n D_{ni} (P_{D_n i} - V_{D_n})^2 = 4\left(\sum_{i=1}^n D_{ni} P_{D_n i}^2 - V_{D_n}^2\right), \tag{3.5}$$
with
$$P_{D_n i} = \sum_{j=1}^n D_{nj}\, h(X_i, X_j) \quad \text{and} \quad V_{D_n} = \sum_{i=1}^n \sum_{j=1}^n D_{ni} D_{nj}\, h(X_i, X_j).$$
Theorem 2. Assume $\sigma_g^2 > 0$. If $\max\{Eh^2(X_1, X_2), Eh^2(X_1, X_1)\} < \infty$, then, with $P$-probability 1,
$$\sup_{x \in \mathbb{R}} |F_n(x) - F_{D_n}(x)| \to 0, \quad n \to \infty. \tag{3.6}$$
If $\max\{Eh^4(X_1, X_2), Eh^4(X_1, X_1)\} < \infty$, then, with $P$-probability 1,
$$\sup_{x \in \mathbb{R}} |G_n(x) - G_{D_n}(x)| \to 0, \quad n \to \infty. \tag{3.7}$$
Remark. Note that the bootstrap result for the standardized U-statistic given in (3.6) is of little practical value, since computation of e.g. a bootstrap confidence interval for $\theta$ would require a priori knowledge of $\sigma_g^2$, which is typically not available. Therefore, (3.7) is the result with statistical relevance.
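To make the practical content of (3.7) concrete, here is a sketch (our construction, using Bayesian-bootstrap weights, for which $c = 1$) of a percentile-$t$ confidence interval for $\theta$: the conditional quantiles of $n^{1/2}(U_{D_n} - U_n)/\sigma_{D_n}$ stand in for the unknown quantiles of $n^{1/2}(U_n - \theta)/\sigma_n$.

```python
import numpy as np

def studentized_boot_ci(x, h, B=499, level=0.90, rng=None):
    # percentile-t interval for theta = E h(X1, X2), justified by (3.7);
    # Dirichlet(1, ..., 1) weights = Bayesian bootstrap, so c = 1
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    hm = np.array([[h(a, b) for b in x] for a in x])
    off = ~np.eye(n, dtype=bool)
    u_n = hm[off].mean()                                        # U_n
    s_n = np.sqrt(4.0 * np.mean((hm.mean(axis=1) - hm.mean()) ** 2))  # sigma_n
    t = np.empty(B)
    for b in range(B):
        d = rng.dirichlet(np.ones(n))                 # weights D_n
        u_d = (np.outer(d, d) * hm)[off].sum()        # U_Dn of (1.2)
        p_d = hm @ d                                  # P_Dni
        v_d = d @ p_d                                 # V_Dn
        s_d = np.sqrt(4.0 * (d @ (p_d - v_d) ** 2))   # sigma_Dn of (3.5)
        t[b] = np.sqrt(n) * (u_d - u_n) / s_d
    lo, hi = np.quantile(t, [(1 - level) / 2, (1 + level) / 2])
    return u_n - hi * s_n / np.sqrt(n), u_n - lo * s_n / np.sqrt(n)
```

For $h(x, y) = (x + y)/2$ this is a bootstrap-$t$ interval for the mean; replacing the Dirichlet draw by weights of form (1.5) with $c \ne 1$ would require inserting the factor $c$ in the studentized bootstrap statistic.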
Proof of Theorem 2. From Theorem 1 it is clear that (3.6) follows by a Slutsky argument if we show that
$$\sigma_n^2 \to 4\sigma_g^2 \quad \text{a.s.}\,[P]. \tag{3.8}$$
Similarly, for (3.7) it is sufficient to show that, with $P$-probability 1, for all $\varepsilon > 0$,
$$P_{D_n}\{|\sigma_{D_n}^2 - \sigma_n^2| > \varepsilon\} = P\{|\sigma_{D_n}^2 - \sigma_n^2| > \varepsilon \mid X_1, \ldots, X_n\} \to 0, \quad n \to \infty. \tag{3.9}$$
As shorthand notation for (3.9) we use
$$\sigma_{D_n}^2 - \sigma_n^2 \xrightarrow{P_{D_n}} 0 \quad \text{a.s.}\,[P].$$
The a.s. convergence in (3.8) follows easily by the strong law for von Mises statistics. Indeed, $\sigma_n^2 = W_n - 4V_n^2$, where
$$W_n = \frac{4}{n^3} \sum_{i=1}^n \sum_{j=1}^n \sum_{k=1}^n h(X_i, X_j)\, h(X_i, X_k)$$
and $W_n \to 4E[h(X_1, X_2)h(X_1, X_3)]$, a.s.$[P]$, and $V_n \to \theta$, a.s.$[P]$. Therefore $\sigma_n^2 \to 4\sigma_g^2$, a.s.$[P]$.
To prove (3.9) it is sufficient to show (3.10)–(3.12) below:
$$U_{D_n} - U_n \xrightarrow{P_{D_n}} 0 \quad \text{a.s.}\,[P], \tag{3.10}$$
$$\frac{1}{n} \sum_{i=1}^n (\Delta_{ni} - 1)\, P_{D_n i}^2 \xrightarrow{P_{D_n}} 0 \quad \text{a.s.}\,[P], \tag{3.11}$$
$$\frac{1}{n} \sum_{i=1}^n (P_{D_n i}^2 - P_{ni}^2) \xrightarrow{P_{D_n}} 0 \quad \text{a.s.}\,[P]. \tag{3.12}$$
Proof of (3.10)–(3.12): From (2.2)–(2.5) it is immediate that we only have to establish that
$$\tilde{S}_{D_n} = \frac{2}{n} \sum_{i=1}^n \left(\frac{\xi_i}{\mu} - 1\right) g(X_i, F) \xrightarrow{P_{D_n}} 0 \quad \text{a.s.}\,[P], \tag{3.13}$$
since all other terms in the decomposition are of lower order. Now (3.13) is immediate since (note that the cross-product terms disappear)
$$P_{D_n}\{|\tilde{S}_{D_n}| > \varepsilon\} \le \varepsilon^{-2} E_{D_n}(\tilde{S}_{D_n}^2) = \varepsilon^{-2}\, \frac{4\sigma^2}{n^2\mu^2} \sum_{i=1}^n g^2(X_i, F) \to 0, \quad n \to \infty, \quad \text{a.s.}\,[P].$$
Since (3.11) and (3.12) are proved in a similar way, we only sketch the proof of (3.12). Plug in the definitions of $P_{D_n i}$ and $P_{ni}$ to obtain
$$\frac{1}{n} \sum_{i=1}^n (P_{D_n i}^2 - P_{ni}^2) = B_{n1}(\Delta_n) + B_{n2}(\Delta_n), \tag{3.14}$$
where, for a vector $a_n = (a_{n1}, \ldots, a_{nn})$,
$$B_{n1}(a_n) = \frac{1}{n^3} \sum_{i=1}^n \sum_{j=1}^n (a_{nj}^2 - 1)\, h^2(X_i, X_j), \qquad B_{n2}(a_n) = \frac{1}{n^3} \sum_{i=1}^n \sum_{j \ne k} (a_{nj} a_{nk} - 1)\, h(X_i, X_j)\, h(X_i, X_k).$$
We only handle $B_{n2}(\Delta_n)$ since $B_{n1}(\Delta_n)$ is of lower order. With $\theta_n = (\xi_1/\mu, \ldots, \xi_n/\mu)$, we have
$$B_{n2}(\Delta_n) = B_{n2}(\theta_n) + (B_{n2}(\Delta_n) - B_{n2}(\theta_n)). \tag{3.15}$$
Then one can show that the second term on the r.h.s. of (3.15) is of lower order than $B_{n2}(\theta_n)$ (use arguments similar to those used to handle the second term on the r.h.s. of (2.4)). Now note that
$$P_{D_n}\{|B_{n2}(\theta_n)| > \varepsilon\} \le \varepsilon^{-2} E_{D_n}(B_{n2}^2(\theta_n)).$$
Finally, it can be shown that $n^6 E_{D_n}(B_{n2}^2(\theta_n))$ can be decomposed into a sum of six terms, each term containing at least three and at most six distinct summation indices. The only term in which all six indices are distinct is given by
$$E\!\left[\left(\frac{\xi_1 \xi_2}{\mu^2} - 1\right)\!\left(\frac{\xi_3 \xi_4}{\mu^2} - 1\right)\right] \sum h(X_i, X_j)\, h(X_i, X_k)\, h(X_t, X_s)\, h(X_t, X_r). \tag{3.16}$$
Since the expectation in (3.16) equals zero and since, a.s.$[P]$,
$$n^{-6} \sum h(X_i, X_j)\, h(X_i, X_k)\, h(X_t, X_s)\, h(X_t, X_r) \to (E[h(X_1, X_2)h(X_1, X_3)])^2, \quad n \to \infty,$$
this term disappears. To handle the remaining terms we encounter moment conditions on the kernel $h$, of which the most restrictive one is
$$\max\{Eh^4(X_1, X_2), Eh^4(X_1, X_1)\} < \infty. \quad \square$$
4. Remarks and possible extensions
We already mentioned that the multinomial weights, corresponding to Efron’s
bootstrap scheme, are not of form (1.5) and that our method of proof to establish
Theorem 1 can be adapted for multinomial weights. The required modifications are:
(i) Instead of using a further decomposition for $S_{D_n}$ as in (2.4) in combination with an application of Lindeberg's theorem, handle $S_{D_n}$ by a modification of the rank statistic approach discussed in Mason and Newton (1992) for the standardized mean or by an application of results in Singh (1981).
(ii) Use the Chebyshev inequality and explicit expressions for moments of the
multinomial distribution to prove (2.3).
Also the proof of Theorem 2 can be adapted to include multinomial weights. As
a consequence we can state that, based on this method of proof, we can provide an
alternative proof for the Bickel and Freedman (1981) result for U-statistics.
We also note that a rank statistic approach to study the consistency of bootstrapped U-statistics with general weights, including multinomial weights and weights of the form (1.5), is developed by Hušková and Janssen (1993).
For multinomial weights it is well known that
$$\sup_{x \in \mathbb{R}} |G_n(x) - G_{D_n}(x)| = o(n^{-1/2}) \tag{4.1}$$
(see e.g. Helmers (1991)). An interesting open problem is to establish a result like (4.1)
under sufficient conditions on the weights. A related question is to determine weights
that are optimal, where optimality is measured by looking at higher-order terms in the
asymptotic distributional behavior. It is indeed clear from our asymptotic consistency
results that the different bootstrap procedures are first-order-equivalent. Therefore,
higher-order asymptotics will be required to differentiate among different types of
weights. These problems are quite laborious and are outside the scope of this paper.
Weights of form (1.5) and the even simpler i.i.d. weights will be the typical choice to deal with higher-order asymptotics, as is already illustrated for bootstrapped means in Haeusler et al. (1991). The study of the exchangeably weighted bootstrap of the general empirical process by Praestgaard and Wellner (1993) provides a further example of the usefulness of i.i.d. weights. We finally remark that the condition $Eh^2(X_1, X_1) < \infty$ in Theorem 1 is superfluous (see Dehling and Mikosch (1992) for details).
Acknowledgement
The author thanks the two referees for critical reading of the manuscript.
References
Athreya, K.B., M. Ghosh, L. Low and P.K. Sen (1984). Laws of large numbers for bootstrapped U-statistics. J. Statist. Plann. Inference 9, 185-194.
Bickel, P.J. and D. Freedman (1981). Some asymptotic theory for the bootstrap. Ann. Statist. 9, 1196-1217.
Callaert, H. and N. Veraverbeke (1981). The order of the normal approximation for a studentized U-statistic. Ann. Statist. 9, 194-200.
Dehling, H. and T. Mikosch (1992). Random Quadratic Forms and the Bootstrap for U-statistics. Technical Report.
Haeusler, E., D. Mason and M. Newton (1991). Weighted bootstrapping of means. CWI Quart. 4, 213-228.
Helmers, R. (1991). On the Edgeworth expansion and the bootstrap approximation for a studentized U-statistic. Ann. Statist. 19, 470-484.
Hušková, M. and P. Janssen (1993). Generalized bootstrap for studentized U-statistics: a rank statistic approach. Statist. Probab. Lett. 16, 225-233.
Lo, A.Y. (1987). A large sample study of the Bayesian bootstrap. Ann. Statist. 15, 360-375.
Mason, D.M. and M.A. Newton (1992). A rank statistics approach to the consistency of a general bootstrap. Ann. Statist. 20, 1611-1624.
Praestgaard, J. and J. Wellner (1993). Exchangeably weighted bootstrap of the general empirical process. Ann. Probability (to appear).
Rubin, D. (1981). The Bayesian bootstrap. Ann. Statist. 9, 130-134.
Singh, K. (1981). On the asymptotic accuracy of Efron's bootstrap. Ann. Statist. 9, 1187-1195.
Weng, C.S. (1989). On a second-order asymptotic property of the Bayesian bootstrap mean. Ann. Statist. 17, 705-710.