Neumeyer N. (2009). Smooth residual bootstrap for empirical processes of non-parametric regression residuals


Scandinavian Journal of Statistics, Vol. 36: 204–228, 2009

doi: 10.1111/j.1467-9469.2008.00628.x

© 2009 Board of the Foundation of the Scandinavian Journal of Statistics. Published by Blackwell Publishing Ltd.

Smooth Residual Bootstrap for Empirical Processes of Non-parametric Regression Residuals

    NATALIE NEUMEYER

    Department of Mathematics, University of Hamburg

    ABSTRACT. The aim of this paper is to prove the validity of smooth residual bootstrap versions

    of procedures that are based on the empirical process of residuals estimated from a non-parametric

    regression model. From this result, consistency of various model tests in non-parametric regression

    is deduced, such as goodness-of-fit tests for the regression and variance function, tests for equality

    of regression functions and tests concerning the error distribution.

    Key words: empirical distribution function, goodness-of-fit tests, kernel estimators, resampling

    1. Introduction

For a few decades, non-parametric regression models have been investigated intensively in statistical research. Research has focused mainly on the non-parametric estimation of the regression function and variance function and on corresponding hypothesis tests. In recent years, the estimation of the distribution of the unobserved errors has also received some attention.

Weak convergence of the empirical process based on non-parametrically estimated residuals was shown by Akritas & Van Keilegom (2001). The empirical distribution function of estimated errors recently turned out to be valuable for goodness-of-fit tests concerning the regression function; see for instance Pardo-Fernández et al. (2007) and Van Keilegom et al.

(2008). In the problems mentioned, the asymptotic distribution of estimators and test statistics depends heavily on unknown features such as the error density. In situations like these, resampling procedures such as the bootstrap are often applied to circumvent problems with the accuracy of the critical values. To the author's best knowledge, bootstrap methods in non-parametric regression were first investigated by Härdle & Bowman (1988), who established the (classical) residual bootstrap. Härdle & Mammen (1993) introduced the so-called wild bootstrap (Wu, 1986) in the situation of non-parametric regression.

In the context of residual-based empirical processes, the wild bootstrap method has the

disadvantage that it changes the error distribution, even asymptotically, which is discussed in detail in Neumeyer (2006). Further, the classical residual bootstrap has the disadvantage that, conditional on the original sample, the new bootstrap observations have a discrete error distribution. The methods applied when dealing with residual-based empirical processes heavily rely on a smooth error distribution. It has been observed, and will be demonstrated, that classical bootstrap methods do not work well in this context. In the context of empirical processes of residuals from a linear model this was already observed by Koul & Lahiri (1994), Koul (2002, chapter 6) and De Angelis et al. (1993). In the non-parametric regression context, the smooth residual bootstrap, which will be explained throughout this work, was used in simulations and data analysis by Pardo-Fernández et al. (2007) and Van Keilegom et al. (2008), but a proof of its consistency in such a context has never been given. In the different context of a linear model with fixed design, the validity of smooth residual bootstrap procedures for residual-based empirical processes was shown by Koul & Lahiri (1994) and


    Theorem 1

Assume model (1) is valid under assumptions A.1–A.4. Then the process $R_n(y) = \sqrt{n}\,(\hat F_n(y) - F(y))$, $y \in \mathbb{R}$, converges weakly to a centred Gaussian process $G$ with covariance

$\mathrm{cov}(G(y), G(z)) = F(y \wedge z) - F(y)F(z) + f(y)f(z)\sigma^2 + f(y)U(z) + f(z)U(y)$,

where $U(y) = E[\varepsilon_1 I\{\varepsilon_1 \le y\}]$.

Note that the model considered by Akritas & Van Keilegom (2001) is different from model (1). First of all, their model is heteroscedastic; further, the regression and variance functions are defined as certain $L$-functionals depending on a non-constant score function $J$ on $[0,1]$. Our model corresponds to a homoscedastic version with constant score function $J \equiv 1$ on $[0,1]$. Slight modifications of Akritas & Van Keilegom's (2001) arguments show the validity of theorem 1 under assumptions A.1–A.4, which are less restrictive than in the aforementioned publication [also see the proof of theorem 3.1 by Van Keilegom et al. (2008)].

The covariance in theorem 1 depends heavily on the unknown error distribution $F$ and its density. For the calculation of confidence regions or for testing problems, the use of the limit distribution is difficult because the asymptotic covariance has to be estimated by inserting non-parametric estimators for $F$, $f$, $\sigma^2$ and $U$. The rate of convergence of some of these estimators is usually rather slow, and it is recommended to use a resampling procedure instead of the asymptotic distribution. In section 3 we will discuss the validity of a smooth residual bootstrap version of the stochastic process considered in theorem 1.

    3. Smooth residual bootstrap

Our aim is to construct a bootstrap version of the residual-based empirical process that shares the limiting process of theorem 1 in terms of conditional weak convergence given the initial sample $\mathcal{Y}_n = \{(X_1, Y_1), \ldots, (X_n, Y_n)\}$ from model (1). To this end, we generate bootstrap errors $\varepsilon_i^*$ according to the equality

$\varepsilon_i^* = \tilde\varepsilon_i + a_n Z_i$. (7)

Here $a_n$ denotes a positive smoothing parameter, $Z_1, \ldots, Z_n$ are independent, centred random variables (independent of $\mathcal{Y}_n$) with density $k$, and $\tilde\varepsilon_1, \ldots, \tilde\varepsilon_n$ are randomly drawn with replacement from the centred residuals $\{\check\varepsilon_1, \ldots, \check\varepsilon_n\}$ defined by

$\check\varepsilon_i = \hat\varepsilon_i - \frac{1}{n} \sum_{j=1}^{n} \hat\varepsilon_j$, (8)

with $\hat\varepsilon_i$ from (2). We have for the conditional expectation $E[\varepsilon_i^* \mid \mathcal{Y}_n] = 0$ because of $E[Z_i] = 0$ and the centring of the residuals. Conditional on the original sample $\mathcal{Y}_n$, the random variables $\varepsilon_1^*, \ldots, \varepsilon_n^*$ defined in (7) are independent and identically distributed with distribution function $\hat F_n$ corresponding to the density

$\hat f_n(y) = \frac{1}{n a_n} \sum_{i=1}^{n} k\Bigl(\frac{y - \check\varepsilon_i}{a_n}\Bigr)$. (9)

Note that the smoothness of the distribution of the bootstrap residuals $\varepsilon_1^*, \ldots, \varepsilon_n^*$ is a crucial point in the asymptotic theory. Next, we build new bootstrap observations,

$Y_i^* = \hat m(X_i) + \varepsilon_i^*, \quad i = 1, \ldots, n$, (10)

where the regression estimator $\hat m$ was defined in (3). From the bootstrap sample $(X_1, Y_1^*), \ldots, (X_n, Y_n^*)$, we estimate the regression function by analogy with (3) and denote the estimator by $\hat m^*$. Note that, for the sake of simplicity, we use the same bandwidth $h_n$ as for the estimation of $m$. Residuals from the bootstrap observations are defined by $\hat\varepsilon_i^* = Y_i^* - \hat m^*(X_i)$ and the


empirical distribution function of $\hat\varepsilon_1^*, \ldots, \hat\varepsilon_n^*$ is denoted by $\hat F_n^*$. The smooth residual bootstrap version of the residual empirical process $R_n$ defined in theorem 1 is

$R_n^*(y) = \sqrt{n}\,(\hat F_n^*(y) - \hat F_n(y)) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \bigl(I\{\hat\varepsilon_i^* \le y\} - \hat F_n(y)\bigr), \quad y \in \mathbb{R}$. (11)
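To make the resampling scheme (7)–(11) concrete, the following minimal sketch (assuming NumPy and SciPy) implements one bootstrap draw. It is an illustration only: the Gaussian smoothing kernel $k$, the Nadaraya–Watson estimator standing in for (3) and the bandwidth values are assumptions of this sketch, not prescriptions of the theory above.

```python
import numpy as np
from scipy.stats import norm

def nw_estimator(x, X, Y, h):
    """Nadaraya-Watson estimator with Gaussian kernel (stand-in for (3))."""
    W = np.exp(-0.5 * ((x[:, None] - X[None, :]) / h) ** 2)
    return (W @ Y) / W.sum(axis=1)

def smooth_residual_bootstrap_process(X, Y, h, a_n, rng):
    """One draw of the bootstrap process R_n^*(y) of (11) on a grid."""
    n = len(X)
    m_hat = nw_estimator(X, X, Y, h)              # regression estimate, cf. (3)
    resid = Y - m_hat                             # residuals, cf. (2)
    resid_c = resid - resid.mean()                # centred residuals, cf. (8)
    eps_tilde = rng.choice(resid_c, size=n)       # drawn with replacement
    eps_star = eps_tilde + a_n * rng.standard_normal(n)  # bootstrap errors (7)
    Y_star = m_hat + eps_star                     # bootstrap observations (10)
    m_star = nw_estimator(X, X, Y_star, h)        # estimate from bootstrap data
    resid_star = Y_star - m_star                  # bootstrap residuals
    y = np.linspace(resid_c.min() - 1.0, resid_c.max() + 1.0, 200)
    # empirical distribution function of the bootstrap residuals
    F_star = (resid_star[None, :] <= y[:, None]).mean(axis=1)
    # smooth distribution function of (9), Gaussian k
    F_hat = norm.cdf((y[:, None] - resid_c[None, :]) / a_n).mean(axis=1)
    return y, np.sqrt(n) * (F_star - F_hat)

# usage: uniform design, m(x) = 2x^2, errors with variance 0.5 (cf. section 6)
rng = np.random.default_rng(1)
X = rng.uniform(size=50)
Y = 2 * X**2 + np.sqrt(0.5) * rng.standard_normal(50)
y, R_star = smooth_residual_bootstrap_process(X, Y, (0.5 / 50) ** 0.3, 50 ** -0.25, rng)
```

The bandwidths in the usage line mimic the rule-of-thumb choices of section 6; any choices satisfying the bandwidth conditions would do.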

    For the following main theorem (theorem 2), we need some more assumptions.

A.5 Let $k$ denote a symmetric probability density function, either with support $\mathbb{R}$ or with compact support, such that $\int k(z)\,z\,dz = 0$ and $\int k(z)\,z^2\,dz < \infty$.


i.e. $T_n = \Phi(R_n)$, $T_n^* = \Phi(R_n^*)$. Define $c_{n,\alpha}^*$ by $P(T_n^* \le c_{n,\alpha}^* \mid \mathcal{Y}_n) = \alpha$ and assume that the limit distribution of $T_n$ is continuous in its $\alpha$-quantile. Then we have the following result [Delgado & González-Manteiga (2001)], which shows that the quantiles of the distribution of $T_n$ can be approximated by the conditional quantiles of $T_n^*$.

Corollary 1

Under the assumptions of theorem 2 with the above definitions it holds that $P(T_n \le c_{n,\alpha}^*) = \alpha + o(1)$.
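In practice the conditional quantile $c_{n,\beta}^*$ is approximated from $B$ bootstrap replications of $T_n^*$; a minimal sketch, assuming a user-supplied callable draw_T_star that returns one conditional draw of $T_n^*$ (for instance, the KS functional of (11)):

```python
import numpy as np

def bootstrap_quantile(draw_T_star, B, beta):
    """Approximate c*_{n,beta} with P(T_n^* <= c*_{n,beta} | sample) = beta."""
    t = np.sort([draw_T_star() for _ in range(B)])
    # order statistic corresponding to the beta-quantile of the B draws
    return t[min(int(np.ceil(beta * B)) - 1, B - 1)]
```

For the level-$\alpha$ tests of section 5 one computes the $(1-\alpha)$-quantile, i.e. bootstrap_quantile(draw_T_star, B, 1 - alpha), and rejects $H_0$ whenever $T_n$ exceeds it.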

    4. The heteroscedastic case

    In this section, we address a heteroscedastic regression model

$Y_i = m(X_i) + \sigma(X_i)\,\varepsilon_i, \quad i = 1, \ldots, n$, (12)

under the following assumptions.

A.9 Assumption A.1 is valid and the variance function $\sigma^2$ is twice continuously differentiable in $(0,1)$ such that $\inf_{x \in [0,1]} \sigma^2(x) > 0$.

A.10 Assumption A.2 is valid and $\sup_{y \in \mathbb{R}} |y f(y)| < \infty$.


When $\tilde\varepsilon_1, \ldots, \tilde\varepsilon_n$ are randomly drawn with replacement from the standardized residuals $\check\varepsilon_1, \ldots, \check\varepsilon_n$ of (15), we generate bootstrap errors $\varepsilon_1^*, \ldots, \varepsilon_n^*$ as defined in (7). Those have distribution $\hat F_n$ corresponding to the density $\hat f_n$ defined in (9), are centred, and have the conditional variance $\mathrm{var}(\varepsilon_i^* \mid \mathcal{Y}_n) = 1 + a_n^2 \int k(u)\,u^2\,du$. Bootstrap observations are now, by analogy with (10), defined by

$Y_i^* = \hat m(X_i) + \hat\sigma(X_i)\,\varepsilon_i^*, \quad i = 1, \ldots, n$,

where the variance function estimator $\hat\sigma^2$ is defined in (13). From the new bootstrap sample $(X_i, Y_i^*)$, $i = 1, \ldots, n$, by analogy with (3) and (13) we define estimators $\hat m^*$, $\hat\sigma^{*2}$, and the residuals obtained from the bootstrap sample are [compare (14)]

$\hat\varepsilon_i^* = \frac{Y_i^* - \hat m^*(X_i)}{\hat\sigma^*(X_i)}$. (16)

The empirical distribution function of $\hat\varepsilon_1^*, \ldots, \hat\varepsilon_n^*$ is denoted by $\hat F_n^*$ and the bootstrap version of the empirical process is $R_n^* = \sqrt{n}\,(\hat F_n^* - \hat F_n)$.
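A sketch of the heteroscedastic scheme follows; the kernel smoother of squared residuals used below is only a stand-in for the variance estimator (13), and the explicit centring and scaling is one way to obtain residuals in the spirit of (15).

```python
import numpy as np

def heteroscedastic_bootstrap_sample(X, Y, h, a_n, rng):
    """Bootstrap observations Y*_i = m_hat(X_i) + sigma_hat(X_i) eps*_i."""
    W = np.exp(-0.5 * ((X[:, None] - X[None, :]) / h) ** 2)
    m_hat = (W @ Y) / W.sum(axis=1)
    # kernel smoother of squared residuals, a stand-in for (13)
    s2_hat = (W @ (Y - m_hat) ** 2) / W.sum(axis=1)
    eps = (Y - m_hat) / np.sqrt(s2_hat)        # residuals, compare (14)
    eps = (eps - eps.mean()) / eps.std()       # standardized, compare (15)
    eps_star = rng.choice(eps, size=len(X)) + a_n * rng.standard_normal(len(X))
    return m_hat + np.sqrt(s2_hat) * eps_star  # bootstrap sample
```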

We impose an additional assumption.

A.11 Let $E[|\varepsilon_1|^{2\,\cdots}] < \infty$ …

… $P(T_n^* \le c_{n,1-\alpha}^* \mid \mathcal{Y}_n) = 1 - \alpha$, and we obtain an asymptotic level-$\alpha$ test by rejecting the null hypothesis whenever $T_n > c_{n,1-\alpha}^*$ (under the assumption that the limit distribution of $T_n$ is continuous in its $(1-\alpha)$-quantile; compare corollary 1). This test is also consistent, because under the alternative $T_n$ converges


to infinity in probability, but $T_n^*$ converges, conditional on the sample, in distribution to some random variable (by consistent it is meant in the following that a test has both asymptotic level $\alpha$ and power converging to one).

    5.1. Goodness-of-fit tests for regression functions

Van Keilegom et al. (2008) use the theory of residual-based empirical processes to propose a new goodness-of-fit test in the non-parametric heteroscedastic regression model (12). For some parametric class of functions, $\mathcal{M} = \{m_\theta : [0,1] \to \mathbb{R} \mid \theta \in \Theta\}$, it is of interest in many applications whether the regression function $m$ belongs to that class, and we define the hypotheses as $H_0: m \in \mathcal{M}$ versus $H_1: m \notin \mathcal{M}$. Van Keilegom et al. (2008) suggest measuring the discrepancy between null hypothesis and alternative by considering the difference between the empirical distribution functions of parametrically and non-parametrically estimated residuals, respectively. To be more specific, let $\hat\theta$ denote the least squares estimator, which minimizes $\sum_{i=1}^{n} (Y_i - m_\theta(X_i))^2$ and estimates the value $\tilde\theta_0$ that minimizes $E[(m(X_1) - m_\theta(X_1))^2]$. Under $H_0$, $\tilde\theta_0$ coincides with the true parameter $\theta_0$. Theorem 2.1 in Van Keilegom et al. (2008) shows that the null hypothesis $H_0$ holds if and only if the error $\varepsilon_1$ has the same distribution as $\varepsilon_0 = (Y_1 - m_{\tilde\theta_0}(X_1))/\sigma(X_1)$. Therefore, a consistent test can be based on the difference of the residual-based empirical distribution function $\hat F_n$ and $\hat F_{n0}$, the empirical distribution function of parametric residuals $\hat\varepsilon_{01}, \ldots, \hat\varepsilon_{0n}$ defined as $\hat\varepsilon_{0i} = (Y_i - \hat m_0(X_i))/\hat\sigma(X_i)$ with the non-parametric variance estimator $\hat\sigma^2$ defined in (13), and where $\hat m_0$ denotes a smoothed version of $m_{\hat\theta}$, analogous to Härdle & Mammen (1993). Under the heteroscedastic non-parametric regression

model (12) with our assumptions A.3, A.4, A.9, A.10 and some additional regularity conditions, Van Keilegom et al. (2008) prove that, under the null hypothesis $H_0$, the process $\sqrt{n}\,(\hat F_n(y) - \hat F_{n0}(y))$, $y \in \mathbb{R}$, converges weakly to $f(y)W$, where $W$ is a centred normal random variable with variance

$\mathrm{var}(W) = E\Biggl[\biggl(\int_0^1 \frac{f_X(x)}{\sigma(x)}\, \frac{\partial m_\theta(x)}{\partial \theta^{\mathsf{T}}}\Big|_{\theta = \theta_0}\,dx\;\Omega^{-1}\, \frac{\partial m_\theta(X_1)}{\partial \theta}\Big|_{\theta = \theta_0}\, \sigma(X_1)\,\varepsilon_1\biggr)^2\Biggr]$, (17)

and

$\Omega = E\Biggl[\frac{\partial m_\theta(X_1)}{\partial \theta}\Big|_{\theta = \theta_0}\, \frac{\partial m_\theta(X_1)}{\partial \theta^{\mathsf{T}}}\Big|_{\theta = \theta_0}\Biggr]$.

As test statistics for testing for a parametric form of the regression function, a Kolmogorov–Smirnov (KS) and a Cramér–von Mises (CvM) functional of $\sqrt{n}\,(\hat F_n - \hat F_{n0})$ are proposed. Because the limit distribution depends on unknown features, the aforementioned authors suggest using a smooth residual bootstrap explained below, and show in a simulation study that the new tests are at least competitive with the methods proposed by Härdle & Mammen (1993) and Stute (1997). However, a theoretical justification for the consistency of the proposed resampling procedure is not given, and we will now investigate how the theoretical results obtained in section 4 of the present work are applicable in this context.

We build bootstrap observations $(X_i, Y_i^*)$, $i = 1, \ldots, n$, similarly to the heteroscedastic smooth residual bootstrap as defined in section 4. As explained in the introduction of the current section, these observations should fulfil the null hypothesis of a parametric regression function $m_\theta$ for some $\theta$; therefore they are defined as $Y_i^* = m_{\hat\theta}(X_i) + \hat\sigma(X_i)\,\varepsilon_i^*$, where $\varepsilon_i^*$ has density $\hat f_n$ defined in (9), but based on the standardized residuals given in (15). Let $\hat F_n^*$ and $\hat F_{n0}^*$ be defined in the same way as $\hat F_n$ and $\hat F_{n0}$, but based on the bootstrap sample $(X_i, Y_i^*)$, $i = 1, \ldots, n$.


With a very similar argumentation as given in the Appendix we can prove the next theorem. Details of the proof are omitted for the sake of brevity, but can be found in Neumeyer (2006).

Theorem 5

Assume model (12) is valid under the assumptions of theorem 4. We further assume that assumptions (A.1)–(A.6) by Van Keilegom et al. (2008) are valid when replacing $\theta_0$ by $\tilde\theta_0$. Then, conditional on the sample $\mathcal{Y}_n$, the process $\sqrt{n}\,(\hat F_n^*(y) - \hat F_{n0}^*(y))$, $y \in \mathbb{R}$, converges weakly to $f(y)W$, in probability, where $W$ is a centred normal random variable with variance defined in (17).

Note that the assumptions by Van Keilegom et al. (2008) were only formulated under the null hypothesis, i.e. for $\tilde\theta_0 = \theta_0$. Theorem 5 is valid both under the null hypothesis and under fixed alternatives. This means that a bootstrap version of, for example, a KS-type test statistic as explained in the introduction of this section always approximates the $(1-\alpha)$-quantile of the test under $H_0$. Hence, when for the observed data $H_0$ is valid, the probability of rejection will be approximately $\alpha$. But when for the observed data the alternative is valid, with increasing $n$ the test statistic will increase to infinity, whereas the bootstrap quantile converges to a fixed value, such that the test will reject with high probability. Theorem 5 shows that the bootstrap procedure suggested by Van Keilegom et al. (2008) to test for a parametric form of a regression function is consistent.

    5.2. Testing for a parametric variance function

In the present work, we have considered homoscedastic as well as heteroscedastic regression models. Testing for heteroscedasticity has attracted some attention in the non-parametric regression literature, and recently Dette et al. (2007) proposed a test for a parametric form of the variance function, which is similar in spirit to the goodness-of-fit tests by Van Keilegom et al. (2008) discussed before. The methods of the present paper can be applied analogously to obtain consistency of the bootstrap version of Dette et al.'s (2007) test.

    5.3. Two-sample problems

In this section, suppose we have two independent non-parametric regression models ($i = 1, 2$) of type (12),

$Y_{ij} = m_i(X_{ij}) + \sigma_i(X_{ij})\,\varepsilon_{ij}, \quad j = 1, \ldots, n_i$,

which are to be compared. Tests for equality of the two regression functions $m_1$ and $m_2$, with hypotheses $H_0: m_1 = m_2$ versus $H_1: m_1 \ne m_2$, have attracted a lot of attention during the last couple of decades. Recently, Pardo-Fernández et al. (2007) proposed a test based on the difference of empirical distribution functions of residuals estimated under the null hypothesis and under the alternative, respectively. Here, under $H_0$ a kernel regression estimator $\hat m_0$ based on all observations $(X_{ij}, Y_{ij})$, $j = 1, \ldots, n_i$, $i = 1, 2$, is applied. The weak convergence of the process is shown and local alternatives are discussed. In a simulation study, these authors use a smooth residual bootstrap. It is important to note that the new bootstrap observations have to fulfil the null hypothesis; therefore the pooled estimator $\hat m_0$ of the regression function is used to define bootstrap samples by

$Y_{ij}^* = \hat m_0(X_{ij}) + \hat\sigma_i(X_{ij})\,\varepsilon_{ij}^*, \quad j = 1, \ldots, n_i$,

where for each $i \in \{1, 2\}$, $\varepsilon_{i1}^*, \ldots, \varepsilon_{in_i}^*$ denotes a random sample generated from a kernel density estimator based on residuals $\tilde\varepsilon_{i1}, \ldots, \tilde\varepsilon_{in_i}$ [standardized versions of $\hat\varepsilon_{ij} = (Y_{ij} - \hat m_i(X_{ij}))/\hat\sigma_i(X_{ij})$,


analogous to (15)]. From the new samples $(X_{ij}, Y_{ij}^*)$, test statistics are built in the same way as from the initial samples. With the methods of the present work we can give the theoretical justification for the consistency of Pardo-Fernández et al.'s (2007) procedure.
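A sketch of the pooled-estimator resampling, under a homoscedastic simplification (no $\hat\sigma_i$) and with plain centring of the residuals standing in for the standardization of (15):

```python
import numpy as np

def two_sample_bootstrap(X1, Y1, X2, Y2, h, a_n, rng):
    """Bootstrap samples fulfilling H0: m1 = m2 via the pooled estimator m0."""
    Xp, Yp = np.concatenate([X1, X2]), np.concatenate([Y1, Y2])
    def nw(xg, X, Y):  # Nadaraya-Watson estimator, Gaussian kernel
        W = np.exp(-0.5 * ((xg[:, None] - X[None, :]) / h) ** 2)
        return (W @ Y) / W.sum(axis=1)
    out = []
    for X, Y in ((X1, Y1), (X2, Y2)):
        resid = Y - nw(X, X, Y)          # residuals from the own sample
        resid = resid - resid.mean()     # centred, in the spirit of (15)
        eps = rng.choice(resid, len(X)) + a_n * rng.standard_normal(len(X))
        out.append(nw(X, Xp, Yp) + eps)  # Y*_ij = m0(X_ij) + eps*_ij
    return out  # [Y1_star, Y2_star]
```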

Speckman et al. (2001) suggested a test for equality of regression functions against one-sided alternatives $H_1: m_1 > m_2$ based on ranking the residuals, which was further investigated by Neumeyer & Dette (2005). The latter authors give the asymptotic distribution of the test statistic and also prove the consistency of a symmetric wild bootstrap under the additional restrictive assumption of a symmetric error distribution. Stronger results, which show that the smooth residual bootstrap is consistent without the symmetry assumption on the error distribution, can be deduced similarly to the methods in the present work.

A test for equality of the error distributions in two regression models can be based on KS or CvM statistics built from the difference of the empirical distribution functions of the residuals in the two samples. A smooth residual bootstrap version of such a test was investigated by Mora & Neumeyer (2005) for parametric regression models. Here the bootstrap observations are built under the null hypothesis of equal error distributions and are generated from a smooth distribution estimated using all residuals from both samples. For non-parametric regression models, similar results can be obtained from the methods explained in the present work; compare Pardo-Fernández (2007) and Neumeyer (2006).

    6. Finite sample performance

In this section, we present some simulation results. In particular, we investigate how sensitive the smooth residual bootstrap procedures are with respect to the choice of the bandwidth $a_n$. We approximate the $(1-\alpha)$-quantile $c_{1-\alpha}$ of the distribution of the statistic $T_n = \Phi(\sqrt{n}\,(\hat F_n - F))$ for the functionals

$\Phi(Z) = \sup_{y \in \mathbb{R}} |Z(y)|$ and $\Phi(Z) = \int Z^2\,d\hat F_n$,

corresponding to KS and CvM statistics, respectively, by $c_{n,1-\alpha}^*$, the $(1-\alpha)B$-th order statistic of the bootstrap statistics $T_n^{*,1}, \ldots, T_n^{*,B}$; compare Beran et al. (1987) for the consistency of this method. Here the bootstrap statistic is $T_n^* = \Phi(\sqrt{n}\,(\hat F_n^* - \hat F_n))$. It is counted how often the original value $T_n$ is larger than $c_{n,1-\alpha}^*$, which should happen in approximately $\alpha \cdot 100\%$ of the runs. We

display results for $\alpha \in \{0.025, 0.05, 0.1\}$. For the estimation of the regression function, we use a Nadaraya–Watson estimator with Gaussian kernel and a simple rule-of-thumb bandwidth, where we set $h_n = (s^2/n)^{0.3}$. The rate corresponds to the bandwidth conditions A.4. Here, $s^2$ denotes a simple estimator for the variance (or, in the heteroscedastic model, the integrated variance function); see Rice (1984a). Note that we have also employed a local linear estimator with bandwidth chosen by plug-in methodology as described by Ruppert et al. (1995). Those results are not displayed, because we observed that the values obtained from the two methods of estimating $m$ are quite similar. For the bandwidth $a_n$ used for smoothing the bootstrap errors, we present results for the choices $n^{-1/4}/2$ and $n^{-1/4}$. We estimate the quantiles from $B = 200$ bootstrap replications in each of 1000 simulation runs.
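The bandwidth rules of this section can be written compactly; the difference-based variance estimator below is one common version of Rice's (1984a) estimator and is an assumption on our part:

```python
import numpy as np

def rule_of_thumb_bandwidths(X, Y):
    """h_n = (s^2/n)^0.3 and a_n = n^(-1/4), as in the simulation study."""
    n = len(X)
    d = np.diff(Y[np.argsort(X)])          # first differences along the design
    s2 = (d ** 2).sum() / (2 * (n - 1))    # difference-based variance estimate
    return (s2 / n) ** 0.3, n ** (-0.25)
```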

We consider various settings corresponding to models (1) and (12) with constant or non-constant variance, with different regression functions, error distributions and sample sizes. The first three settings are according to model (1) with uniform design, $N(0, 0.5)$ distributed errors and a rather small sample size, i.e. $n = 50$, where we consider different regression functions: (i) $m(x) = 1 - 2x$, (ii) $m(x) = e^x$, and (iii) $m(x) = 2x^2$. The results are displayed in Table 1 for the CvM and KS statistics.

We have also implemented the classical residual bootstrap, i.e. with smoothing parameter $a_n = 0$ (where $\hat F_n$ is then the empirical distribution function of the centred residuals), to demonstrate


Table 1. Approximate level of the (a) Cramér–von Mises and (b) Kolmogorov–Smirnov test with asymptotic level α in the homoscedastic models (i)–(iii). The sample size is n = 50 and the values for the bandwidth a_n are approximately 0.188, 0.376 and 0, respectively

                        Model (i)               Model (ii)              Model (iii)
Test  a_n \ α       0.025  0.05   0.1       0.025  0.05   0.1       0.025  0.05   0.1
(a)   n^{-1/4}/2    0.022  0.051  0.101     0.017  0.054  0.095     0.025  0.054  0.108
      n^{-1/4}      0.025  0.046  0.101     0.025  0.057  0.105     0.027  0.057  0.111
      0             0.002  0.007  0.028     0.007  0.015  0.044     0.010  0.015  0.033
(b)   n^{-1/4}/2    0.021  0.040  0.091     0.014  0.029  0.066     0.016  0.037  0.070
      n^{-1/4}      0.032  0.049  0.112     0.022  0.043  0.105     0.018  0.039  0.093
      0             0.002  0.006  0.016     0.000  0.004  0.018     0.000  0.007  0.011

Table 2. Approximate level of the (a) Cramér–von Mises and (b) Kolmogorov–Smirnov test with asymptotic level α in the homoscedastic models (iv)–(vi). The sample size is n = 100 and the values for the smoothing parameter a_n are approximately 0.158, 0.316 and 0, respectively

                        Model (iv)              Model (v)               Model (vi)
Test  a_n \ α       0.025  0.05   0.1       0.025  0.05   0.1       0.025  0.05   0.1
(a)   n^{-1/4}/2    0.025  0.054  0.108     0.047  0.078  0.123     0.026  0.059  0.100
      n^{-1/4}      0.020  0.049  0.096     0.051  0.094  0.149     0.026  0.050  0.108
      0             0.007  0.017  0.062     0.029  0.049  0.089     0.016  0.032  0.057
(b)   n^{-1/4}/2    0.008  0.033  0.073     0.027  0.051  0.087     0.016  0.030  0.065
      n^{-1/4}      0.021  0.056  0.127     0.036  0.059  0.113     0.022  0.040  0.090
      0             0.002  0.006  0.028     0.016  0.022  0.039     0.003  0.008  0.019

that this method should not be applied in the context of residual-based empirical distribution functions. Results are presented in Table 1 (and the following tables) and show in all cases that the quantiles are strongly underestimated by the classical residual bootstrap.

In settings (iv)–(vi) we consider different error distributions: (iv) Student's $t$-distribution with three degrees of freedom, (v) a centred double exponential distribution with variance 0.5, and (vi) a centred skew-normal distribution with scale parameter 1 and shape parameter 1.5. The distributions are scaled such that the variance is 0.5. The regression functions are $m(x) = x$ for (iv) and (v) and $m(x) = 2x^2$ for (vi). The sample size is $n = 100$. In Table 2 we display the results for settings (iv)–(vi). Here, as well as in Table 1 and in simulations not displayed, we observe that the KS statistic reacts somewhat more sensitively to too small choices of the bootstrap smoothing parameter $a_n$ than the CvM statistic. All in all, we observe good approximations of the quantiles in settings (i)–(iv) and (vi), whereas for the double exponential distribution (v) the quantiles are overestimated when applying the CvM statistic. The cases $a_n = 0$ in Table 2 again demonstrate that the classical residual bootstrap gives too low approximations of the quantiles [a notable exception is the CvM statistic in setting (v)].

Moreover, we give a few results for the heteroscedastic model (12) according to the settings (vii)–(ix) defined below. Note that here the variances are larger than in the homoscedastic settings. In simulations not presented here we have observed that the heteroscedastic procedure reacts more sensitively to the choice of the smoothing parameter $a_n$, and smaller values of $a_n$ lead to more accurate results. We display the performance for $a_n = n^{-1/4}/2$ and $n^{-1/4}$ in Table 3. The models chosen have standard normal errors and regression function $m(x) = 2x^2$, where in (vii) the procedures according to the heteroscedastic model (12) are applied, but the variance function is constant, $\sigma^2 \equiv 1$, and $n = 50$. Model (viii) has variance function $\sigma^2(x) = e^x/2$ and $n = 100$; model (ix) has $\sigma^2(x) = (1 + x)^2/2$ and $n = 100$.


Table 3. Approximate level of the (a) Cramér–von Mises and (b) Kolmogorov–Smirnov test with asymptotic level α in the heteroscedastic models (vii)–(ix). The sample size in (vii) is n = 50 with values 0.188, 0.376 and 0 for a_n; the sample size in (viii) and (ix) is n = 100 with values 0.158, 0.316 and 0 for a_n, respectively

                        Model (vii)             Model (viii)            Model (ix)
Test  a_n \ α       0.025  0.05   0.1       0.025  0.05   0.1       0.025  0.05   0.1
(a)   n^{-1/4}/2    0.016  0.043  0.104     0.021  0.047  0.089     0.020  0.044  0.085
      n^{-1/4}      0.024  0.051  0.086     0.030  0.052  0.109     0.032  0.060  0.099
      0             0.001  0.005  0.020     0.008  0.020  0.041     0.007  0.018  0.042
(b)   n^{-1/4}/2    0.028  0.043  0.074     0.025  0.049  0.093     0.021  0.051  0.101
      n^{-1/4}      0.020  0.045  0.104     0.043  0.071  0.113     0.033  0.058  0.101
      0             0.001  0.003  0.012     0.005  0.012  0.027     0.001  0.009  0.023

Overall, for the considered cases (i)–(ix) we can draw the conclusion that the performance of the smooth residual bootstrap in the context of residual-based empirical processes does not depend very heavily on the choice of the small positive smoothing parameter $a_n$, but $a_n = 0$ (the classical residual bootstrap) should not be used.

For simulations of the finite sample properties of the smooth residual bootstrap for testing goodness-of-fit of the regression and variance functions and testing the equality of regression functions as explained in section 5, we refer to Dette et al. (2007), Pardo-Fernández et al. (2007) and Van Keilegom et al. (2008). In all these publications good approximations of the test level and good detection of alternatives were observed. Simulations for tests for equality of error distributions and for symmetry of error distributions can be found in Neumeyer (2006) but are not displayed here for the sake of brevity.

    7. Concluding remarks

Asymptotic results on the estimation of the distribution of the unobserved errors in non-parametric regression models have existed for only a few years. Weak convergence of the empirical process based on non-parametrically estimated residuals was shown by Akritas & Van Keilegom (2001), and recently those results have been successfully applied to model specification tests by Dette et al. (2007), Pardo-Fernández (2007), Pardo-Fernández et al. (2007) and Van Keilegom et al. (2008), among others. In all the problems mentioned, the asymptotic distribution of the test statistics depends on unknown features of the data-generating process such as the error density, and hence applying bootstrap versions of the tests may be of advantage. It turned out that, when using the empirical process based on residuals, a smooth residual bootstrap should be used, as smoothness of the error distribution is an essential assumption needed to obtain weak convergence of the empirical processes (see remark 1 in Appendix A). In the paper at hand, we presented the asymptotic validity of the smooth residual bootstrap for procedures based on the empirical process of residuals and discussed its applicability to several testing problems.

Other interesting results presented (see Appendix A) are the uniform consistency of the derivative of the residual-based kernel density estimator considered by Cheng (2004), and the uniform consistency of the kernel regression estimator, as well as its derivative, based on a bootstrap sample obtained from the smooth residual bootstrap.

With an application of the results presented here, it is shown in the subsequent work by Neumeyer (2008) that, when using a smooth version of the residual-based empirical process, the smoothness of the bootstrap error distribution is not necessary and a classical residual


bootstrap can be applied. In the mentioned paper, simulations are also presented that compare the power of different bootstrap methods in the context of the specification tests considered here.

Akritas & Van Keilegom's (2001) results were only presented for univariate covariates, and hence so were the results here. A generalization of Akritas & Van Keilegom's (2001) results was recently developed by Neumeyer & Van Keilegom (2008). Although their proof is not a straightforward generalization of the proof in the one-dimensional case, those results suggest that the validity of the bootstrap theorems given here can be obtained as well for the case of multivariate covariates.

    Acknowledgements

This paper contains results of the author's Habilitationsschrift [Neumeyer (2006)]. She would like to thank Holger Dette for his great support and encouragement over the years. A major part of the Habilitationsschrift was written while the author visited the Australian National University, and she is very grateful to the members of the Mathematical Sciences Institute, especially Peter Hall, for their great hospitality. The visit was supported by a research grant of the Deutsche Forschungsgemeinschaft. The author further would like to thank two anonymous referees for their constructive comments on an earlier version of this manuscript.

    References

Akritas, M. & Van Keilegom, I. (2001). Non-parametric estimation of the residual distribution. Scand. J. Statist. 28, 549–567.

Bai, J. (1994). Weak convergence of the sequential empirical processes of residuals in ARMA models. Ann. Statist. 22, 2051–2061.

Beran, R. J., Le Cam, L. & Millar, P. W. (1987). Convergence of stochastic empirical measures. J. Multivariate Anal. 23, 159–168.

Boldin, M. V. (1982). Estimation of the distribution of noise in an autoregression scheme. Theory Probab. Appl. 27, 866–871.

Cheng, F. (2002). Consistency of error density and distribution function estimation in nonparametric regression. Statist. Probab. Lett. 59, 257–270.

Cheng, F. (2004). Weak and strong uniform consistency of a kernel error density estimator in nonparametric regression. J. Statist. Plann. Inference 119, 95–107.

De Angelis, D., Hall, P. & Young, G. A. (1993). Analytical and bootstrap approximations to estimator distributions in L1 regression. J. Amer. Statist. Assoc. 88, 1310–1316.

Delgado, M. A. & González-Manteiga, W. (2001). Significance testing in nonparametric regression based on the bootstrap. Ann. Statist. 29, 1469–1507.

Dette, H., Neumeyer, N. & Van Keilegom, I. (2007). A new test for the parametric form of the variance function in nonparametric regression. J. Roy. Statist. Soc. Ser. B Statist. Methodol. 69, 903–917.

Durbin, J. (1973). Weak convergence of the sample distribution function when parameters are estimated. Ann. Statist. 1, 279–290.

Härdle, W. & Bowman, A. W. (1988). Bootstrapping in nonparametric regression: local adaptive smoothing and confidence bands. J. Amer. Statist. Assoc. 83, 102–110.

Härdle, W. & Mammen, E. (1993). Comparing nonparametric versus parametric regression fits. Ann. Statist. 21, 1926–1947.

Karunamuni, R. J. & Zhang, S. (2007). Some improvement on a boundary corrected kernel density estimator. Statist. Probab. Lett. 78, 499–507.

Koul, H. L. (1970). Some convergence theorems for ranks and weighted empirical cumulatives. Ann. Math. Statist. 41, 1768–1773.

Koul, H. L. (2002). Weighted empirical processes in dynamic nonlinear models, 2nd edn. Springer, New York.

Koul, H. L. & Lahiri, S. N. (1994). On bootstrapping M-estimated residual processes in multiple linear regression models. J. Multivariate Anal. 49, 255–265.


Koul, H. L. & Levental, S. (1989). Weak convergence of the residual empirical process in explosive autoregression. Ann. Statist. 17, 1784–1794.

Loynes, R. M. (1980). The empirical distribution function of residuals from generalized regression. Ann. Statist. 8, 285–298.

Mammen, E. (1996). Empirical process of residuals for high-dimensional linear models. Ann. Statist. 24, 307–335.

Mora, J. & Neumeyer, N. (2005). The two-sample problem with regression errors: an empirical process approach. Preprint. http://www.math.uni-hamburg.de/home/neumeyer/paperjm.pdf

Müller, H.-G. (1984). Boundary effects in nonparametric curve estimation. Compstat 1984, Physica Verlag, Heidelberg, 84–89.

Müller, U., Schick, A. & Wefelmeyer, W. (2007). Estimating the error distribution function in semiparametric regression. Statist. Decisions 25, 1–18.

Neumeyer, N. (2006). Bootstrap procedures for empirical processes of nonparametric residuals. Habilitationsschrift, Ruhr-Universität Bochum. http://www.math.uni-hamburg.de/home/neumeyer/habil.ps

Neumeyer, N. (2008). A bootstrap version of the residual-based smooth empirical distribution function. J. Nonparametr. Statist. 20, 153–174.

Neumeyer, N. & Dette, H. (2005). A note on one-sided nonparametric analysis of covariance by ranking residuals. Math. Methods Statist. 14, 80–104.

Neumeyer, N. & Van Keilegom, I. (2008). Estimating the error distribution in nonparametric multiple regression with applications to model testing. Preprint. http://www.math.uni-hamburg.de/research/ims.html

Pardo-Fernández, J. C. (2007). Comparison of error distributions in nonparametric regression. Statist. Probab. Lett. 77, 350–356.

Pardo-Fernández, J. C., Van Keilegom, I. & González-Manteiga, W. (2007). Testing for the equality of k regression curves. Statist. Sinica 17, 1115–1137.

Pollard, D. (1984). Convergence of stochastic processes. Springer, New York.

Portnoy, S. (1986). Asymptotic behavior of the empiric distribution of M-estimated residuals from a regression model with many parameters. Ann. Statist. 14, 1152–1170.

Rice, J. (1984a). Bandwidth choice for nonparametric regression. Ann. Statist. 12, 1215–1230.

Rice, J. (1984b). Boundary modification for kernel regression. Comm. Statist. Theory Methods 13, 893–900.

Ruppert, D., Sheather, S. J. & Wand, M. P. (1995). An effective bandwidth selector for local least squares regression. J. Amer. Statist. Assoc. 90, 1257–1270.

Schick, A. & Wefelmeyer, W. (2002). Estimating the innovation distribution in nonlinear autoregressive models. Ann. Inst. Statist. Math. 54, 245–260.

Speckman, P. L., Chiu, J.-E., Hewett, J. E. & Bertelson, S. E. (2001). A one-sided test adjusting for covariates by ranking residuals following smoothing. Technical report. http://www.stat.missouri.edu/speckman/pub.html

Stute, W. (1997). Nonparametric model checks for regression. Ann. Statist. 25, 613–641.

van der Vaart, A. W. (1998). Asymptotic statistics. Cambridge University Press, Cambridge.

van der Vaart, A. W. & Wellner, J. A. (1996). Weak convergence and empirical processes. Springer, New York.

Van Keilegom, I., González-Manteiga, W. & Sánchez Sellero, C. (2008). Goodness-of-fit tests in parametric regression based on the estimation of the error distribution. Test 17, 401–415.

Wu, C.-F. J. (1986). Jackknife, bootstrap and other resampling methods in regression analysis. Ann. Statist. 14, 1261–1350.

Received November 2007, in final form July 2008

Natalie Neumeyer, Center of Mathematical Statistics and Stochastic Processes, Department of Mathematics, University of Hamburg, Bundesstrasse 55, 20146 Hamburg, Germany.

E-mail: [email protected]

    Appendix A: Auxiliary results

    For the sake of brevity, some proofs are omitted and some are shown in rather condensed

    form. However, all proofs can be found in full detail in Neumeyer (2006).


Different presentations of the process defined in (11), like

$R_n^*(y) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \bigl(I\{\varepsilon_i^* \le y + \hat m^*(X_i) - \hat m(X_i)\} - \hat F_n(y)\bigr) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \bigl(I\{U_i \le \hat F_n(y + \hat m^*(X_i) - \hat m(X_i))\} - \hat F_n(y)\bigr)$, (18)

will turn out to be useful in the proof of theorem 2. Here, we use the representation $\varepsilon_i^* = \hat F_n^{-1}(U_i)$, where $U_1, \ldots, U_n \sim U[0,1]$ are independent and uniformly distributed random variables on $[0,1]$, independent of the sample $\mathcal{Y}_n$. It should be mentioned that by the definition $U_i = U_{n,i} = \hat F_n(\varepsilon_i^*)$ the $U_{n,1}, \ldots, U_{n,n}$ depend on $\mathcal{Y}_n$ and form a triangular array, because with each $n$ the whole tuple changes. However, it is easy to see that for each $n$ the tuple $(U_{n,1}, \ldots, U_{n,n})$ has the same distribution as $(U_1, \ldots, U_n)$, where $U_1, U_2, \ldots$ denotes a sequence of independent and identically $U[0,1]$-distributed random variables. In particular, the distribution does not depend on $\mathcal{Y}_n$. In results and proofs where we are only interested in convergence in distribution we can therefore use the representation (18).
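The identity behind this representation, namely that $\varepsilon_i^*$ from (7) has distribution function $\hat F_n$, can be checked numerically; in the following sketch the grid-based inversion of $\hat F_n$ is only a numerical shortcut, and the synthetic residuals are assumptions of the example:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
resid_c = rng.standard_normal(200)   # stand-in for centred residuals (8)
a_n = 0.3
# route 1: resample residuals and add smoothing noise, as in (7)
e1 = rng.choice(resid_c, 10000) + a_n * rng.standard_normal(10000)
# route 2: invert the smooth distribution function from (9) (Gaussian k)
grid = np.linspace(-6, 6, 2001)
F_hat = norm.cdf((grid[None, :] - resid_c[:, None]) / a_n).mean(axis=0)
e2 = np.interp(rng.uniform(size=10000), F_hat, grid)
# the two samples should have nearly identical quantiles
print(np.quantile(e1, [0.1, 0.5, 0.9]), np.quantile(e2, [0.1, 0.5, 0.9]))
```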

In the next remark, we explain the main arguments of the proof.

Remark 1. Note that the stochastic process $R_n^*$ in (18) is not an empirical process, as the summands are dependent. Not only do the random functions $\hat m$ and $\hat F_n$ depend on the whole original sample $\mathcal{Y}_n$, the function $\hat m^*$ also depends on the bootstrap sample $\{(X_i, Y_i^*) \mid i = 1, \ldots, n\}$. To be able to apply asymptotic theory for empirical processes, we will replace the random function $\hat F_n$ by a fixed function $H \in \mathcal{H}$ and the random function $\hat m^* - \hat m$ by a fixed function $g \in \mathcal{G}$, and consider the process $R_n^*$ as an empirical process indexed in $y \in \mathbb{R}$, $H \in \mathcal{H}$, $g \in \mathcal{G}$. Here, on the one hand, the function classes $\mathcal{H}$ and $\mathcal{G}$ need to be defined suitably such that they are large enough to contain $\hat F_n$ and $\hat m^* - \hat m$, respectively, with probability converging to one. To show this, the auxiliary results in sections A.1 and A.2 are needed (and the function $\hat F_n$ has to be smooth, i.e. the residual bootstrap has to be smooth).

The process $R_n^*$ then has the structure of an empirical process indexed by functions of the form $(x, u) \mapsto I\{u \le H(y + g(x))\}$, $y \in \mathbb{R}$, $H \in \mathcal{H}$, $g \in \mathcal{G}$. Hence, on the other hand, the covering numbers of the function classes $\mathcal{H}$ and $\mathcal{G}$ need to be small enough to prove that the process is Donsker. Theory for this is provided in section A.3.

With these results we are able to prove the asymptotic expansion of the process $R_n^*$ as given in lemma 1(i) by applying weak convergence theory for empirical processes indexed by function classes. The derivation of lemma 1(ii) then applies a Taylor expansion of $\hat F_n$ and results of section A.2 (smoothness of $\hat F_n$ is used here again).

Applying the expansion of the process $R_n^*$ as given in lemma 1(ii), weak convergence can be shown by asymptotic equicontinuity and finite-dimensional convergence of the dominating process. To do so we apply again auxiliary results from section A.1.

A.1. Regression estimation and residual-based density estimation

This section gives some results about uniform convergence of kernel density and regression estimators based on the original sample. Recall the definitions of $\hat m$ and $\hat f_X$ in (3) and (4), respectively. Let $m_n$, $\tilde m_n$ and $f_{X,n}$ be defined as

$m_n(x) = \Bigl(\int w_n(y, x)\, m(y)\, f_X(y)\, dy\Bigr) \frac{1}{f_{X,n}(x)}$, (19)


where $t_{ni} = I\{X_i < h_n\} + I\{X_i > 1 - h_n\}$ is non-negative with moments $E[t_{ni}^l] = O(h_n)$ for all $l > 0$, and $C_{ni} = (m_n(X_i) - m(X_i))/h_n$ fulfils $\max_{i=1,\ldots,n} |C_{ni}| = O(1)$ almost surely by (22), whereas for

$z_{ni} = \hat m(X_i) - m_n(X_i) + (m_n(X_i) - m(X_i))\, I\{X_i \in [h_n, 1 - h_n]\}$

we have

$|z_{ni}| \le \sup_{x \in [0,1]} |\hat m(x) - m_n(x)| + \sup_{x \in [h_n, 1-h_n]} |m_n(x) - m(x)| = o(\delta_n) + O(h_n^2)$

almost surely for sequences $\delta_n$ that fulfil the conditions of proposition 2.

In the following, we give results about the uniform convergence of the kernel density estimator $\hat f_n$ defined in (9) and its distribution function $\hat F_n$, applied in the proof of theorem 2.

Lemma 2

Under model (1) with assumptions A.1–A.6, we have almost surely

(i) $\sup_{y \in \mathbb{R}} |\hat f_n(y) - f(y)| = o(a_n^{4/3})$,

(ii) $\sup_{y, z \in \mathbb{R}} \dfrac{|\hat f_n(y) - f(y) - \hat f_n(z) + f(z)|}{|y - z|^{\alpha/2}} = o(1)$, and

(iii) $\sup_{y \in \mathbb{R}} |\hat F_n(y) - F(y)| = o(1)$.

Here the constant $\alpha$ is defined in (5).

Proof. We have for $\hat f_n$ defined in (9), by Taylor's expansion,

$\hat f_n(y) = \tilde f_n(y) + \sum_{\lambda=1}^{L-1} U_n^{(\lambda)}(y) + V_n(y)$,

where $U_n^{(\lambda)}$ and $V_n$ will be defined in a moment and $\tilde f_n$ denotes the kernel density estimator defined as $\hat f_n$, but based on the true errors $\varepsilon_i$ instead of the residuals $\check\varepsilon_i$. The assertion $\sup_{y \in \mathbb{R}} |\tilde f_n(y) - f(y)| = o(a_n^{4/3})$ can be shown applying theorem 37 by Pollard (1984, p. 34). Further, we have

$U_n^{(\lambda)}(y) = \frac{1}{n a_n^{\lambda+1}} \sum_{i=1}^{n} k^{(\lambda)}\Bigl(\frac{y - \varepsilon_i}{a_n}\Bigr) (\check\varepsilon_i - \varepsilon_i)^\lambda \frac{(-1)^\lambda}{\lambda!}$, (25)

$V_n(y) = \frac{1}{n a_n^{L+1}} \sum_{i=1}^{n} k^{(L)}(\zeta_n(y, \varepsilon_i)) (\check\varepsilon_i - \varepsilon_i)^L \frac{(-1)^L}{L!}$, (26)

where $\zeta_n(y, \varepsilon_i)$ denotes an intermediate point of the Taylor expansion.

From the definition of $\check\varepsilon_i$ in (8) and $\hat\varepsilon_i$ in (2) it follows that

$\check\varepsilon_i - \varepsilon_i = -\frac{1}{n} \sum_{j=1}^{n} \varepsilon_j + z_{ni} - \frac{1}{n} \sum_{j=1}^{n} z_{nj} + h_n C_{ni} t_{ni} - h_n \frac{1}{n} \sum_{j=1}^{n} C_{nj} t_{nj}$, (27)

see the text below (24) for the definitions and convergence rates of $z_{ni}$, $C_{ni}$ and $t_{ni}$. Here we choose $\delta_n = c_n (\log h_n^{-1}/(n h_n))^{1/2}$ for some sequence $c_n$ that will be specified later.

Applying the Borel–Cantelli lemma it can be shown that $h_n n^{-1} \sum_{j=1}^{n} C_{nj} t_{nj} = O(h_n^2)$ almost surely. Further, note that $((\log\log n)/n)^{1/2} = o(\delta_n)$ and $h_n^2 = o(\delta_n)$. Hence, by the law of the iterated logarithm, for $\lambda = 1, \ldots, L-1$ we have

$\sup_{y \in \mathbb{R}} |U_n^{(\lambda)}(y)| \le o(\delta_n^\lambda)\,\Bigl(\frac{1}{a_n^{\lambda+1}} \sup_{y \in \mathbb{R}} |W_n^{(\lambda)}(y)| + \frac{1}{a_n^{\lambda+1}} \sup_{y \in \mathbb{R}} E\Bigl[\Bigl|k^{(\lambda)}\Bigl(\frac{y - \varepsilon_1}{a_n}\Bigr)\Bigr|\Bigr]\Bigr) + O(h_n^\lambda)\,\Bigl(\frac{1}{a_n^{\lambda+1}} \sup_{y \in \mathbb{R}} |\tilde W_n^{(\lambda)}(y)| + \frac{1}{a_n^{\lambda+1}} \sup_{y \in \mathbb{R}} E\Bigl[t_{n1} \Bigl|k^{(\lambda)}\Bigl(\frac{y - \varepsilon_1}{a_n}\Bigr)\Bigr|\Bigr]\Bigr)$,


where

$W_n^{(\lambda)}(y) = \frac{1}{n} \sum_{i=1}^{n} \Bigl(k^{(\lambda)}\Bigl(\frac{y - \varepsilon_i}{a_n}\Bigr) - E\Bigl[k^{(\lambda)}\Bigl(\frac{y - \varepsilon_i}{a_n}\Bigr)\Bigr]\Bigr)$,

$\tilde W_n^{(\lambda)}(y) = \frac{1}{n} \sum_{i=1}^{n} \Bigl(t_{ni}\, k^{(\lambda)}\Bigl(\frac{y - \varepsilon_i}{a_n}\Bigr) - E\Bigl[t_{n1}\, k^{(\lambda)}\Bigl(\frac{y - \varepsilon_1}{a_n}\Bigr)\Bigr]\Bigr)$.

Further, we have

$\sup_{y \in \mathbb{R}} E\bigl[\bigl|k^{(\lambda)}((y - \varepsilon_1)/a_n)\bigr|\bigr] = O(a_n)$, $\quad E[t_{n1}] = O(h_n)$,

and $a_n^{-1} E[k^{(\lambda)}((y - \varepsilon_i)/a_n)]$ converges to $f^{(\lambda)}(y)$ and is uniformly of order $O(1)$. One can apply theorem 37 by Pollard (1984, p. 34) to obtain rates for the almost sure convergence of $\sup_{y \in \mathbb{R}} |W_n^{(\lambda)}(y)|$ and $\sup_{y \in \mathbb{R}} |\tilde W_n^{(\lambda)}(y)|$ to zero, such that from (25) and the results before it follows that $\sup_{y \in \mathbb{R}} |U_n^{(\lambda)}(y)| = o(a_n^{4/3})$ ($\lambda = 1, \ldots, L-1$) almost surely by the bandwidth conditions A.6.

Estimation of the remainder term $V_n(y)$ defined in (26), uniformly in $y \in \mathbb{R}$, is straightforward using (27) to obtain the bound $(o(\delta_n^L) + O(h_n^{L+1}))/a_n^{L+1} = o(a_n^{4/3})$ almost surely for $c_n = (n h_n a_n^{2 + 2/L + 8/(3L)}/\log h_n^{-1})^{1/2}$ by our bandwidth conditions. Hence, assertion (i) follows.

The proof of (ii) follows from (i), by a straightforward estimation in the case $|y - z| > a_n^{8/3}$ and by a first-order Taylor expansion of $k$ (in the definition of $\hat f_n$) in the case $|y - z| \le a_n^{8/3}$.

Assertion (iii) follows from (i) with the help of the dominated convergence theorem. Details are omitted for the sake of brevity.

Remark 2. Note that from the proof it follows that the last two bandwidth conditions in A.6 are redundant when $k^{(L)} \equiv 0$ and can be replaced by the simpler condition $h_n^{2\lambda+2} = O(a_n^{2\lambda+1})$ for $\lambda = 1, \ldots, L-1$.

A.2. Bootstrap-based regression estimation

We give results about convergence in probability of the bootstrap version of the regression estimator $\hat m^*$ and its derivative. These findings will be used in the proof of theorem 2.

Lemma 3

Under model (1) with assumptions A.1–A.7 it holds that

(i) $\sup_{x \in [0,1]} |\hat m^*(x) - \tilde m_n(x)| = o_p(1)$,

(ii) $\sup_{x \in [0,1]} |\hat m^{*\prime}(x) - \tilde m_n'(x)| = o_p(1)$, and

(iii) $\sup_{x, t \in [0,1]} \dfrac{|\hat m^{*\prime}(x) - \tilde m_n'(x) - \hat m^{*\prime}(t) + \tilde m_n'(t)|}{|x - t|^{\alpha/2}} = o_p(1)$.

Here $\alpha$ denotes the constant from (5) and $\tilde m_n$ is defined in (20).

Proof. To prove (i) we use the definition (7) of the bootstrap residuals. The bootstrap observations (10) have the representation $Y_i^* = \hat m(X_i) + \tilde\varepsilon_i + a_n Z_i$ and therefore

$\hat m^*(x) - \tilde m_n(x) = A_n(x) + B_n(x) + C_n(x)$, (28)


where

$A_n(x) = \frac{1}{\hat f_X(x)} \frac{1}{n} \sum_{i=1}^{n} w_n(X_i, x)\,(\hat m(X_i) - \tilde m_n(X_i))$, (29)

$B_n(x) = \frac{1}{\hat f_X(x)} \frac{1}{n} \sum_{i=1}^{n} w_n(X_i, x)\,(\tilde m_n(X_i) + a_n Z_i) - \tilde m_n(x)$, (30)

$C_n(x) = \frac{1}{\hat f_X(x)} \frac{1}{n} \sum_{i=1}^{n} w_n(X_i, x)\,\tilde\varepsilon_i$. (31)

The first term can be bounded uniformly in $x \in [0,1]$ by $\sup_{x \in [0,1]} |\hat m(x) - \tilde m_n(x)| = o(h_n^{1+\alpha}) = o(1)$ almost surely by proposition 2. The second term, $B_n(x)$, is the difference between the Nadaraya–Watson estimator from a regression model with regression function $\tilde m_n$ and (centred) errors $a_n Z_i$, and $\tilde m_n$ itself, and therefore also converges to zero uniformly almost surely, analogous to proposition 2. The last term, $C_n(x)$ defined in (31), is more difficult to handle. Let $\pi(1), \ldots, \pi(n)$ be independent random variables with uniform distribution on $\{1, \ldots, n\}$, corresponding to the drawing of $\tilde\varepsilon_1, \ldots, \tilde\varepsilon_n$ with replacement from $\check\varepsilon_1, \ldots, \check\varepsilon_n$, such that

$\tilde\varepsilon_i = \sum_{l=1}^{n} I\{\pi(i) = l\}\,\check\varepsilon_l = \sum_{l=1}^{n} I\{\pi(i) = l\}\,\hat\varepsilon_l - \frac{1}{n} \sum_{j=1}^{n} \hat\varepsilon_j$.

This yields

$C_n(x) = \frac{1}{\hat f_X(x)} \frac{1}{n h_n} \sum_{i=1}^{n} w_n(X_i, x) \Bigl(\sum_{l=1}^{n} I\{\pi(i) = l\}\,\hat\varepsilon_l - \frac{1}{n} \sum_{j=1}^{n} \hat\varepsilon_j\Bigr)$, (32)

and we have

$\sup_{x \in [0,1]} |C_n(x)| \le 2 \sup_{x \in [0,1]} |\hat m(x) - m(x)|$ (33)

$\quad\quad + \sup_{x \in [0,1]} \Bigl|\frac{1}{\hat f_X(x)} \frac{1}{n h_n} \sum_{i=1}^{n} w_n(X_i, x) \Bigl(\sum_{l=1}^{n} I\{\pi(i) = l\}\,\varepsilon_l - \frac{1}{n} \sum_{j=1}^{n} \varepsilon_j\Bigr)\Bigr|$, (34)

where the term on the right-hand side of (33) is $o(1)$ a.s. and $\sum_{l=1}^{n} I\{\pi(i) = l\} = 1$ for all $i = 1, \ldots, n$. For (34) we only consider the numerator, because the denominator converges uniformly almost surely to the design density, which is by assumption bounded away from zero. With the notations

$\eta_{ni} = \sum_{l=1}^{n} I\{\pi(i) = l\}\,\varepsilon_l$ and $\bar\varepsilon_n = \frac{1}{n} \sum_{i=1}^{n} \varepsilon_i$, (35)

we have to prove that, for all $\delta > 0$,

$P\Bigl(\sup_{x \in [0,1]} \Bigl|\frac{1}{n h_n} \sum_{i=1}^{n} K\Bigl(\frac{x - X_i}{h_n}\Bigr)(\eta_{ni} - \bar\varepsilon_n)\Bigr| > \delta\Bigr) = o(1)$,

where, conditional on the sample $\mathcal{Y}_n$, the $\eta_{ni}$ are randomly drawn with replacement from $\{\varepsilon_1, \ldots, \varepsilon_n\}$. Note that $\max_{i=1,\ldots,n} |\varepsilon_i| = o_p(n^{1/4})$ and by the strong law of large numbers there exists a constant $A$ such that $P(A_n) = o(1)$, where $A_n$ denotes the event that $\max_{i=1,\ldots,n} |\varepsilon_i| > A n^{1/4}$ or $n^{-1} \sum_{i=1}^{n} \varepsilon_i^2 > A$. Furthermore, we assume without restriction that the constant $A$ is a bound for the kernel $K$ and its derivative $K'$ and fulfils $\int K^2((x - y)/h_n)\,f_X(y)\,dy/h_n \le A$ for all $x \in [0,1]$. We assume in the following that $A_n^c$ occurs. For a fixed $x \in [0,1]$ define


$\xi_i = K\Bigl(\frac{x - X_i}{h_n}\Bigr)\frac{\eta_{ni} - \bar\varepsilon_n}{h_n}$, $\quad V_n = n A^2/h_n$ and $M_n = 2 A^2 n^{1/4}/h_n$. Given $\varepsilon_1, \ldots, \varepsilon_n$, the univariate random variables $\xi_1, \ldots, \xi_n$ are independent such that

$E[\xi_i \mid \varepsilon_1, \ldots, \varepsilon_n] = 0$, $\quad |\xi_i| \le M_n$, $\quad \sum_{i=1}^{n} \mathrm{var}(\xi_i \mid \varepsilon_1, \ldots, \varepsilon_n) \le V_n$.

Applying Bernstein's inequality we obtain

$P\Bigl(\Bigl|\frac{1}{n h_n} \sum_{i=1}^{n} K\Bigl(\frac{x - X_i}{h_n}\Bigr)(\eta_{ni} - \bar\varepsilon_n)\Bigr| > \delta \Bigm| \varepsilon_1, \ldots, \varepsilon_n\Bigr) \le 2 \exp(-(\log n)/\gamma_n)$,

where $\gamma_n = 2 (V_n + M_n n\delta/3) \log n/(n\delta)^2 = o(1)$ for all $\delta > 0$ by the bandwidth assumption A.4. The sequence $\gamma_n$ is independent of $x$ and not random.
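For reference, the step instantiates Bernstein's inequality in its standard form for independent, centred summands with $|\xi_i| \le M_n$ and $\sum_i \mathrm{var}(\xi_i) \le V_n$:

$P\Bigl(\Bigl|\sum_{i=1}^{n} \xi_i\Bigr| > t\Bigr) \le 2\exp\Bigl(-\frac{t^2}{2(V_n + M_n t/3)}\Bigr)$;

setting $t = n\delta$ and rewriting the exponent as $-(\log n)/\gamma_n$ yields the bound displayed above.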

Next we cover $[0,1]$ by $n$ intervals of length $1/n$ with centres $x_k$. We have

$\sup_{x \in [0,1]} \Bigl|\frac{1}{n h_n} \sum_{i=1}^{n} K\Bigl(\frac{x - X_i}{h_n}\Bigr)(\eta_{ni} - \bar\varepsilon_n)\Bigr| \le \max_{k=1,\ldots,n} \Bigl|\frac{1}{n h_n} \sum_{i=1}^{n} K\Bigl(\frac{x_k - X_i}{h_n}\Bigr)(\eta_{ni} - \bar\varepsilon_n)\Bigr| + \sup_{x, t \in [0,1],\, |x-t| \le n^{-1}} \Bigl|\frac{1}{n h_n} \sum_{i=1}^{n} \Bigl(K\Bigl(\frac{x - X_i}{h_n}\Bigr) - K\Bigl(\frac{t - X_i}{h_n}\Bigr)\Bigr)(\eta_{ni} - \bar\varepsilon_n)\Bigr|$

and from the above considerations,

$P\Bigl(\max_{k=1,\ldots,n} \Bigl|\frac{1}{n h_n} \sum_{i=1}^{n} K\Bigl(\frac{x_k - X_i}{h_n}\Bigr)(\eta_{ni} - \bar\varepsilon_n)\Bigr| > \delta\Bigr) \le 2 n \exp(-(\log n)/\gamma_n) = o(1)$

for all $\delta > 0$. Now we estimate, using the mean value theorem (still assuming that $A_n^c$ occurs),

$\sup_{x, t \in [0,1],\, |x-t| \le n^{-1}} \Bigl|\frac{1}{n h_n} \sum_{i=1}^{n} \Bigl(K\Bigl(\frac{x - X_i}{h_n}\Bigr) - K\Bigl(\frac{t - X_i}{h_n}\Bigr)\Bigr)(\eta_{ni} - \bar\varepsilon_n)\Bigr| \le 2 A^2 n^{1/4} \frac{1}{h_n^2} \sup_{x, t \in [0,1],\, |x-t| \le n^{-1}} |x - t| = O\Bigl(\frac{n^{1/4}}{h_n^2\, n}\Bigr) = o(1)$

by assumptions A.3 and A.4. Altogether we can deduce that (34) converges to zero in probability and (i) follows.

The proofs of (ii) and (iii) follow similar argumentations and applications of proposition 1.

A.3. More auxiliary results

For constants $L > 0$ and $\gamma \in (0,1]$ let $C_L^{1+\gamma}[0,1]$ be defined as the set of differentiable functions $g : [0,1] \to \mathbb{R}$ with derivatives $g'$ such that

$\max\Bigl\{\sup_{x \in [0,1]} |g(x)|,\; \sup_{x \in [0,1]} |g'(x)|\Bigr\} + \sup_{x, t \in [0,1]} \frac{|g'(x) - g'(t)|}{|x - t|^\gamma} \le L$,

see van der Vaart & Wellner (1996, p. 154). See this reference also for the definitions of covering and bracketing numbers.

Lemma 4

Let $\mathcal{H}$ denote the class of continuously differentiable, increasing functions $H : \mathbb{R}_0^+ \to [0,1]$ with uniformly bounded derivative $h$, such that

$\sup_z |h(z)| + \sup_{z, z'} \frac{|h(z) - h(z')|}{|z - z'|^\gamma} \le L$, (36)

where the suprema are taken over $\mathbb{R}^+$, and $|1 - H(x)| \le d/x^\xi$ for all $x \in \mathbb{R}^+$, where $\xi > 1 + 1/\gamma$ and the constant $d$ is independent of $H \in \mathcal{H}$. Then, for the covering number with respect to the supremum norm, $\log N(\epsilon, \mathcal{H}, \|\cdot\|_\infty) = O(1/\epsilon^{1/(1+a)})$ is valid, where $a = ((\xi - 1)\gamma - 1)/(1 + \gamma + \xi) > 0$.

    The proof of lemma 4 is similar to the proof of corollary 2.7.4 by van der Vaart & Wellner

    (1996), and is omitted for the sake of brevity.

We define a sequence of function classes by

$\mathcal{G}_n = \bigl\{[0,1]^2 \to \mathbb{R},\; (x, u) \mapsto I\{u \le H(y + g(x) + \delta_n(x))\} \bigm| y \in \mathbb{R},\; g \in C_1^{1+\gamma}[0,1],\; H \in \mathcal{H}\bigr\}$. (37)

For some constant $L$, $\mathcal{H}$ now denotes the class of continuously differentiable distribution functions $H : \mathbb{R} \to [0,1]$ with uniformly bounded derivative $h$, such that (36) is fulfilled with the suprema taken over $z \in \mathbb{R}$, and the tail condition

$|1 - H(x)| \le d/x^\xi$ for all $x \in \mathbb{R}^+$, $\quad |H(x)| \le d/|x|^\xi$ for all $x \in \mathbb{R}^-$, (38)

is satisfied for some $\xi > 1 + 1/\gamma$, where the constant $d$ is independent of $H \in \mathcal{H}$. The function $\delta_n$ is deterministic and converges to zero uniformly. Later we will set $\gamma = \alpha/2$ with $\alpha$ from (5) and

$L = 2 \max\Bigl\{1,\; \sup_{y \in \mathbb{R}} |f(y)| + \sup_{y \in \mathbb{R}} |f'(y)|\Bigr\}$. (39)

Proposition 3

The function classes $\mathcal{G}_n$ defined in (37) fulfil the condition

$\int_0^{\tau_n} \sqrt{\log N_{[\,]}(\epsilon, \mathcal{G}_n, L_2(P))}\; d\epsilon \to 0$ (40)

for every real sequence $\tau_n \downarrow 0$. Here $P$ denotes the probability distribution of $(X_1, U_1)$, where $X_1$ has distribution $F_X$, $U_1$ is uniformly distributed in $[0,1]$ and they are independent. Further,

$\sup E\bigl[\bigl(\omega_{n,H,y,g}(U_1, X_1) - \omega_{n,H',y',g'}(U_1, X_1)\bigr)^2\bigr] \to 0$ (41)

for every real sequence $\tau_n \downarrow 0$, where $\omega_{n,H,y,g}$ denotes the element of $\mathcal{G}_n$ determined by $H$, $y$, $g$, the supremum is taken over all pairs $((H, y, g), (H', y', g'))$ with $\rho((H, y, g), (H', y', g')) \le \tau_n$, and the semimetric $\rho$ is defined below [here we need the assumption that $\xi > 1 + 1/\gamma$ in the tail condition (38)].

Proof. We first choose a minimal $\epsilon^2$-covering of $\mathcal{H}$ with respect to the supremum norm (compare lemma 4). Denote the centres by $H_j$, $j = 1, \ldots, c$. Brackets for $\mathcal{H}$ of length $2\epsilon^2$ are then given by


    HLj , H

    Uj

    =

    Hj

    2

    2, Hj +

    2

    2

    , j= 1, . . ., c.

    We further have brackets [gL1k, gU1k] for the function class C

    1 +1 [0, 1], where k= 1, . . ., K=

    O(exp(2/(1 +)

    )) [see corollary 2.7.2, van der Vaart & Wellner (1996, p. 157)]. All bracketsintroduced so far are with respect to the supremum norm. Let us consider yR+ for thesake of simplicity and assume n0. Because the functions HH are increasing we assumethe same for HLj , H

    Uj and have for H

    Lj HHUj , gL1kg1 gU1k,

    I

    uHLj (y +gL1k(x) +n(x)) I{uH(y +g1(x) +n(x))} IuHUj (y +gU1k(x) +n(x))

    for all u, x [0,1], yR+. For each pair (j, k) we partition R+ in segments [yjk, yjk, + 1] =[yLjk, y

    Ujk], = 1, . . ., R = O(

    2) with probability

    H

    L

    j (yU

    jk +gL

    1k(x) +n(x))HL

    j (yL

    jk +gL

    1k(x) +n(x))

    fX(x) dx 2

    .

    Note that these segments may depend on n, but the number of segments does not. We have

at most $cKR = O\big(\varepsilon^{-2} \exp(\varepsilon^{-2/(1+\beta)}) \exp(\varepsilon^{-2/(1+a)})\big)$ brackets of the form
$$\Big[\, I\big\{u \le H_j^L(y_{jk\ell}^L + g_k^L(x) + \Delta_n(x))\big\},\;\; I\big\{u \le H_j^U(y_{jk\ell}^U + g_k^U(x) + \Delta_n(x))\big\} \,\Big]$$
that cover $\mathcal{G}_n$ and are of $L_2(P)$-length less than or equal to $(L+3)^{1/2}\varepsilon$ (the latter is shown by several applications of the triangle inequality). We have shown that $\log N_{[\,]}(\varepsilon, \mathcal{G}_n, L_2(P)) = O\big(\log(1/\varepsilon) + \varepsilon^{-2/(1+\beta)} + \varepsilon^{-2/(1+a)}\big)$, where this bound does not depend on $n$, and hence the bracketing condition (40) is fulfilled.

To show (41), define
$$T = \big\{(H, y, g) \;\big|\; H \in \mathcal{H},\; y \in \mathbb{R},\; g \in C_1^{1+\beta}[0,1]\big\}$$
and the semimetric
$$\rho\big((H,y,g), (H',y',g')\big) = \int \big|H(y + g(x)) - H'(y' + g'(x))\big|\, f_X(x)\, dx.$$
Then the bracketing number $N_{[\,]}(\varepsilon, T, \rho)$ is finite for all $\varepsilon > 0$, which can be shown very similarly to the proof of (40). Hence the semimetric space $(T, \rho)$ is totally bounded; see van der Vaart & Wellner (1996, p. 84). Now the supremum in (41) can be bounded by
$$2 \sup \int \big| H(y + g(x) + \Delta_n(x)) - H'(y' + g'(x) + \Delta_n(x)) \big|\, f_X(x)\, dx,$$
with the supremum taken as in (41). By the mean value theorem applied to $H$ and $H'$, we obtain the bound
$$2 \Big( \sup \rho\big((H,y,g), (H',y',g')\big) + 2 \sup_{H \in \mathcal{H}} \sup_{z \in \mathbb{R}} |h(z)| \int |\Delta_n(x)|\, f_X(x)\, dx \Big),$$
which converges to zero.

Proposition 4
Under model (1) with assumptions A.1–A.7, the distribution function $\hat F_n$ of the residual-based density estimator $\hat f_n$ satisfies the tail condition: there exists a constant $d$ such that $|1 - \hat F_n(x)| \le d/x^{\gamma}$ for all $x \in \mathbb{R}^+$ and $|\hat F_n(x)| \le d/|x|^{\gamma}$ for all $x \in \mathbb{R}^-$, with probability converging to 1.

Proof. For $x > 0$ we have (with analogous considerations for $x < 0$)
$$0 \le x^{\gamma}\big(1 - \hat F_n(x)\big) \le \int_x^{\infty} t^{\gamma} \hat f_n(t)\, dt \le \frac{1}{n} \sum_{j=1}^n \int k(u)\, |a_n u + \hat\varepsilon_j|^{\gamma}\, du,$$
and the assertion follows from (27) and assumption A.8.
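As a numerical illustration of proposition 4 (not from the paper; the $t(6)$ residuals, Epanechnikov kernel and fixed bandwidth below are stand-in assumptions), the following sketch verifies that $x^{\gamma}(1 - \hat F_n(x))$ stays bounded for a kernel-smoothed residual distribution with $\gamma = 3$:

```python
# Illustration only: tail behaviour of the smoothed residual df \hat F_n.
# Residuals are stand-ins with a finite third moment; k is the Epanechnikov
# kernel, a_n a fixed bandwidth, gamma plays the role of the tail exponent.
import numpy as np

rng = np.random.default_rng(0)
n, a_n, gamma = 500, 0.2, 3.0
eps_hat = rng.standard_t(df=6, size=n)        # stand-in for residuals

def F_hat(x):
    """\\hat F_n(x) = (1/n) sum_j k_int((x - eps_j)/a_n), k_int the integrated kernel."""
    t = np.clip((x - eps_hat) / a_n, -1.0, 1.0)
    # integrated Epanechnikov kernel: int_{-1}^{t} 0.75 (1 - u^2) du
    return np.mean(0.5 + 0.75 * t - 0.25 * t**3)

# x^gamma (1 - \hat F_n(x)) should stay bounded, mirroring the proof's bound
# (1/n) sum_j int k(u) |a_n u + eps_j|^gamma du
for x in [1.0, 2.0, 4.0, 8.0]:
    print(f"x = {x:4.1f}   x^gamma (1 - F_hat(x)) = {x**gamma * (1 - F_hat(x)):8.4f}")
```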


    Appendix B Proofs of the main results

    Proof of lemma 1(i)

From the representation (18) one obtains
$$V_n^*(y) = \frac{1}{\sqrt n} \sum_{i=1}^n \Big( I\big\{U_i \le \hat F_n(y + \hat m^*(X_i) - \hat m(X_i))\big\} - I\big\{U_i \le \hat F_n(y)\big\} - \int \hat F_n\big(y + \hat m^*(x) - \hat m(x)\big) f_X(x)\, dx + \hat F_n(y) \Big).$$

We further introduce the notation $\hat m^* - \hat m = \hat g_n + \Delta_n$, where $\hat g_n = \hat m^* - \tilde m_n - (\hat m - \hat m_n)$ and $\Delta_n = \tilde m_n - \hat m_n$, with $\tilde m_n$ and $\hat m_n$ defined in (19) and (20), respectively. Note that $\Delta_n$ is a deterministic function converging to zero uniformly by (22). Let $\delta > 0$ denote the constant from bandwidth condition (5). Let $\mathcal{H}$ with $\beta = \delta/2$ and the Donsker class of smooth functions $C_1^{1+\delta/2}[0,1]$ be defined as before (cf. lemma 4). The probability that $\hat F_n$ belongs to $\mathcal{H}$ and $\hat g_n$ belongs to $C_1^{1+\delta/2}[0,1]$ converges to one according to propositions 4 and 2 and lemmata 2 and 3; compare the definition of $\mathcal{H}$ and (39). Therefore, in what follows we assume that $\hat F_n \in \mathcal{H}$ and $\hat g_n \in C_1^{1+\delta/2}[0,1]$. We consider the following empirical process,
$$V_n(y, H, g) = \frac{1}{\sqrt n} \sum_{i=1}^n \Big( I\big\{U_i \le H(y + g(X_i) + \Delta_n(X_i))\big\} - I\big\{U_i \le H(y)\big\} - \int H\big(y + g(x) + \Delta_n(x)\big) f_X(x)\, dx + H(y) \Big),$$
indexed by $y \in \mathbb{R}$, $g \in C_1^{1+\delta/2}[0,1]$ and $H \in \mathcal{H}$. The random variables $(X_1, U_1), (X_2, U_2), \ldots$ are independent and identically distributed, where $X_i$ has distribution function $F_X$ and $U_i$ is uniformly distributed in $[0,1]$ (independent of $X_i$). Note that the process is centred and, according to proposition 3, $V_n(y, H, g)$ converges weakly to a Gaussian process.

    according to proposition 3, Vn(y, H, g) converges weakly to a Gaussian process. Applying

    a uniform version of lemma 19.24 by van der Vaart (1998, p. 280) [compare the proof of

    Th. 19.26 in that reference], we insert the random functions g = gn and H= Fn. To this end,

    we calculate (remember that gn +n = m m)

$$\sup_{y \in \mathbb{R}} \int_0^1 \int_0^1 \Big( I\big\{u \le \hat F_n(y + \hat m^*(z) - \hat m(z))\big\} - I\big\{u \le \hat F_n(y)\big\} - \int_0^1 \hat F_n\big(y + \hat m^*(x) - \hat m(x)\big) f_X(x)\, dx + \hat F_n(y) \Big)^2 f_X(z)\, dz\, du \;\le\; \sup_{y \in \mathbb{R}} \int_0^1 \big| \hat F_n(y + \hat m^*(z) - \hat m(z)) - \hat F_n(y) \big|\, f_X(z)\, dz.$$

A Taylor expansion of order one shows that the last term converges to zero in probability by lemmata 2 and 3 and (22). Applying the aforementioned results of van der Vaart (1998), it now follows that $V_n^*(y) = V_n(y, \hat F_n, \hat g_n)$ converges to zero in probability, uniformly with respect to $y \in \mathbb{R}$.
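The substitution step above can be illustrated numerically. The following sketch (illustration only; it fixes $H$ to the standard normal distribution function, replaces $\hat g_n + \Delta_n$ by a deterministic perturbation $g_n(x) = n^{-1/3}\cos(2\pi x)$ that vanishes uniformly, and assumes scipy is available) shows that the centred process $V_n(y, H, g_n)$ degenerates as $n$ grows, which is the content of the lemma 19.24 argument:

```python
# Illustration only of the substitution step in the proof of lemma 1(i):
# with a uniformly vanishing index perturbation, the variance bound above
# forces the process V_n(y, H, g_n) to degenerate.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def V_n(n, y):
    X = rng.uniform(size=n)                      # f_X uniform on [0,1]
    U = rng.uniform(size=n)
    g = lambda x: n ** (-1 / 3) * np.cos(2 * np.pi * x)
    terms = (U <= norm.cdf(y + g(X))).astype(float) - (U <= norm.cdf(y))
    xs = np.linspace(0.0, 1.0, 2001)             # centring term of the process
    centre = np.trapz(norm.cdf(y + g(xs)) - norm.cdf(y), xs)
    return np.sum(terms - centre) / np.sqrt(n)

for n in [200, 2000, 20000]:
    sd = np.std([V_n(n, 0.5) for _ in range(200)])
    print(f"n = {n:6d}   sd of V_n(0.5) = {sd:.4f}")
```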

    Proof of lemma 1(ii)

By a Taylor expansion of $\hat F_n$ we obtain
$$\sqrt n \int \Big( \hat F_n\big(y + \hat m^*(x) - \hat m(x)\big) - \hat F_n(y) \Big) f_X(x)\, dx = \hat f_n(y)\, \sqrt n \int \big( \hat m^*(x) - \hat m(x) \big) f_X(x)\, dx + o_p(1) \qquad (42)$$


uniformly in $y \in \mathbb{R}$, where the proof of the negligibility of the remainder involves a decomposition of $\hat m^*(x) - \hat m(x)$ using $A_n$, $B_n$, $C_n$ from (29)–(31), $\tilde m_n$ from (19) and $\hat m_n$ from (20), and applications of proposition 2 and lemma 2. In particular, to obtain
$$\sqrt n \int C_n^2(x)\, f_X(x)\, dx = o_p(1)$$
we apply (32), (35) and (23). Details are omitted. Then the assertion follows from

$$\sqrt n \int \big( \hat m^*(x) - m(x) \big) f_X(x)\, dx = \frac{1}{\sqrt n} \sum_{i=1}^n \varepsilon_i^* \int \frac{1}{h_n} K\Big(\frac{X_i - x}{h_n}\Big)\, dx + \frac{1}{\sqrt n} \sum_{j=1}^n \varepsilon_j\, \frac{1}{n h_n} \sum_{i=1}^n \int \frac{1}{h_n} K\Big(\frac{X_i - x}{h_n}\Big)\, dx\, \frac{K\big(\frac{X_i - X_j}{h_n}\big)}{\hat f_X(X_i)} + o_p(1) = \frac{1}{\sqrt n} \sum_{i=1}^n \varepsilon_i^* + \frac{1}{\sqrt n} \sum_{j=1}^n \varepsilon_j + o_p(1)$$

and [compare the definition of $\hat m$ in (3)]
$$\sqrt n \int \big( \hat m(x) - m(x) \big) f_X(x)\, dx = \frac{1}{\sqrt n} \sum_{j=1}^n \varepsilon_j + o_p(1).$$

    Proof of theorem 2

From lemma 1 we have the following expansion of the bootstrap process defined in (11): $R_n^*(y) = \bar R_n(y) + o_p(1)$ uniformly with respect to $y \in \mathbb{R}$, where
$$\bar R_n(y) = \frac{1}{\sqrt n} \sum_{i=1}^n \Big( I\{\varepsilon_i^* \le y\} - \hat F_n(y) + \hat f_n(y)\, \varepsilon_i^* \Big). \qquad (43)$$

Note that the process is centred with respect to the conditional expectation given the sample $\mathcal{Y}_n$. To prove conditional weak convergence of the process $R_n^*$ in probability, it is sufficient to consider $\bar R_n$. We now show the stronger result that for almost all sequences $\mathcal{Y} = \{(X_1, Y_1), (X_2, Y_2), \ldots\}$ we have conditional weak convergence. This follows from stochastic asymptotic equicontinuity and convergence of the finite-dimensional distributions.
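For orientation, the following sketch generates one realisation of $\bar R_n$ from (43). It assumes one standard construction of the smooth residual bootstrap with a Gaussian smoothing kernel, namely $\varepsilon_i^* = \hat\varepsilon_{J_i} + a_n Z_i$ with $J_i$ uniform on the centred residuals and $Z_i$ standard normal, so that, given the sample, $\varepsilon_i^*$ has exactly the distribution function $\hat F_n$ and density $\hat f_n$; all concrete numbers are ad hoc.

```python
# Sketch of the linearised bootstrap process (43) under the smooth residual
# bootstrap: eps*_i = e_{J_i} + a_n Z_i, so eps*_i | sample has density f_hat_n.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

def R_bar(eps_hat, a_n, ys):
    """One bootstrap realisation of \\bar R_n(y) from (43) on a grid of y values."""
    n = len(eps_hat)
    e = eps_hat - eps_hat.mean()                        # centred residuals
    eps_star = e[rng.integers(0, n, size=n)] + a_n * rng.normal(size=n)
    out = []
    for y in ys:
        F_hat = np.mean(norm.cdf((y - e) / a_n))        # \hat F_n(y)
        f_hat = np.mean(norm.pdf((y - e) / a_n)) / a_n  # \hat f_n(y)
        out.append(np.sum((eps_star <= y) - F_hat + f_hat * eps_star)
                   / np.sqrt(n))
    return np.array(out)

eps_hat = rng.standard_normal(300)                      # stand-in for residuals
print(np.round(R_bar(eps_hat, 0.3, np.linspace(-2, 2, 9)), 3))
```

By construction each summand has conditional mean zero, which is the centring property used next.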

Note that
$$P\Big( \sup_{y,z \in \mathbb{R},\, |y-z| \le \delta} \big| \bar R_n(y) - \bar R_n(z) \big| > \varepsilon \;\Big|\; \mathcal{Y} \Big) \le P\Big( \sup_{y,z \in \mathbb{R},\, 0 \le y-z \le \delta} \Big| \frac{1}{\sqrt n} \sum_{i=1}^n \big( I\{\varepsilon_i^* \le y\} - \hat F_n(y) - I\{\varepsilon_i^* \le z\} + \hat F_n(z) \big) \Big| > \frac{\varepsilon}{2} \;\Big|\; \mathcal{Y} \Big) \qquad (44)$$
$$+\; P\Big( \sup_{y,z \in \mathbb{R},\, |y-z| \le \delta} \big| \hat f_n(y) - \hat f_n(z) \big|\, \Big| \frac{1}{\sqrt n} \sum_{i=1}^n \varepsilon_i^* \Big| > \frac{\varepsilon}{2} \;\Big|\; \mathcal{Y} \Big) \qquad (45)$$

and we have to show that the limit $\lim_{\delta \to 0} \limsup_{n \to \infty}$ of these probabilities is zero for almost all random sequences $\mathcal{Y}$. To this end, let $C$ denote a constant such that $\sup_{y \in \mathbb{R}} f(y) \le C/2$. The probability (44) can be bounded by

$$I\Big\{ \sup_{y \in \mathbb{R}} |f(y) - \hat f_n(y)| > C/2 \Big\} + I\Big\{ \sup_{y \in \mathbb{R}} |\hat f_n(y)| \le C \Big\}\, P\Big( \sup_{y,z \in \mathbb{R},\, 0 \le y-z \le \delta} \Big| \frac{1}{\sqrt n} \sum_{i=1}^n \big( I\{\varepsilon_i^* \le y\} - \hat F_n(y) - I\{\varepsilon_i^* \le z\} + \hat F_n(z) \big) \Big| > \frac{\varepsilon}{2} \;\Big|\; \mathcal{Y} \Big).$$

The first indicator function converges to zero almost surely due to lemma 2(i). We therefore assume that $\sup_{y \in \mathbb{R}} |\hat f_n(y)| \le C$. From the mean value theorem we have that


$|\hat F_n(y) - \hat F_n(z)| \le C|y - z|$, and obtain for (44) the bound

$$P\Big( \sup_{s,t \in [0,1],\, 0 \le t-s < C\delta} \Big| \frac{1}{\sqrt n} \sum_{i=1}^n \big( I\{s \le U_i \le t\} - t + s \big) \Big| > \frac{\varepsilon}{2} \Big).$$

Because the $U_1, \ldots, U_n$ are independent and uniformly distributed in $[0,1]$ and the uniform empirical process is asymptotically equicontinuous, the limit $\lim_{\delta \to 0} \limsup_{n \to \infty}$ of this probability is zero. Applying the triangle inequality as well as Markov's inequality, the probability (45) can be bounded by $4 r(\delta, n)^2\, \mathrm{var}(\varepsilon_i^* \mid \mathcal{Y}_n)/\varepsilon^2$, where
$$r(\delta, n) = \sup_{y,z \in \mathbb{R},\, |y-z| \le \delta} |\hat f_n(y) - \hat f_n(z)|.$$
For the convergence of the finite-dimensional distributions note that, given $\mathcal{Y}_n$, the summands of $\bar R_n$ are bounded in absolute value by $A + B|\varepsilon_i^*|$ for suitable constants $A$ and $B$, so that the conditional Lindeberg condition is implied by
$$\int (A^2 + B^2 z^2)\, I\big\{|z| > (\varepsilon n^{1/2} - A)/B\big\}\, \hat f_n(z)\, dz \longrightarrow 0.$$
This integral converges almost surely to zero applying lemma 3.2 of Koul & Lahiri (1994), our lemma 2(iii), and $\int z^2 \hat f_n(z)\, dz = \mathrm{var}(\varepsilon_i^* \mid \mathcal{Y}_n)$ from (46).

We conclude the proof by calculating the conditional covariances,
$$E\big[\bar R_n(y) \bar R_n(z) \,\big|\, \mathcal{Y}_n\big] = \frac{1}{n} \sum_{i=1}^n E\Big[ \big( I\{\varepsilon_i^* \le y\} - \hat F_n(y) + \hat f_n(y) \varepsilon_i^* \big) \big( I\{\varepsilon_i^* \le z\} - \hat F_n(z) + \hat f_n(z) \varepsilon_i^* \big) \,\Big|\, \mathcal{Y}_n \Big]$$
$$= \hat F_n(y \wedge z) - \hat F_n(y) \hat F_n(z) + \hat f_n(y) \hat f_n(z)\, \mathrm{var}(\varepsilon_i^* \mid \mathcal{Y}_n) + \hat f_n(y)\, E\big[\varepsilon_i^* I\{\varepsilon_i^* \le z\} \,\big|\, \mathcal{Y}_n\big] + \hat f_n(z)\, E\big[\varepsilon_i^* I\{\varepsilon_i^* \le y\} \,\big|\, \mathcal{Y}_n\big]. \qquad (47)$$

With some technical effort one shows, using the definition of $\hat f_n$, uniform convergence of $\hat m$ and the strong law of large numbers, that $E[\varepsilon_i^* I\{\varepsilon_i^* \le z\} \mid \mathcal{Y}_n] = E[\varepsilon_1 I\{\varepsilon_1 \le z\}] + o(1)$ almost surely. From this, (46) and lemma 2 it follows that the conditional covariances (47) converge almost surely to the asserted covariance from theorem 2.
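Formula (47) can also be validated by simulation. The sketch below (illustration only; it fixes centred residuals and a Gaussian smoothing kernel, so that $\mathrm{var}(\varepsilon_i^* \mid \mathcal{Y}_n) = n^{-1}\sum_j \hat\varepsilon_j^2 + a_n^2$, cf. (46)) compares the empirical covariance of $\bar R_n(y)$ and $\bar R_n(z)$ over repeated bootstrap draws with the right-hand side of (47):

```python
# Simulation check of the conditional covariance formula (47).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
n, a_n, y, z = 400, 0.3, -0.5, 0.8
e = rng.standard_normal(n)
e -= e.mean()                                            # fixed centred residuals

F = lambda t: np.mean(norm.cdf((t - e) / a_n))           # \hat F_n
f = lambda t: np.mean(norm.pdf((t - e) / a_n)) / a_n     # \hat f_n
Fy, Fz, fy, fz = F(y), F(z), f(y), f(z)
var_star = np.mean(e**2) + a_n**2                        # var(eps* | Y_n)

def one_draw():
    s = e[rng.integers(0, n, size=n)] + a_n * rng.normal(size=n)
    Ry = np.sum((s <= y) - Fy + fy * s) / np.sqrt(n)
    Rz = np.sum((s <= z) - Fz + fz * s) / np.sqrt(n)
    return Ry, Rz

draws = np.array([one_draw() for _ in range(4000)])
emp = np.cov(draws.T)[0, 1]

# psi(t) = E[eps* I{eps* <= t} | Y_n], by numerical integration against f_hat_n
ts = np.linspace(-8.0, 8.0, 4001)
dens = np.mean(norm.pdf((ts[:, None] - e[None, :]) / a_n), axis=1) / a_n
psi = lambda t: np.trapz(ts * dens * (ts <= t), ts)

formula = F(min(y, z)) - Fy * Fz + fy * fz * var_star + fy * psi(z) + fz * psi(y)
print(f"simulated: {emp:.4f}   formula (47): {formula:.4f}")
```

Up to Monte Carlo error the two printed values should agree, consistent with the covariance computation above.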