Published in: Journal of Statistical Planning and Inference, December 2010. DOI: 10.1016/j.jspi.2010.04.039.
Goodness-of-fit testing under long memory¹
Hira L. Koul and Donatas Surgailis
Michigan State University and Vilnius Institute of Mathematics & Informatics
Abstract
This paper discusses the problem of fitting a distribution function to the marginal distribution of a long memory moving average process. Because of the uniform reduction principle, unlike in the i.i.d. set up, classical tests based on the empirical process are relatively easy to implement. More importantly, we discuss fitting the marginal distribution of the error process in location, scale and linear regression models. An interesting observation is that in the location model, or more generally in linear regression models with a non-zero intercept parameter, the null weak limit of the first order difference between the residual empirical process and the null model is degenerate at zero, and hence it can not be used to asymptotically distinguish between two marginal distributions that differ only in their means. This finding is in sharp contrast to a recent claim of Chan and Ling (2008) that the null weak limit of such a process is a continuous Gaussian process. This note also proposes some tests based on the second order difference for the location case. Another finding is that residual empirical process tests in the scale problem are robust against not knowing the scale parameter.
1 Introduction
The problem of fitting a parametric family of distributions to a probability distribution,
known as the goodness-of-fit testing problem, is well studied in the literature when the
underlying observations are i.i.d. See, for example, Durbin (1973, 1975), Khmaladze (1979,
1981), D’Agostino and Stephens (1986), among others.
A discrete time stationary stochastic process with finite variance is said to have long
memory if its autocorrelations tend to zero hyperbolically in the lag parameter, as the lag
tends to infinity, but their sum diverges. The importance of these processes in econometrics,
hydrology and other physical sciences is abundantly demonstrated in the works of Beran
(1992, 1994), Baillie (1996), Dehling, Mikosch and Sørensen (2002), and Doukhan, Oppenheim and Taqqu (2003), and the references therein.
¹ AMS Classification: Primary 62M09; secondary 62M10, 62M99. Research of the first author was supported in part by the NSF DMS grant 0704130, while that of the second author was supported in part by the bilateral France-Lithuania scientific project Gilibert and the Lithuanian State Science and Studies Foundation grant T-70/09. Corresponding author: Koul. October 9, 2009.
Key words and phrases: Moving averages, residual empirical process, long memory.
We model long memory via moving averages. Let Z := {0, ±1, ···}. We suppose for the time being that the observable process is

    ε_j = Σ_{k=0}^∞ b_k ζ_{j−k},  j ∈ Z,    (1.1)

where ζ_s, s ∈ Z, are i.i.d. with zero mean and unit variance. The constants b_j are assumed to satisfy b_k = 0, k < 0, b_0 = 1, and

    b_j ~ c_0 j^{−(1−d)}, as j → ∞, for some 0 < c_0 < ∞ and 0 < d < 1/2.    (1.2)
Let B(a, b) := ∫_0^1 x^{a−1}(1 − x)^{b−1} dx, a > 0, b > 0. One can directly verify that, as k → ∞,

    Cov(ε_0, ε_k) = Σ_{j=0}^∞ b_j b_{j+k} ~ c_0^2 ∫_0^∞ y^{d−1}(1 + y)^{d−1} dy · k^{−(1−2d)} = c_0^2 B(d, 1 − 2d) k^{−(1−2d)},

so that the process ε_j, j ∈ Z, has long memory.
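The hyperbolic decay just displayed is easy to check numerically. The sketch below (an illustration of ours, not part of the paper; it takes b_j = c_0 j^{−(1−d)} exactly for j ≥ 1 and truncates the series at a lag m, both simplifying assumptions) computes the covariance Σ_j b_j b_{j+k} directly and compares it with the leading term c_0^2 B(d, 1 − 2d) k^{−(1−2d)}:

```python
import math
import numpy as np

def lm_coeffs(d, c0=1.0, m=500000):
    """Coefficients b_0 = 1, b_j = c0 * j^{-(1-d)} of (1.1)-(1.2), truncated at lag m."""
    b = np.empty(m + 1)
    b[0] = 1.0
    b[1:] = c0 * np.arange(1, m + 1, dtype=float) ** (-(1.0 - d))
    return b

def exact_cov(b, k):
    """Cov(eps_0, eps_k) = sum_j b_j b_{j+k} for unit-variance innovations (truncated)."""
    return float(b[:-k] @ b[k:]) if k > 0 else float(b @ b)

def beta_fn(a, bb):
    """Euler beta function B(a, b) via gamma functions."""
    return math.gamma(a) * math.gamma(bb) / math.gamma(a + bb)

def asymptotic_cov(k, d, c0=1.0):
    """Leading term c0^2 B(d, 1-2d) k^{-(1-2d)} of the covariance, as k -> infinity."""
    return c0 ** 2 * beta_fn(d, 1.0 - 2.0 * d) * k ** (-(1.0 - 2.0 * d))
```

For moderate k the ratio of the exact to the asymptotic covariance is already close to one, and it approaches one slowly as k grows, which is the slow hyperbolic decay that drives all the nonstandard asymptotics below.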
Now, let ε denote a copy of ε_1, F denote the marginal d.f. of ε, and F_0 be a known d.f. The problem of interest here is to test the simple hypothesis

    H_0: F = F_0, vs. H_1: F ≠ F_0.
This problem is of interest in applications. For example, in value at risk analysis, cf. Tsay (2002), various probability calculations are based on the assumption that the underlying process is Gaussian. If one were to reject the null hypothesis of the marginal distribution being Gaussian, then such an analysis would be suspect.
In this note we shall discuss the asymptotic behavior of some omnibus tests for H_0 based on the empirical d.f. F_n(x) := n^{−1} Σ_{i=1}^n I(ε_i ≤ x) of ε_i, 1 ≤ i ≤ n.
A bit more interesting, and at the same time surprisingly challenging, problem is to test

    H_0^loc: F(x) = F_0(x − µ), ∀ x ∈ R, for some µ ∈ R; vs. H_1^loc: H_0^loc is not true.

This problem is equivalent to testing for H_0 based on the observations

    Y_i = µ + ε_i,  i = 1, ···, n, for some µ ∈ R,    (1.3)

where now ε_i, i ∈ Z, are unobservable moving average errors as in (1.1). Here tests would be based on

    F̂_n(x) := n^{−1} Σ_{i=1}^n I(Y_i − Ȳ_n ≤ x),  Ȳ_n := n^{−1} Σ_{i=1}^n Y_i,  x ∈ R.
Sometimes we may be interested in testing the equivalence of F to F_0 up to a scale parameter, i.e., to test

    H_0^sc: F(x) = F_0(x/σ), ∀ x ∈ R, for some σ > 0; vs. H_1^sc: H_0^sc is not true.

This problem is equivalent to testing for H_0 based on the observations

    Y_i = σε_i,  i = 1, ···, n, for some σ > 0,    (1.4)

where again ε_i, i ∈ Z, are unobservable moving average errors of (1.1). Here tests will be based on

    F̃_n(x) := n^{−1} Σ_{i=1}^n I(Y_i/σ̂_n ≤ x),  σ̂_n^2 := n^{−1} Σ_{i=1}^n Y_i^2,  x ∈ R.
Another interesting problem is to fit a distribution up to unknown location and scale parameters, i.e., to test the hypothesis

    H̄_0: F(x) = F_0((x − µ)/σ), ∀ x ∈ R and for some µ ∈ R, σ > 0; vs. H̄_1: H̄_0 is not true.

This is equivalent to testing for H_0 based on the observations

    Y_i = µ + σε_i,  i = 1, ···, n, for some µ ∈ R, σ > 0,    (1.5)

where again ε_i, i ∈ Z, are unobservable as in (1.1). Here tests will be based on

    F̌_n(x) := n^{−1} Σ_{i=1}^n I(Y_i − Ȳ_n ≤ x s_n),  s_n^2 := n^{−1} Σ_{i=1}^n (Y_i − Ȳ_n)^2,  x ∈ R.
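For concreteness, the three residual empirical d.f.s just introduced can be computed as follows (an illustrative sketch of ours in plain numpy; the function names are not from the paper):

```python
import numpy as np

def edf(values, x):
    """Empirical d.f. of `values`, evaluated at each point of the array `x`."""
    v = np.sort(np.asarray(values, dtype=float))
    return np.searchsorted(v, x, side="right") / v.size

def location_edf(Y, x):
    """Location case: e.d.f. of Y_i - Ybar_n, i.e. I(Y_i - Ybar_n <= x) averaged."""
    return edf(Y - Y.mean(), x)

def scale_edf(Y, x):
    """Scale case: e.d.f. of Y_i / sigma_hat_n with sigma_hat_n^2 = mean(Y_i^2)."""
    return edf(Y / np.sqrt(np.mean(Y ** 2)), x)

def location_scale_edf(Y, x):
    """Location-scale case: I(Y_i - Ybar_n <= x s_n) averaged, s_n^2 = mean((Y_i - Ybar_n)^2)."""
    s = np.sqrt(np.mean((Y - Y.mean()) ** 2))
    return edf((Y - Y.mean()) / s, x)
```

Each is a plug-in version of F_n with the unknown location and/or scale replaced by its natural moment estimate, exactly as in the three displays above.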
Next, consider the problem of fitting the marginal error d.f. in the linear regression model, where one is given an array of p × 1 design vectors x_{ni}, 1 ≤ i ≤ n, and one observes an array of random variables {Y_{ni}; 1 ≤ i ≤ n} from the model

    Y_{ni} = x'_{ni}β + ε_i,  1 ≤ i ≤ n, for some β ∈ R^p,    (1.6)

with ε_i, i ∈ Z, as in (1.1). Now F denotes the marginal d.f. of the error process. Consider the problem of testing the above H_0 vs. H_1 based on Y_{ni}, 1 ≤ i ≤ n. Let β̂_n be the LSE of β and F̄_n denote the empirical d.f. of the residuals ε̂_i := Y_{ni} − x'_{ni}β̂_n, 1 ≤ i ≤ n.
In the next section we discuss tests for the first problem based on F_n, while section 3 pertains to tests for H_0^loc, H_0^sc and H̄_0 based on F̂_n, F̃_n, and F̌_n, respectively. It is observed that the first order differences n^{1/2−d}(F̂_n − F_0) and n^{1/2−d}(F̌_n − F_0) can not be used to test for H_0^loc and H̄_0, respectively, while tests based on n^{1/2−d}(F̃_n − F_0) for testing H_0^sc have the same large sample behavior as in the case of known σ.
Tests for fitting an error d.f. in the linear regression set up based on n^{1/2−d}(F̄_n − F_0) are discussed in section 4. This process also converges to zero in probability under H_0 if there is a non-zero intercept parameter in (1.6), and thus can not be used to test for H_0 asymptotically. Section 5 suggests some tests of H_0^loc based on the second order difference n^{1−2d}(F̂_n − F_0).
The recent paper Chan and Ling (2008) discussed the asymptotics of residual empirical processes and goodness-of-fit tests for long memory errors. Although the first order approximations in that paper are similar to ours, some of its conclusions for hypothesis testing are incorrect (see Remark 3.1 below). Moreover, Chan and Ling (2008) do not discuss tests based on second order differences.
2 Tests for simple hypothesis
Here, we shall analyze the asymptotic behavior of the first order process n^{1/2−d}(F_n − F_0) when ε_i, i ∈ Z, is an observable process. The Kolmogorov test based on the supremum statistic

    K_n := sup_{x∈R} |F_n(x) − F_0(x)|

will be discussed in some detail.
To proceed, we need to assume, for some C < ∞ and δ > 0,

    |E e^{iuζ_0}| ≤ C(1 + |u|)^{−δ},  E|ζ_0|^3 < ∞.    (2.1)
Let c(θ) := c_0^2 B(d, 1 − 2d)/{d(1 + 2d)}, θ := (c_0, d)', and ε̄_n := n^{−1} Σ_{i=1}^n ε_i. Then, from Giraitis, Koul and Surgailis (1996) (GKS) and Koul and Surgailis (2002), we obtain that under H_0, F_0 is infinitely differentiable with smooth and bounded Lebesgue density f_0, and

    sup_{x∈R} |n^{1/2−d}(F_n(x) − F_0(x)) + f_0(x) n^{1/2−d} ε̄_n| = o_p(1), (H_0).    (2.2)
Dehling and Taqqu (1989) proved an analog of (2.2) when εi, i ∈ Z, is a stationary
Gaussian process, and coined the phrase uniform reduction principle (URP) for this type
of result. Koul and Mukherjee (1993) established URP when εi, i ∈ Z, is subordinated to
a Gaussian process, and GKS proved it when εi, i ∈ Z, is a long memory moving average
process and under a higher moment assumption on ζ0. Koul and Surgailis (2002) obtained
the above expansion under the third moment assumption as in (2.1).
In the sequel, u_p(1) denotes a sequence of stochastic processes indexed by x ∈ R and tending to zero, uniformly over x ∈ R, in probability, and ⇒ (→_D) denotes the weak convergence of a sequence of stochastic processes in the Skorohod space D(R̄) with the sup-topology (of r.v.'s), where R̄ := [−∞, ∞]. For any smooth function g from R to R, ġ and g̈ denote its first and second derivatives, respectively.
Now, by a result of Davydov (1970),

    c^{−1/2}(θ) n^{1/2−d} ε̄_n →_D N(0, 1).    (2.3)

Hence, from (2.2) we obtain

    n^{1/2−d}(F_n(x) − F_0(x)) ⇒ −c^{1/2}(θ) f_0(x) Z,    (2.4)

where Z is a N(0, 1) r.v. Consequently, with ‖f_0‖_∞ := sup_{x∈R} f_0(x),

    n^{1/2−d} K_n →_D c^{1/2}(θ) |Z| ‖f_0‖_∞,

and we readily obtain that under H_0,

    K_n(θ) := n^{1/2−d} K_n / (c^{1/2}(θ) ‖f_0‖_∞) →_D |Z|.
Implementation of this test requires a consistent estimator of c_0 and a log(n)-consistent estimator of d. Dalla et al. (2006) show that the semi-parametric local Whittle estimators ĉ_0 and d̂ of c_0 and d satisfy these conditions. Let θ̂ := (ĉ_0, d̂)'. The proposed test would reject H_0 whenever K_n(θ̂) is large.
Clearly, the determination of the asymptotic critical values of this test is simple compared to its analog in the i.i.d. set up. For example, the test that rejects H_0 whenever K_n(θ̂) > z_{α/2} would be of asymptotic size α, 0 < α < 1, where z_α is the 100(1 − α)th percentile of the standard normal distribution.
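The studentized statistic and its size-α decision can be sketched as follows (an illustration of ours, not the paper's software: the plug-in estimates ĉ_0 and d̂ are assumed to be supplied externally, e.g. from a local Whittle fit, which is not implemented here):

```python
import math
import numpy as np
from statistics import NormalDist

def beta_fn(a, b):
    """Euler beta function B(a, b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def c_theta(c0, d):
    """c(theta) = c0^2 B(d, 1-2d) / (d (1+2d)), as defined before (2.2)."""
    return c0 ** 2 * beta_fn(d, 1.0 - 2.0 * d) / (d * (1.0 + 2.0 * d))

def kolmogorov_statistic(eps, c0_hat, d_hat, f0_cdf, f0_sup):
    """Studentized Kolmogorov statistic K_n(theta_hat) of Section 2.

    eps    : the observed sample (the simple-hypothesis case, errors observable);
    f0_cdf : the null c.d.f. F0, applied elementwise to a numpy array;
    f0_sup : ||f0||_inf, the sup of the null density.
    """
    eps = np.sort(np.asarray(eps, dtype=float))
    n = eps.size
    F0v = f0_cdf(eps)
    i = np.arange(1, n + 1)
    # sup_x |F_n(x) - F0(x)| is attained at the jump points of F_n
    Kn = max(np.max(np.abs(i / n - F0v)), np.max(np.abs((i - 1) / n - F0v)))
    return n ** (0.5 - d_hat) * Kn / (math.sqrt(c_theta(c0_hat, d_hat)) * f0_sup)

def rejects_h0(stat, alpha=0.05):
    """Size-alpha decision: reject H0 when the statistic exceeds z_{alpha/2}."""
    return stat > NormalDist().inv_cdf(1.0 - alpha / 2.0)
```

The only nonstandard ingredients relative to the classical Kolmogorov test are the n^{1/2−d̂} rate and the scalar studentizer c^{1/2}(θ̂)‖f_0‖_∞, which is what makes the critical values simply normal quantiles.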
Needless to say, similar statements apply to any other tests based on continuous functionals of the first order difference n^{1/2−d}(F_n − F_0). For example, the Cramér-von Mises test that rejects H_0 whenever C_n(θ̂) > χ²_{1,α} would also be of asymptotic size α, where χ²_{1,α} is the 100(1 − α)th percentile of the chi-square distribution with 1 degree of freedom. Here

    C_n := ∫ [F_n(x) − F_0(x)]² dF_0(x),  C_n(θ̂) := n^{1−2d̂} C_n / (c(θ̂) ∫ f_0^3(x) dx).
These findings are thus in complete contrast to the results available in the i.i.d. set up, where one must use the full knowledge of the distribution of the Brownian bridge on [0, 1] to implement tests based on the first order difference n^{1/2}(F_n − F_0) for large samples, cf., Durbin (1973, 1975).
We shall now briefly analyze the asymptotic power of the K_n test. Let F ≠ F_0 be another marginal d.f. of ε such that (2.1) is satisfied and the local Whittle estimators ĉ_0, d̂ are consistent and log(n)-consistent for c_0, d. Then, arguing as above, F has a smooth Lebesgue density f and

    K_n(θ̂) = sup_{x∈R} |−c^{1/2}(θ) f(x) Z + n^{1/2−d}(F(x) − F_0(x))| / (c^{1/2}(θ) ‖f_0‖_∞) + o_p(1).

From this we readily see that the above Kolmogorov test is consistent at this F. It has trivial asymptotic power against sequences of alternatives for which sup_x n^{1/2−d}|F(x) − F_0(x)| → 0. Thus this test can not distinguish the n^{1/2}-neighborhoods of F_0, i.e., this test has asymptotic power α against those {F} in the class of d.f.'s satisfying the above assumed conditions and for which n^{1/2} sup_x |F(x) − F_0(x)| = O(1). Similar conclusions would hold for other tests based on n^{1/2−d}(F_n − F_0).
The asymptotic power of the K_n(θ̂) test against the sequence of local alternatives F = F_0 + n^{−(1/2−d)}∆, where ∆ is absolutely continuous with a.e. derivative ∆̇ bounded, equals

    P( sup_x |−f_0(x) c^{1/2}(θ) Z + ∆(x)| > z_{α/2} c^{1/2}(θ) ‖f_0‖_∞ ).
3 Testing for H_0^loc, H_0^sc, and H̄_0

First, consider testing for H_0^loc. Let now ε̄_n := n^{−1} Σ_{i=1}^n (Y_i − µ). Recall that here F̂_n(x) = F_n(x + ε̄_n).
Proposition 3.1 (URP for the residual empirical process F̂_n.) Assume (1.2) and (2.1) hold. Then,

    sup_{x∈R} n^{1/2−d} |F̂_n(x) − F_0(x)| = o_p(1).    (3.1)

Proof. From (2.2), (2.4) and the mean value theorem, one obtains

    n^{1/2−d}(F̂_n(x) − F_0(x)) = n^{1/2−d}[ F_n(x + ε̄_n) − F_0(x + ε̄_n) + ε̄_n f_0(x + ε̄_n) + ∫_x^{x+ε̄_n} (f_0(u) − f_0(x + ε̄_n)) du ]
                              = u_p(1) + O_p(‖ḟ_0‖_∞ n^{1/2−d} |ε̄_n|²) = u_p(1).

This completes the proof. □
According to Proposition 3.1, the first order difference D_n(x) := n^{1/2−d}(F̂_n(x) − F_0(x)) can not distinguish between two marginal distributions of a long memory moving average process that differ only in their means. This finding is in sharp contrast to what is available in the case of i.i.d. errors, where the first order difference n^{1/2}[F̂_n(x) − F_0(x)] converges weakly to a time transformed Brownian bridge with a drift under this hypothesis, cf., Durbin (1973).
Remark 3.1 Proposition 3.1 contradicts Corollary 3.1 of Chan and Ling (2008), which claims that D_n(x), x ∈ R, converges weakly to a continuous Gaussian process under H_0^loc. This corollary is incorrectly derived from the first order expansion in Theorem 2.1 of Chan and Ling (2008), which is correct and agrees with the expansion in (4.5) below.
Next, consider the scale problem. Here, ε_i = Y_i/σ, and

    F̃_n(x) = n^{−1} Σ_{i=1}^n I(Y_i/σ̂_n ≤ x) = F_n(x σ̂_n/σ).
Proposition 3.2 (URP for the residual empirical process F̃_n.) Assume (1.2), (2.1) hold and Eζ_0^4 < ∞. Then,

    sup_{x∈R} |n^{1/2−d}[F̃_n(x) − F_0(x)] + n^{1/2−d} ε̄_n f_0(x)| = o_p(1).
Proof. Without loss of generality, assume Eε_t^2 = Σ_{j=0}^∞ b_j^2 = 1. We have

    E(σ̂_n^2 − σ^2)² = σ^4 n^{−2} E( Σ_{i=1}^n (ε_i^2 − 1) )² = σ^4 n^{−2} Σ_{i,j=1}^n Cov(ε_i^2, ε_j^2).

Let χ_4 := Eζ_0^4 − 3 be the 4th cumulant of ζ_0. By an elementary computation (see Giraitis and Surgailis (1990, (2.3))),

    Cov(ε_i^2, ε_j^2) = ( Σ_{k=0}^∞ b_k b_{k+i−j} )² + χ_4 Σ_{k=0}^∞ b_k^2 b_{k+i−j}^2 = (Cov(ε_i, ε_j))² + O(|i − j|^{−2(1−d)}),

implying Cov(ε_i^2, ε_j^2) = O((Cov(ε_i, ε_j))²) = O(|i − j|^{−2(1−2d)}), as |i − j| → ∞. Whence it follows that

    σ̂_n/σ − 1 = O_p(n^{−1/2}),            0 < d < 1/4,    (3.2)
             = O_p((log(n)/n)^{1/2}),    d = 1/4,
             = O_p(n^{−(1−2d)}),         1/4 < d < 1/2.
Now, let ∆_n := σ̂_n/σ − 1. Recall from Lemma 5.1 of Koul and Surgailis (2002) (KS) that under (2.1), F_0 is infinitely smooth with density f_0 satisfying

    sup_{x∈R} (1 + x²)(f_0(x) + |ḟ_0(x)|) < ∞.

This bound and (5.12) of Lemma 5.2 of KS imply

    |F_0(x + x∆_n) − F_0(x)| = | ∫_0^{x∆_n} f_0(x + u) du | ≤ C(1 + x²)^{−1}(|x∆_n| + x²∆_n²) ≤ C(|∆_n| + ∆_n²),    (3.3)
    |f_0(x + x∆_n) − f_0(x)| = | ∫_0^{x∆_n} ḟ_0(x + u) du | ≤ C(1 + x²)^{−1}(|x∆_n| + x²∆_n²) ≤ C(|∆_n| + ∆_n²).
Hence,

    n^{1/2−d} |F̃_n(x) − F_0(x) + ε̄_n f_0(x)|
        ≤ n^{1/2−d} |F_n(xσ̂_n/σ) − F_0(xσ̂_n/σ) + ε̄_n f_0(xσ̂_n/σ)|
          + n^{1/2−d} |ε̄_n| |f_0(x) − f_0(x + x∆_n)| + n^{1/2−d} |F_0(x + x∆_n) − F_0(x)|
        ≤ u_p(1) + C n^{1/2−d}(|∆_n| + ∆_n²).

The statement of the proposition now follows from this bound, (3.2) and the fact that 0 < d < 1/2. □
Note that the URP for F̃_n is precisely the same as that for F_n given in (2.2). In other words, the asymptotic null distribution of tests based on n^{1/2−d}(F̃_n − F_0) for testing H_0^sc is the same as that of tests based on n^{1/2−d}(F_n − F_0) for testing H_0. This is a kind of robustness property of these tests against the unknown error variance. It is also unlike the above situation in the location model, and unlike the situation in the i.i.d. set up, where n^{1/2}(F̃_n − F_0) weakly converges to a Brownian bridge with a drift, cf., e.g., Durbin (1973).
Location-Scale Problem. Now consider the problem of testing for H̄_0. Assume, as in the scale problem, that Eζ_0^4 < ∞. Let δ_n := (s_n − σ)/σ, ε_i := (Y_i − µ)/σ, and ε̄_n := (Ȳ_n − µ)/σ. Then, with F̂_n the same as in Proposition 3.1,

    F̌_n(x) = n^{−1} Σ_{i=1}^n I(Y_i − Ȳ_n ≤ x s_n) = n^{−1} Σ_{i=1}^n I(ε_i ≤ x + xδ_n + ε̄_n) = F̂_n(x + xδ_n).

Also note that the bound (3.2) continues to hold with σ̂_n replaced by s_n under the current set up. Using this fact, (3.1), and an argument as in deriving the bound in (3.3), we readily obtain

    n^{1/2−d} sup_x |F̌_n(x) − F_0(x)| ≤ n^{1/2−d} sup_x |F̂_n(x) − F_0(x)| + n^{1/2−d} sup_x |F_0(x + xδ_n) − F_0(x)|
                                     = o_p(1) + C n^{1/2−d}(|δ_n| + δ_n²)
                                     = o_p(1), under H̄_0.
Thus, here also the location case dominates, in the sense that the first order difference n^{1/2−d}[F̌_n(x) − F_0(x)], x ∈ R, is not useful for fitting a marginal d.f. up to unknown location and scale parameters.
4 Fitting an error d.f. in a regression model
Recall the linear regression model (1.6) and the definition of F̄_n from section 1, viz.,

    F̄_n(x) = n^{−1} Σ_{i=1}^n I(Y_{ni} − x'_{ni}β̂_n ≤ x) = n^{−1} Σ_{i=1}^n I(ε_i ≤ x + x'_{ni}(β̂_n − β)),

where β̂_n is the LSE of β ∈ R^p.
Proposition 4.1 (URP for the residual empirical process F̄_n.) Assume the same conditions on the errors as in Proposition 3.1. Assume the matrix X_n, whose ith row is x'_{ni}, 1 ≤ i ≤ n, is of full rank. Let D_n := (X'_n X_n)^{1/2}. Assume additionally

    n^{1/2} max_{1≤i≤n} ‖D_n^{−1} x_{ni}‖ = O(1).    (4.1)

Then,

    sup_{x∈R} | n^{1/2−d}[F̄_n(x) − F_0(x)] − f_0(x) n^{−1/2−d} Σ_{i=1}^n [x'_{ni}(β̂_n − β) − ε_i] | = o_p(1).    (4.2)
Proof. To begin with, we have the uniform linearity of the residual empirical processes: for any nonrandom array {ξ_{ni}, 1 ≤ i ≤ n} of real numbers such that max_{1≤i≤n} |ξ_{ni}| = O(1),

    sup_{x∈R} | n^{−1/2−d} Σ_{i=1}^n [ I(ε_i ≤ x + ξ_{ni} n^{d−1/2}) − I(ε_i ≤ x) − n^{d−1/2} ξ_{ni} f_0(x) ] | = o_p(1).    (4.3)

This result was proved in GKS (Theorem 1, (1.9)) under a higher moment assumption on ζ_0. Under the current third moment assumption it follows from (3.16) of Koul and Surgailis (2003) and an argument as in GKS.
The result (4.3) entails the following fact. For any 0 < b < ∞ and any nonrandom array {c_{ni}, 1 ≤ i ≤ n, n ≥ 1}, c_{ni} ∈ R^p, with max_{1≤i≤n} ‖c_{ni}‖ = O(1),

    sup_{x∈R, ‖s‖≤b} n^{−1/2−d} | Σ_{i=1}^n [ I(ε_i ≤ x + c'_{ni}s n^{d−1/2}) − I(ε_i ≤ x) − c'_{ni}s n^{d−1/2} f_0(x) ] | = o_p(1).    (4.4)

This is proved using arguments as in Koul (2002). For the sake of completeness we reproduce this argument in the Appendix below.
Let v_{ni} := n^{1/2} D_n^{−1} x_{ni}. Now write x'_{ni}(β̂_n − β) = v'_{ni} n^{d−1/2} n^{−d} D_n(β̂_n − β). Recall that

    D_n(β̂_n − β) = D_n^{−1} Σ_{i=1}^n x_{ni} ε_i = n^{−1/2} Σ_{i=1}^n v_{ni} ε_i.

In view of (1.2), we have |Eε_i ε_j| ≤ C|i − j|^{2d−1}, for all i ≠ j. Hence, in view of (4.1),

    E‖D_n(β̂_n − β)‖² = n^{−1} Σ_{i,j=1}^n v'_{ni} v_{nj} Eε_i ε_j ≤ C n^{−1} Σ_{i,j=1}^n |Eε_i ε_j| ≤ C n^{2d},
    n^{−d} D_n(β̂_n − β) = O_p(1).
Now (4.4), applied with c_{ni} = v_{ni}, s = n^{−d} D_n(β̂_n − β), and a routine argument yields

    sup_{x∈R} | n^{1/2−d}[F̄_n(x) − F_n(x)] − n^{−d−1/2} Σ_{i=1}^n x'_{ni}(β̂_n − β) f_0(x) | = o_p(1).    (4.5)

Write ∆_n for the l.h.s. of (4.2). From (4.5) and (2.2) we obtain

    ∆_n = sup_{x∈R} | n^{1/2−d}[F̄_n(x) − F_n(x)] − f_0(x) n^{−d−1/2} Σ_{i=1}^n x'_{ni}(β̂_n − β)
                      + n^{1/2−d}[F_n(x) − F_0(x)] + f_0(x) n^{−d−1/2} Σ_{i=1}^n ε_i |
        = o_p(1),

proving (4.2). □
Now, let v̄_n := n^{−1} Σ_{i=1}^n v_{ni}, and

    Z_n := n^{−1/2−d} Σ_{i=1}^n [x'_{ni}(β̂_n − β) − ε_i] = n^{−1/2−d} Σ_{i=1}^n [v̄'_n v_{ni} − 1] ε_i.

Note that if the regression model (1.6) is the one sample location model (1.3), i.e., if in (1.6), p = 1, x_{ni} ≡ 1, β = µ, then v_{ni} ≡ 1, the LSE is Ȳ_n, F̄_n(x) = F̂_n(x) and Z_n ≡ 0, so that we again obtain the conclusion (3.1).
More generally, Z_n = 0, for all n ≥ 1, w.p.1, whenever there is a non-zero intercept parameter in the model (1.6). To see this, and to keep the exposition transparent, consider the model (1.6) with p = 2, x'_{ni} = (1, a_i), where a_1, ···, a_n are some known constants with σ_a² := Σ_{i=1}^n (a_i − ā)² > 0, and ā := n^{−1} Σ_{i=1}^n a_i. Then,

    v̄'_n v_{ni} = n (1, ā) (X'_n X_n)^{−1} (1, a_i)'
              = (n/σ_a²) (1, ā) ( n^{−1} Σ_{j=1}^n a_j²   −ā
                                  −ā                       1 ) (1, a_i)' = 1,  ∀ i = 1, ···, n.
Thus, as in the location problem, as long as there is a non-zero intercept parameter present in the regression model (1.6), the first order difference n^{1/2−d}(F̄_n − F_0) can not be used to test for H_0 in these cases.
Next, if in (1.6), Σ_{i=1}^n x_{ni} = 0, then Z_n = −n^{−1/2−d} Σ_{i=1}^n ε_i, and by (2.3) we again obtain the analog of (2.4) for F̄_n. In other words, if the design vectors corresponding to the slope parameters are orthogonal to the vector of 1's, then the asymptotic null distribution of n^{1/2−d}(F̄_n − F_0) is not affected by not knowing the slope parameters, and is the same as in (2.4). This fact has nothing to do with long memory. An analogous fact is available in the i.i.d. errors set up also, cf., Ch. 6 in Koul (2002).
In general Z_n is a weighted sum of a long memory moving average process. The following lemma gives a CLT for such r.v.'s.

Lemma 4.1 Let ε_i, i ∈ Z, be a linear process as in (1.1) and (1.2). Let {c_{ni} ∈ R^p, 1 ≤ i ≤ n} be uniformly bounded nonrandom weights, i.e., sup_{n≥1} sup_{1≤i≤n} ‖c_{ni}‖ < ∞, and let

    Σ_n := Cov( n^{−d−1/2} Σ_{i=1}^n c_{ni} ε_i ).

Assume that the limit lim_{n→∞} Σ_n = Σ exists and is positive definite. Then,

    n^{−d−1/2} Σ_{i=1}^n c_{ni} ε_i →_D N_p(0, Σ).
The proof of Lemma 4.1 is given in the Appendix. Now consider the weighted sum Z_n. Clearly, under (4.1), the weights c_{ni} ≡ v̄'_n v_{ni} − 1 are uniformly bounded. So one needs only the existence of the limit of Σ_n = Var(Z_n) in order to apply the above CLT to Z_n.
As an example of a design where this limit exists and the above lemma is applicable to Z_n, consider the first degree polynomial regression through the origin, where p = 1 and x_{ni} ≡ i/n. Here (4.1) is satisfied, and a_n := n^{1/2} D_n^{−1} = n^{1/2}( Σ_{i=1}^n (i/n)² )^{−1/2} ~ 3^{1/2}, v_{ni} := (i/n) a_n, v̄_n ~ 3^{1/2}/2, and

    Σ_n ≡ Σ_n(θ) ~ 2c_0² B(d, 1 − 2d) n^{−(1+2d)} Σ_{1≤i<j≤n} [v̄_n (i/n) a_n − 1][v̄_n (j/n) a_n − 1] (j − i)^{−(1−2d)}
                ~ 2c_0² B(d, 1 − 2d) [ (3/2) B(2, 2d) − 1/(2d) ][ 3/(2(2 + 2d)) − 1/(2d + 1) ] =: Σ(θ) ≡ Σ.
Hence we can apply the above lemma to Z_n to conclude that Z_n →_D N(0, Σ(θ)). The role of c(θ) in the simple hypothesis case is now played by Σ(θ). Consequently, here the analog of the Kolmogorov test for testing that the error d.f. is F_0 would be based on

    K̄_n := n^{1/2−d̂} sup_x |F̄_n(x) − F_0(x)| / ( Σ(θ̂)^{1/2} ‖f_0‖_∞ ),

where now θ̂ is the local Whittle estimator of θ based either on the Y_{ni}'s or on the residuals Y_{ni} − x'_{ni}β̂_n. Consistency of θ̂ for θ under (4.1) follows from Dalla et al. (2006). Other tests based on F̄_n may be modified in a similar fashion. Needless to say, these conclusions remain valid for any finite degree polynomial regression model.
5 Tests for H_0^loc based on the second order difference

In view of (3.1), it is natural to ask whether there is a higher order difference of F̂_n − F_0 that will provide a reasonable test of H_0^loc. In order to attempt to answer this question, we need to recall a result from Ho and Hsing (1996) and Koul and Surgailis (2002). Let ε_j^(0) := 1, ε_j^(1) := ε_j of (1.1), and let

    ε_j^(k) := Σ_{s_k<···<s_1≤j} b_{j−s_1} ··· b_{j−s_k} ζ_{s_1} ··· ζ_{s_k}

be a polynomial of order k ≥ 2 in the i.i.d. random variables ζ_s, s ∈ Z. This series converges in mean square for each k ≥ 1 under the conditions Σ_{k=0}^∞ b_k² < ∞, Eζ_0² < ∞ alone. Moreover, for each k ≥ 1, the process ε_j^(k), j ∈ Z, is strictly stationary with zero mean and covariance

    E(ε_0^(k) ε_j^(k)) = Σ_{0≤i_1<···<i_k} b_{j+i_1} b_{i_1} ··· b_{j+i_k} b_{i_k},    (5.1)

and, for any integers j, i, E ε_j^(k) ε_i^(ℓ) = 0 (k ≠ ℓ, k, ℓ = 0, 1, ···). It follows from (1.2) and (5.1) that, for each k ≥ 1,

    |E(ε_0^(k) ε_j^(k))| ≤ (1/k!) ( Σ_{i=0}^∞ |b_{j+i} b_i| )^k = O(j^{−k(1−2d)}), j → ∞,

    E( Σ_{j=1}^n ε_j^(k) )² = O(n^{2−k(1−2d)}),  k(1 − 2d) < 1,
                           = O(n log(n)),       k(1 − 2d) = 1,
                           = O(n),              k(1 − 2d) > 1,  n → ∞.
Let

    Z_n^(k) := n^{k(1/2−d)−1} Σ_{j=1}^n ε_j^(k),  k ≥ 1.

Note that

    Z_n^(1) = n^{1/2−d} ε̄_n,  Z_n^(2) = n^{−2d} Σ_{j=1}^n Σ_{s_2<s_1≤j} b_{j−s_1} b_{j−s_2} ζ_{s_1} ζ_{s_2}.    (5.2)
Assume 1/(1 − 2d) is not an integer and let k* denote the greatest integer in 1/(1 − 2d), i.e., k* := [1/(1 − 2d)]. Introduce the multiple Wiener-Itô integral

    Z^(k) := (c_0^k / k!) ∫_{R^k} { ∫_0^1 Π_{j=1}^k (u − s_j)_+^{−(1−d)} du } W(ds_1) ··· W(ds_k)

w.r.t. a Gaussian white noise W(ds), E(W(ds))² = ds, which is well-defined for 1 ≤ k ≤ k*. From Surgailis (2003a) we obtain, under the conditions (1.2) and Eζ_0² < ∞,

    (Z_n^(k), 0 ≤ k ≤ k*) →_D (Z^(k), 0 ≤ k ≤ k*).    (5.3)
Note that Z^(1) is a Gaussian r.v., while Z^(2) equals the Rosenblatt process at 1, cf. Taqqu (1975).
We are now ready to state the following theorem, due to Ho and Hsing (1996), giving a higher order expansion of the empirical process.
Theorem 5.1 Let {ε_j} be a long memory moving average satisfying (1.1) and (1.2). Suppose the d.f. of ζ_0 is k* + 3 times differentiable with bounded, continuous and integrable derivatives, and E|ζ_0|^4 < ∞. Then, under H_0,

    F_n(x) − F_0(x) = Σ_{1≤k≤k*} (−1)^k n^{−k(1−2d)/2} Z_n^(k) d^{k−1}f_0(x)/dx^{k−1} + n^{−1/2} Q_n(x),

with

    sup_{x∈R} |Q_n(x)| = O_p(n^δ), ∀ δ > 0.    (5.4)
An open (although technical) problem is to show the existence of a Gaussian limit of the remainder term Q_n(x) in Theorem 5.1 (see the discussion in Koul and Surgailis (2002, pp. 220-221)). The latter problem is solved in Koul and Surgailis (2002, Theorem 2.1) in the important special case of Gaussian errors {ε_j}. Below, we formulate this result for k* = 1, or 0 < d < 1/4, only. Let

    R_i(x) := I(ε_i ≤ x) − F_0(x) + f_0(x) ε_i,  i ∈ Z, x ∈ R.

Proposition 5.1 Let {ε_j} be a Gaussian long memory moving average satisfying (1.1) and (1.2), 0 < d < 1/4. Then

    Q_n(x) := n^{1/2}{ F_n(x) − F_0(x) + f_0(x) ε̄_n } ⇒ Q(x),    (5.5)

where {Q(x), x ∈ R} is a Gaussian process with zero mean and covariance

    Cov(Q(x), Q(y)) = Σ_{j∈Z} Cov(R_0(x), R_j(y)).
Proposition 5.2 (i) Assume the same conditions as in Theorem 5.1, and let 1/4 < d < 1/2. Then,

    sup_x | n^{1−2d}{F̂_n(x) − F_0(x)} − ḟ_0(x)[Z_n^(2) − 2^{−1}(Z_n^(1))²] | = o_p(1),    (5.6)

and

    n^{1−2d}{F̂_n(x) − F_0(x)} ⇒ ḟ_0(x) Y,    (5.7)

where Y := Z^(2) − 2^{−1}(Z^(1))². Moreover,

    EY = −2^{−1} c(θ),    (5.8)
    EY² = (c_0^4 B²(d, 1 − 2d)/(2d)) { 1/(2(4d − 1)) + 3/(2d(1 + 2d)²) − 1/(d(4d + 1)) − B(1 + 2d, 1 + 2d)/d }.

(ii) Assume the conditions of Proposition 5.1, 0 < d < 1/4. Then

    n^{1/2}{F̂_n(x) − F_0(x)} ⇒ Q(x),    (5.9)

where Q(x), x ∈ R, is the Gaussian process in (5.5).
Proof. (i) Note that d > 1/4 implies 2d − 1/2 > 0 and k* ≥ 2. Combining these facts with the expansion in Theorem 5.1 and (5.4), with a δ < 2d − 1/2, yields the second order expansion under H_0:

    sup_{x∈R} | n^{1−2d}{ F_n(x) − F_0(x) + f_0(x) ε̄_n } − ḟ_0(x) Z_n^(2) | = o_p(1).    (5.10)

The decomposition

    F̂_n(x) − F_0(x) = F_n(x + ε̄_n) − F_0(x + ε̄_n) + F_0(x + ε̄_n) − F_0(x)

and (5.10) yield

    sup_{x∈R} | n^{1−2d}{ F̂_n(x) − F_0(x) + ε̄_n f_0(x + ε̄_n) } − ḟ_0(x + ε̄_n) Z_n^(2) − n^{1−2d}{ F_0(x + ε̄_n) − F_0(x) } | = o_p(1).

Using the Taylor expansions of f_0, ḟ_0 and the boundedness of f̈_0, we thus obtain

    n^{1−2d}{F̂_n(x) − F_0(x)}
        = −n^{1−2d} ε̄_n f_0(x + ε̄_n) + ḟ_0(x + ε̄_n) Z_n^(2) + n^{1−2d}{ F_0(x + ε̄_n) − F_0(x) } + u_p(1)
        = −n^{1−2d} ε̄_n [ f_0(x) + ε̄_n ḟ_0(x) + (ε̄_n²/2) f̈_0(x + ξ_n) ] + [ ḟ_0(x) + ε̄_n f̈_0(x + ξ_n) ] Z_n^(2)
          + n^{1−2d} [ ε̄_n f_0(x) + (ε̄_n²/2) ḟ_0(x) + (ε̄_n³/6) f̈_0(x + ξ_n) ] + u_p(1)
        = ḟ_0(x) [ Z_n^(2) − (ε̄_n²/2) n^{1−2d} ] + O_p(n^{1−2d} ε̄_n³) + u_p(1),

where ξ_n is a sequence of r.v.'s with |ξ_n| ≤ |ε̄_n|, (ε̄_n)² n^{1−2d} = (Z_n^(1))² →_D (Z^(1))², and n^{1−2d}(ε̄_n)³ = O_p(n^{d−1/2}) = o_p(1). This proves (5.6). Relation (5.7) follows from (5.6) and (5.3).
(ii) Using the definition of Q_n(x) in (5.5), rewrite n^{1/2}(F̂_n(x) − F_0(x)) = Q_n(x + ε̄_n) + U_n(x), where U_n(x) := n^{1/2}( F_0(x + ε̄_n) − F_0(x) − f_0(x + ε̄_n) ε̄_n ). By a Taylor expansion,

    sup_x |U_n(x)| ≤ n^{1/2} (ε̄_n)² sup_x |ḟ_0(x)| = O_p(n^{2d−1/2}) = o_p(1).

Therefore (5.9) follows from (5.5) and

    sup_x |Q_n(x + ε̄_n) − Q_n(x)| = o_p(1).    (5.11)

For any δ_1, δ_2 > 0,

    P( sup_x |Q_n(x + ε̄_n) − Q_n(x)| > δ_1 ) ≤ P(|ε̄_n| ≥ δ_2) + P( sup_{|x−y|<δ_2} |Q_n(x) − Q_n(y)| > δ_1 ).

Since the sequence Q_n, n ≥ 1, is tight with respect to the sup-topology on R̄, the last probability can be made arbitrarily small, uniformly in all large n, by choosing δ_2 small enough. Since ε̄_n = o_p(1), this proves (5.11), and the proposition too. □
Proof of (5.8). From the definition of Z^(k) in (5.3) and the diagram formula for Wiener-Itô integrals (Surgailis (2003b)), one obtains

    E Z^(2) = 0,
    E(Z^(1))² = c_0² ∫_R { ∫_0^1 (u − s)_+^{d−1} du }² ds = c_0² B(d, 1 − 2d)/(d(1 + 2d)) = c(θ),
    E(Z^(1))⁴ = 3( E(Z^(1))² )² = 3c²(θ),
    E(Z^(2))² = 2^{−1} c_0^4 ∫_{R²} { ∫_0^1 (u − s_1)_+^{d−1} (u − s_2)_+^{d−1} du }² ds_1 ds_2 = c_0^4 B²(d, 1 − 2d)/(4d(4d − 1)),

and

    E Z^(2)(Z^(1))² = c_0^4 ∫_R ∫_R ds_1 ds_2 ∫_0^1 (u − s_1)_+^{d−1} (u − s_2)_+^{d−1} du × ∫_0^1 (v − s_1)_+^{d−1} dv ∫_0^1 (w − s_2)_+^{d−1} dw
        = c_0^4 B²(d, 1 − 2d) ∫_0^1 ∫_0^1 ∫_0^1 |u − v|^{2d−1} |u − w|^{2d−1} du dv dw
        = c_0^4 B²(d, 1 − 2d) ∫_0^1 { (2d)^{−1}(u^{2d} + (1 − u)^{2d}) }² du
        = (c_0^4 B²(d, 1 − 2d)/(2d²)) ( 1/(4d + 1) + B(2d + 1, 2d + 1) ).

This completes the proof of Proposition 5.2. □
Now, let d̂ be a log(n)-consistent estimator of d and

    D_n := sup_{x∈R} n^{1−2d̂} |F̂_n(x) − F_0(x)|.

From Proposition 5.2 we readily obtain that the asymptotic null distribution of D_n is the same as that of ‖ḟ_0‖_∞ |Y|. Consequently, the critical values of the test that rejects H_0^loc whenever D_n/‖ḟ_0‖_∞ is large may be determined from the distribution of Y. Unfortunately this distribution depends on c_0, d in a complicated fashion, and is not easy to track. However, one may use a Monte Carlo method to simulate the distribution of this r.v. corresponding to c_0 = ĉ_0, d = d̂, where ĉ_0 is a consistent estimator of c_0.
Higher order moments of the limiting r.v. Y can be computed using the diagram formula for Wiener-Itô integrals, see e.g. Major (1981, Corollary 5.4), Surgailis (2003b, Prop. 5.1), although the resulting formulas are rather cumbersome. The characteristic function of the r.v. Z^(2), and its representation as an infinite weighted sum of chi-squares, were obtained by Rosenblatt (1961). See also Taqqu (1975, (6.2)).
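The Monte Carlo route just mentioned can be sketched as follows: simulate replicates of Y_n := Z_n^(2) − (Z_n^(1))²/2 at c_0 = ĉ_0, d = d̂ and use their empirical quantiles as approximate critical values. This is our own illustration; the truncation lag, sample size and replication count below are arbitrary choices, not recommendations from the paper:

```python
import numpy as np

def simulate_Y(c0, d, n=1000, m=1000, reps=200, seed=0):
    """Approximate draws of Y = Z^(2) - (Z^(1))^2 / 2 (relevant for 1/4 < d < 1/2)
    via finite-n replicates of Z^(1)_n, Z^(2)_n computed from simulated Gaussian
    innovations; the moving average coefficients are truncated at lag m."""
    rng = np.random.default_rng(seed)
    b = np.empty(m + 1)
    b[0] = 1.0
    b[1:] = c0 * np.arange(1, m + 1, dtype=float) ** (-(1.0 - d))
    out = np.empty(reps)
    for r in range(reps):
        zeta = rng.standard_normal(n + m)
        eps = np.convolve(zeta, b, mode="valid")       # eps_j, using m pre-sample zetas
        # 2 eps^{(2)}_j = eps_j^2 - sum_k b_k^2 zeta_{j-k}^2
        eps2 = 0.5 * (eps ** 2 - np.convolve(zeta ** 2, b ** 2, mode="valid"))
        Z1 = n ** (-0.5 - d) * eps.sum()
        Z2 = n ** (-2.0 * d) * eps2.sum()
        out[r] = Z2 - 0.5 * Z1 ** 2
    return out
```

For the D_n test one would then compare D_n/‖ḟ_0‖_∞ with an upper quantile of |Y|, e.g. np.quantile(np.abs(simulate_Y(c0_hat, d_hat)), 1 - alpha).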
6 Appendix
Proof of (4.4). For s ∈ R^p, z, y ∈ R, put

    W(s, z; y) := n^{−1/2−d} Σ_{i=1}^n [ I(ε_i ≤ y + (c'_{ni}s + ‖c_{ni}‖z) n^{d−1/2}) − I(ε_i ≤ y) − f_0(y)(c'_{ni}s + ‖c_{ni}‖z) n^{d−1/2} ].

Fact (4.3), applied with ξ_{ni} = c'_{ni}s + ‖c_{ni}‖z, implies

    sup_y |W(s, z; y)| = o_p(1), ∀ s ∈ R^p, z ∈ R.    (6.1)
Note that W(s; y) := W(s, 0; y) is the empirical process on the left hand side of (4.4). It suffices to show that for any ε > 0 there exists an N_ε ≥ 1 such that for any n > N_ε and for every 0 < b < ∞,

    P( sup_{y∈R, ‖t‖≤b} |W(t; y)| > 2ε ) < 2ε.    (6.2)
Fix a 0 < b < ∞ and an ε > 0. Because the ball K_b := {s ∈ R^p: ‖s‖ ≤ b} is compact, for any δ > 0 there exist k = k(δ, b) and points s_1, ···, s_k in K_b such that for any t ∈ K_b, ‖t − s_j‖ < δ, for some j = 1, 2, ···, k. Hence,

    P( sup_{y∈R, ‖t‖≤b} |W(t; y)| > 2ε )    (6.3)
        ≤ Σ_{j=1}^k P( sup_{t∈K_b, ‖t−s_j‖<δ, y∈R} |W(t; y) − W(s_j; y)| > ε ) + Σ_{j=1}^k P( sup_y |W(s_j; y)| > ε ).
Note that for any t ∈ R^p such that ‖t − s_j‖ < δ,

    c'_{ni}s_j − ‖c_{ni}‖δ ≤ c'_{ni}t ≤ c'_{ni}s_j + ‖c_{ni}‖δ,  ∀ 1 ≤ i ≤ n.

Let δ_1 := ε/(18C‖f_0‖_∞), where C is such that max_i ‖c_{ni}‖ ≤ C. The above inequality and the monotonicity of the indicator function imply that for all δ < δ_1, and for all ‖t − s_j‖ < δ,

    |W(t; y) − W(s_j; y)| ≤ |W(s_j, δ_1; y)| + |W(s_j, −δ_1; y)| + n^{−1} Σ_{i=1}^n ‖c_{ni}‖ f_0(y) (2δ_1 + δ)
                         ≤ |W(s_j, δ_1; y)| + |W(s_j, −δ_1; y)| + 3C f_0(y) δ_1.
Introduce the events

    A_{nj} := { sup_{‖t−s_j‖<δ, y∈R} |W(t; y) − W(s_j; y)| > ε },
    B_{nj} := { sup_y |W(s_j, δ_1; y)| + sup_y |W(s_j, −δ_1; y)| > ε/2 }.

Then, by (6.1), applied with z = ±δ_1, there is an N_{j,ε} such that for all n > N_{j,ε},

    P(B_{nj}) ≤ ε/k, for each j = 1, ···, k.

Let B^c_{nj} denote the complement of B_{nj}. On B^c_{nj},

    sup_{‖t−s_j‖<δ, y∈R} |W(t; y) − W(s_j; y)| ≤ ε/2 + 3C‖f_0‖_∞ δ_1 = ε/2 + ε/6 = 2ε/3 < ε,

for all j = 1, ···, k. Hence, for all n > N_ε := max_{1≤j≤k} N_{j,ε},

    P(A_{nj}) = P(A_{nj} ∩ B_{nj}) + P(A_{nj} ∩ B^c_{nj}) ≤ ε/k,  j = 1, ···, k.
Similarly as above, by (6.1) applied with z = 0, there is an M_ε such that for all n > M_ε,

    P( sup_y |W(s_j; y)| > ε ) < ε/k,  j = 1, ···, k.

Hence, for all n > N_ε ∨ M_ε,

    P( sup_{y∈R, ‖t‖≤b} |W(t; y)| > 2ε ) ≤ Σ_{j=1}^k P(A_{nj}) + Σ_{j=1}^k P( sup_y |W(s_j; y)| > ε ) < 2ε,

proving (6.2) and hence (4.4). □
Proof of Lemma 4.1. Recall Eζ_0² = 1. For a given a ∈ R^p, introduce a normal r.v. U ~ N(0, a'Σa). As in (1.2), define b_j = 0, j < 0, and let

    b_{nj} := n^{−d−1/2} Σ_{i=1}^n a'c_{ni} b_{i−j},  U_n := n^{−d−1/2} Σ_{i=1}^n a'c_{ni} ε_i = Σ_{j∈Z} b_{nj} ζ_j.    (6.4)

Then Var(U_n) = a'Σ_n a → a'Σa = Var(U), and by the Cramér-Wold device the statement of the lemma translates to U_n →_D U.

Introduce the truncated i.i.d. r.v.'s ζ_{i,K} := ζ_i I(|ζ_i| ≤ K) − Eζ_i I(|ζ_i| ≤ K), i ∈ Z, and let U_{n,K} be defined as in (6.4) with the ζ_j's replaced by the ζ_{j,K}'s. Then E(U_n − U_{n,K})² = Var(ζ_0 − ζ_{0,K}) Var(U_n) → 0, as K → ∞, uniformly in n ≥ 1. Therefore, it suffices to show the asymptotic normality of U_{n,K} instead of U_n, for each fixed K. Since the ζ_{i,K} are bounded r.v.'s, it thus suffices to prove the claimed result under the assumption that the ζ_i's are bounded, having all moments finite. Then U_n →_D U follows if we show that

    Cum_k(U_n) = o(1), ∀ k = 3, 4, ···,    (6.5)

where Cum_k(Y) denotes the kth order cumulant of the r.v. Y. Let χ_k := Cum_k(ζ_0). According to the multilinearity and additivity properties of cumulants,

    |Cum_k(U_n)| = |χ_k Σ_{j∈Z} b_{nj}^k| ≤ |χ_k| sup_{j∈Z} |b_{nj}|^{k−2} Var(U_n),

since Σ_{j∈Z} b_{nj}² = Var(U_n); see (6.4). Therefore, (6.5) follows from

    sup_{j∈Z} |b_{nj}|^{k−2} = o(1), ∀ k = 3, 4, ···.    (6.6)

But

    |b_{nj}| ≤ C n^{−d−1/2} Σ_{i=1}^n (i − j)_+^{d−1} = O(n^{−1/2}) = o(1), uniformly in j ∈ Z.

This proves (6.6), and also Lemma 4.1. □
Acknowledgements
The authors are grateful to the two referees for valuable comments and careful reading of
the manuscript.
References
Baillie, R.T., 1996. Long memory processes and fractional integration in econometrics. J.
Econometrics 73, 5–59.
Beran, J., 1992. Statistical methods for data with long-range dependence (with discussion). Statistical Science 7, 404–427.
Beran, J., 1994. Statistics for Long Memory Processes. Monographs on Statistics and
Applied Probability, vol. 61. Chapman and Hall, New York.
Chan, N.H., Ling, S., 2008. Residual empirical processes for long and short memory time series. Ann. Statist. 36, 2453–2470.
D’Agostino, R.B., Stephens, M. (Eds.), 1986. Goodness-of-fit techniques. Statistics: Text-
books and Monographs, vol. 68. Marcel Dekker, New York.
Dalla, V., Giraitis, L., Hidalgo, J., 2006. Consistent estimation of the memory parameter
for nonlinear time series. J. Time Ser. Anal. 27, 211–251.
Davydov, Y., 1970. The invariance principle for stationary processes. Theory Probab. Appl.
15, 145–180.
Dehling, H., Taqqu, M.S., 1989. The empirical process of some long-range dependent
sequences with an application to U-statistics. Ann. Statist. 17, 1767–1783.
Dehling, H., Mikosch, T., Sørensen, M. (Eds.), 2002. Empirical Process Techniques for
Dependent Data. Birkhäuser, Boston.
Doukhan, P., Oppenheim, G., Taqqu, M.S. (Eds.), 2003. Theory and Applications of Long-
Range Dependence. Birkhäuser, Boston.
Durbin, J., 1973. Distribution theory for tests based on the sample distribution function. SIAM, Philadelphia.
Durbin, J., 1975. Kolmogorov-Smirnov tests when parameters are estimated with applica-
tions to tests of exponentiality and tests on spacings. Biometrika 62, 5–22.
Giraitis, L., Surgailis, D., 1990. A central limit theorem for quadratic forms in strongly
dependent linear variables and its application to asymptotic normality of Whittle’s
estimate. Probab. Theory Related Fields 86, 87–104.
Giraitis, L., Koul, H.L., Surgailis, D., 1996. Asymptotic normality of regression estimators
with long memory errors. Statist. Probab. Lett. 29, 317–335.
Ho, H.-C., Hsing, T., 1996. On the asymptotic expansion of the empirical process of long
memory moving averages. Ann. Statist. 24, 992–1024.
Khmaladze, E.V., 1979. The use of ω² tests for testing parametric hypotheses. Theory
Probab. Appl. 24, 283–301.
Khmaladze, E.V., 1981. Martingale approach in the theory of goodness-of-fit tests. Theory
Probab. Appl. 26, 240–257.
Koul, H.L., 2002. Weighted Empirical Processes in Dynamic Nonlinear Models. Second
Edition. Lecture Notes in Statistics, vol. 166. Springer, New York.
Koul, H.L., Mukherjee, K., 1993. Asymptotics of R-, MD- and LAD-estimators in linear
regression models with long range dependent errors. Probab. Theory Related Fields
95, 535–553.
Koul, H.L., Surgailis, D., 2002. Asymptotic expansion of the empirical process of long
memory moving averages. In: Dehling, H., Mikosch, T., Sørensen, M. (Eds.), Empirical
Process Techniques for Dependent Data. Birkhäuser, Boston, pp. 213–239.
Koul, H.L., Surgailis, D., 2003. Robust estimators in regression models with long memory
errors. In: Doukhan, P., Oppenheim, G., Taqqu, M.S. (Eds.), Theory and Applications
of Long-Range Dependence. Birkhäuser, Boston, pp. 339–353.
Major, P., 1981. Multiple Wiener-Itô Integrals. Lecture Notes in Mathematics, vol. 849. Springer, Berlin.
Rosenblatt, M., 1961. Independence and dependence. In: Proc. 4th Berkeley Symp. Math.
Statist. Probab. University of California Press, Berkeley, pp. 411–443.
Surgailis, D., 2003a. Non-CLTs: U-statistics, multinomial formula and approximations of multiple Itô-Wiener integrals. In: Doukhan, P., Oppenheim, G., Taqqu, M.S. (Eds.), Theory and Applications of Long-Range Dependence. Birkhäuser, Boston, pp. 129–142.
Surgailis, D., 2003b. CLTs for polynomials of linear sequences: diagram formula with
illustrations. In: Doukhan, P., Oppenheim, G., Taqqu, M.S. (Eds.), Theory and Applications of Long-Range Dependence. Birkhäuser, Boston, pp. 111–127.
Taqqu, M.S., 1975. Weak convergence to fractional Brownian motion and to the Rosenblatt process. Z. Wahrsch. Verw. Gebiete 31, 287–302.
Tsay, R.S., 2002. Analysis of Financial Time Series. Wiley Series in Probability and Statistics. Wiley, New York.
Hira L. Koul
Department of Statistics & Probability
Michigan State University
East Lansing, MI 48824-1027, U. S. A.
Donatas Surgailis
Vilnius Institute of Mathematics
& Informatics
08663 Vilnius, Lithuania