
Journal of Statistical Planning and Inference 86 (2000) 51–62

www.elsevier.com/locate/jspi

Asymptotic normality of parametric part in partially linear models with measurement error in the nonparametric part 1

Hua Liang
Department of Statistics, Texas A&M University, College Station, TX 77843-3143, USA

Received 19 June 1997; accepted 26 March 1999

Abstract

We consider the partially linear model relating a response Y to predictors (X, T) with mean function X^T β + g(T) when the T's are measured with additive error. We derive an estimator of β by modification of a local-likelihood method. The resulting estimator of β is shown to be asymptotically normal under suitable conditions. © 2000 Elsevier Science B.V. All rights reserved.

MSC: primary 62J99; 62H12; 62E25; 62F10; secondary 62H25; 62F10; 62F12; 60F05

Keywords: Errors in variables; Measurement error; Nonparametric likelihood; Partially linear model; Semiparametric models

1. Introduction and background

The interest in studying measurement error models is growing with the publication of a series of papers on various topics, especially the monograph of Carroll et al. (1995). Broadly, the literature can be divided into two parts. The first focuses on linear measurement error models; see Anderson (1984), Carroll et al. (1984), Stefanski (1985) and Fuller (1987). The second deals mainly with nonlinear measurement error models; see Fan and Truong (1993), Carroll et al. (1995) and the references therein. Liang et al. (1997) considered the semiparametric partially linear model relating a response Y to predictors (X, T) with mean function X^T β + g(T) when the X's are observed with additive error. The authors derived an estimator of β, which is shown to be consistent and asymptotically normal. Suppose we interchange the roles of X and T, so that the parametric part is measured exactly and the nonparametric part is measured with error: E(Y | X, T) = X^T β + g(T) and W = T + U, where U is measurement error. How about the

1 This research was supported by the Alexander von Humboldt Foundation and in part by Sonderforschungsbereich 373 "Quantifikation und Simulation Ökonomischer Prozesse".

0378-3758/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0378-3758(99)00093-2


result in this situation? Liang et al. (1997) conjectured that β is estimable at parametric rates. The goal of this paper is to present a detailed and positive answer under appropriate assumptions.

Fan and Truong (1993) treated the case where β = 0 and T is observed with measurement error. They proposed a new class of kernel estimators using deconvolution, and found that the optimal local and global rates of convergence of these estimators depend heavily on the tail behavior of the characteristic function of the error distribution: the smoother, the slower. This paper considers the semiparametric partially linear model based on a sample of size n,

Y_i = X_i^T β + g(T_i) + ε_i,  (1)

where X_i is a random vector, T_i is a random variable defined on [0, 1], the function g(·) is unknown, and the model errors ε_i are independent with conditional mean zero given the covariates. In model (1), the covariates T_i are measured with error, and we can only observe their surrogates W_i, i.e.,

W_i = T_i + U_i,  (2)

where the measurement errors U_i are independent and identically distributed, independent of (Y_i, X_i, T_i), with mean zero and covariance matrix Σ_uu. We will assume that U has a known distribution, as proposed by Fan and Truong (1993) to ensure that the model is identifiable. Model (1) with (2) can be seen as a mixture of the linear and nonlinear errors-in-variables models. When the T's are observable, previous work has often constructed a root-n consistent estimator of β by a local-likelihood algorithm (see Engle et al., 1986; Heckman, 1986; Chen, 1988; Speckman, 1988; Cuzick, 1992a,b; Severini and Staniswalis, 1994) as follows. Fix the parametric component β and obtain an estimate g(T, β) of the nonparametric component g(·) using some kind of smoothing method. For example, in the Severini and Staniswalis implementation, g(T, β) maximizes a weighted likelihood assuming that the model errors ε_i are homoscedastic and normally distributed, with symmetric kernel density function K(·) and bandwidth h. g(T, β) is then used to obtain an estimator of the parametric component of model (1), using methods such as maximum likelihood or least squares. For example, let the solution of

minimize Σ_{i=1}^n {Y_i − X_i^T β − g(T_i, β)}²  (3)

be the estimate for β, which can be determined explicitly by a projected least-squares algorithm. Let g_{y,h}(·) and g_{x,h}(·) be the kernel regression estimators of E(Y | T) and E(X | T), respectively. Then

β_n = [Σ_{i=1}^n {X_i − g_{x,h}(T_i)}{X_i − g_{x,h}(T_i)}^T]^{−1} Σ_{i=1}^n {X_i − g_{x,h}(T_i)}{Y_i − g_{y,h}(T_i)}.  (4)

It was shown that estimator (4) does not require undersmoothing, and the usual bandwidth of order h ∼ n^{−1/5} leads to β_n being asymptotically normal with mean zero


and covariance matrix B^{−1}CB^{−1}, where B is the covariance matrix of X − E(X | T) and C is the covariance matrix of ε{X − E(X | T)}. Due to the disturbance of the measurement error U, the least-squares form (4) has to be modified, since g_{x,h}(T_i) and g_{y,h}(T_i) are no longer statistics. In the next section, we define a new estimator of β. More precisely, we first define a new estimator of g(·), and then perform the regression of Y and X on W. The asymptotic normality of the resulting estimator of β is proved to depend on the smoothness of the error distribution. Section 3 states our main result. Section 4 provides some simulation studies. Several remarks are given in Section 5. All technical proofs are postponed to the appendix.
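With T observed, the projected least-squares estimator (4) is simple to compute. The following Python sketch illustrates it using a Nadaraya–Watson smoother in place of the generic kernel regression estimators g_{y,h} and g_{x,h}; the Gaussian kernel, the bandwidth h = n^{−1/5}, the sample size and the curve g are illustrative assumptions, not choices made in the paper.

```python
import numpy as np

def nw_weight_matrix(t, h):
    # Gaussian-kernel Nadaraya-Watson weights: row i smooths at t[i].
    K = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    return K / K.sum(axis=1, keepdims=True)

def beta_hat_observed_t(X, t, Y, h):
    # Estimator (4): regress the smoother residuals of Y on those of X.
    S = nw_weight_matrix(t, h)
    Xr = X - S @ X          # X_i - g_{x,h}(T_i)
    Yr = Y - S @ Y          # Y_i - g_{y,h}(T_i)
    return np.linalg.solve(Xr.T @ Xr, Xr.T @ Yr)

# Illustrative data: beta = 0.75, a smooth g, no measurement error in T.
rng = np.random.default_rng(0)
n = 500
t = rng.uniform(0.0, 1.0, n)
X = rng.normal(size=(n, 1))
Y = X[:, 0] * 0.75 + np.sin(2 * np.pi * t) + 0.1 * rng.normal(size=n)
print(beta_hat_observed_t(X, t, Y, h=n ** (-1 / 5)))
```

Because X is independent of T in this toy design, the smoother residuals of X are close to X itself and the estimate lands near 0.75 without undersmoothing, in line with the discussion above.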

2. Construction of estimators

As pointed out in the previous section, our first task is to estimate the nonparametric function g(·) when T is observed with error. This can be accomplished by borrowing the ideas of Fan and Truong (1993): using the deconvolution technique, one can construct consistent nonparametric estimates of g(·) under appropriate assumptions. First, we briefly describe the deconvolution method, which has been studied by Stefanski and Carroll (1990) and Fan and Truong (1993). Denote the densities of W and T by f_W(·) and f_T(·), respectively. As noted in the literature, f_T(·) can be estimated by

f_n(t) = (1/(nh_n)) Σ_{j=1}^n K_n((t − W_j)/h_n)

with

K_n(t) = (1/(2π)) ∫_{R¹} exp(−ist) φ_K(s)/φ_U(s/h_n) ds,  (5)

where φ_K(·) is the Fourier transform of the kernel function K(·), and φ_U(·) is the characteristic function of the error variable U. For detailed discussions, see Fan and Truong (1993). Denote

ω_ni(·) = K_n((· − W_i)/h_n) / Σ_j K_n((· − W_j)/h_n) ≝ (1/(nh_n)) K_n((· − W_i)/h_n) / f_n(·).

Noting the fact that g(t) = E(Y − X^T β | T = t), one defines

g_n(t) = Σ_{i=1}^n ω_ni(t)(Y_i − X_i^T β)

as the estimator of g(·) for given β. Let β_n be the solution of (3) after replacing g(T_i, β) by g_n(W_i). Then the generalized least-squares estimator β_n of β can be written explicitly as

β_n = (X̃^T X̃)^{−1} X̃^T Ỹ,  (6)

where Ỹ denotes (Ỹ_1, …, Ỹ_n)^T with Ỹ_i = Y_i − Σ_{j=1}^n ω_nj(W_i) Y_j, and X̃ denotes (X̃_1, …, X̃_n)^T with X̃_i = X_i − Σ_{j=1}^n ω_nj(W_i) X_j. The estimator β_n will be shown to possess asymptotic normality under appropriate conditions.


3. Main results

We make the following assumptions. First, some notation is introduced: γ_j(t) = E(x_ij | T_i = t) and V_ij = x_ij − γ_j(T_i) for i = 1, …, n and j = 1, …, p.

Assumption 1.1. sup_{0≤t≤1} E(‖X_1‖³ | T = t) < ∞ and E(V_1 V_1^T) = B, where B is a positive-definite matrix and V_i = (V_i1, …, V_ip)^T.

Assumption 1.2. g(·) and the γ_j(·) are Lipschitz continuous of order 1.

Assumption 1.3. (i) The marginal density f_T(·) of the unobserved covariate T is bounded away from 0 on [0, 1], and has a bounded kth derivative, where k is a positive integer. (ii) The characteristic function φ_U(·) of the error distribution does not vanish. (iii) The distribution of the error U is ordinary smooth or super smooth.

The definitions of super smooth and ordinary smooth distributions are given by Fan and Truong (1993). We also state them here for easy reference.

1. Super smooth of order β: the characteristic function φ_U(·) of the error distribution satisfies

d_0 |t|^{β_0} exp(−|t|^β/γ) ≤ |φ_U(t)| ≤ d_1 |t|^{β_1} exp(−|t|^β/γ) as t → ∞,  (7)

where d_0, d_1, β and γ are positive constants and β_0, β_1 are constants.

2. Ordinary smooth of order β: the characteristic function φ_U(·) of the error distribution satisfies

d_0 |t|^{−β} ≤ |φ_U(t)| ≤ d_1 |t|^{−β} as t → ∞,  (8)

for positive constants d_0, d_1 and β.

For example, the standard normal and Cauchy distributions are super smooth with β = 2 and 1, respectively; the gamma distribution of degree p and the double exponential distribution are ordinary smooth with β = p and 2, respectively. We should note that an error distribution cannot be both super smooth and ordinary smooth.

Assumption 1.4. The kernel K(·) is a kth-order kernel function, that is,

∫_{−∞}^{∞} K(u) du = 1,  ∫_{−∞}^{∞} u^l K(u) du = 0 for l = 1, …, k − 1, and ≠ 0 for l = k.

Assumptions 1.1 and 1.2 are required to establish asymptotic normality when the T_i are observed. Assumptions 1.3 and 1.4 are proposed by Fan and Truong (1993) for nonparametric estimation. We adopt them here to ensure that conclusions similar to those of Fan and Truong (1993) hold in the analogous cases appearing below. Our main result concerns the limit distribution of β_n.


Theorem. Suppose that Assumptions 1.1–1.4 hold and that E(|ε|³ + ‖U‖³) < ∞. If either of the following conditions holds, then β_n is an asymptotically normal estimator, i.e. n^{1/2}(β_n − β) →_L N(0, σ² B^{−1}), where B is given in Assumption 1.1.

(i) The error distribution is super smooth, X and T are mutually independent, φ_K(t) has bounded support on |t| ≤ M_0, and we take the bandwidth h_n = c (log n)^{−1/β} with c > M_0 (2/γ)^{1/β}.

(ii) The error distribution is ordinary smooth, we take h_n = d n^{−1/(2k+2β+1)} with d > 0 and 2k > 2β + 1, and suppose that

t^β φ_U(t) → c,  t^{β+1} φ′_U(t) = O(1) as t → ∞

for some constant c ≠ 0, and

∫_{−∞}^{∞} |t|^{β+1} {φ_K(t) + φ′_K(t)} dt < ∞,  ∫_{−∞}^{∞} |t^{β+1} φ_K(t)|² dt < ∞.

4. Simulation

We conduct a moderate-sample Monte Carlo simulation to show the behavior of the estimator β_n. A generalization of the model studied by Fan and Truong (1993) is considered:

Y = X^T β + g(T) + ε and W = T + U with β = 0.75,

where X ∼ N(0, 1), T ∼ N(0.5, 0.25²), ε ∼ N(0, 0.0015²) and g(t) = t_+³ (1 − t)_+³. Two kinds of error distributions are examined to study their effects on the mean-squared error (MSE) of the estimator β_n: one is normal and the other is double exponential.

1. (Double exponential error) U has a double exponential distribution:

f_U(u) = (√2 σ_0)^{−1} exp(−√2 |u|/σ_0) with σ_0² = (3/7) Var(T).

Let K(·) be chosen to be the Gaussian kernel

K(x) = (√(2π))^{−1} exp(−x²/2);

then

K_n(x) = (√(2π))^{−1} exp(−x²/2) {1 − (σ_0²/(2h_n²))(x² − 1)}.  (9)
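As a quick sanity check (not part of the paper), the closed form (9) can be recovered from the inversion formula (5): for the Laplace density above, φ_U(t) = 1/(1 + σ_0² t²/2), so φ_K(s)/φ_U(s/h_n) = exp(−s²/2){1 + σ_0² s²/(2h_n²)}, whose inverse Fourier transform is exactly (9). A numerical sketch, with arbitrary illustrative values of h and σ_0:

```python
import numpy as np

h, sig0 = 0.3, 0.16
s = np.linspace(-12.0, 12.0, 200001)
ds = s[1] - s[0]
phi_K = np.exp(-0.5 * s ** 2)                          # Fourier transform of Gaussian K
inv_phi_U = 1.0 + sig0 ** 2 * s ** 2 / (2.0 * h ** 2)  # 1 / phi_U(s / h) for Laplace error
for x in (0.0, 0.7, 1.5):
    # Inversion formula (5); the integrand is even, so the cosine part suffices.
    kn_numeric = (np.cos(s * x) * phi_K * inv_phi_U).sum() * ds / (2.0 * np.pi)
    # Closed form (9).
    kn_closed = np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi) * (
        1.0 - sig0 ** 2 / (2.0 * h ** 2) * (x ** 2 - 1.0))
    print(x, kn_numeric, kn_closed)
```

The two columns agree to numerical integration accuracy, confirming the algebra behind (9).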

2. (Normal error) U ∼ N(0, 0.125²). Suppose the function K(·) has Fourier transform φ_K(t) = (1 − t²)³_+. By (5),

K_n(t) = (1/π) ∫_0^1 cos(st)(1 − s²)³ exp(0.125² s²/(2h_n²)) ds.  (10)


Table 1
MSE (×10⁻³) of the estimator β_n

Kernel    n = 100     n = 500     n = 1000    n = 2000
(9)       9.8333763   2.0718028   1.0052109   0.5405627
(10)      7.950000    1.486650    0.721303    0.387888
Quartic   13.006956   2.512046    2.394777    1.205762

For the above model, we use three different kernels: (9), (10) and the quartic kernel (15/16)(1 − u²)² I(|u| ≤ 1) (ignoring measurement error). Our aim is to compare the results when measurement error is taken into account and when it is ignored. The results for the different sample sizes are based on N = 2000 replications; the mean-squared errors (MSE) are calculated from these 2000 simulations for the three kernels. Table 1 gives the detailed simulation results. The results reported show that the MSE behaves best in the double exponential error model and worst with the quartic kernel. In the simulation procedure, we fit the nonparametric part using

g_n(t) = Σ_{i=1}^n ω_ni(t)(Y_i − X_i^T β_n),  (11)

where β_n is the resulting estimator given in (6) corresponding to kernel (9). An analysis ignoring measurement error (with the quartic kernel) finds some curvature in T. See Fig. 1 for the comparison of g(T) with its estimator (11) for the different sample sizes, based on 2000 replications. Each curve represents the mean of the 2000 realizations of the true curve or the estimated curve.

The bandwidth used in our simulation is selected by cross-validation to predict the response. More precisely, we compute the average squared error over a geometric sequence of 41 bandwidths ranging in [0.1, 0.5]; the optimal bandwidth is selected to minimize the average squared error among the 41 candidates. The results reported here support our theoretical procedure, and illustrate that our estimators for both the parametric and nonparametric parts work well numerically.
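The bandwidth search can be sketched as follows. This is a simplified stand-in, not the paper's exact procedure: it scores each candidate by the leave-one-out prediction error of an ordinary Nadaraya–Watson smoother on simulated data, and the data-generating curve is an illustrative assumption.

```python
import numpy as np

def loo_ase(t, y, h):
    # Leave-one-out average squared prediction error of a Gaussian NW smoother.
    K = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    np.fill_diagonal(K, 0.0)          # drop each point's own observation
    pred = (K @ y) / K.sum(axis=1)
    return np.mean((y - pred) ** 2)

rng = np.random.default_rng(2)
t = rng.uniform(0.0, 1.0, 300)
y = np.sin(2.0 * np.pi * t) + 0.2 * rng.normal(size=300)

# Geometric sequence of 41 candidate bandwidths in [0.1, 0.5], as in the paper.
grid = np.geomspace(0.1, 0.5, 41)
scores = [loo_ase(t, y, h) for h in grid]
h_opt = grid[int(np.argmin(scores))]
print(h_opt)
```

The selected bandwidth is simply the grid point with the smallest cross-validated error; with a geometric grid, the candidates are spaced more densely at the small-bandwidth end, where the criterion typically varies fastest.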

5. Discussion

The theorem shows that, in the large-sample case, there is no cost due to measurement error in T when the measurement error is ordinary smooth, or when it is super smooth and X and T are independent. That is, the estimator of β given by (6) is, under our assumptions, equivalent to the estimator given by (4) computed as if the T_i were known. This phenomenon looks surprising, since related work on the partially linear model suggests that a rate of at least n^{−1/4} is generally needed for the nonparametric estimation, and Fan and Truong (1993) proved that the nonparametric function estimate can reach the rate O_P(n^{−k/(2k+2β+1)}) (which is o(n^{−1/4})) in the case of ordinary smooth error. In the case


Fig. 1. Estimates of the function g(T). The solid lines stand for the true values and the dashed lines for the values of the resulting estimator given by (11).

of ordinary smooth error, our later proof of the Theorem indicates that g(T) hardly affects our estimate β_n when X and T are independent. If X and T are dependent, we conjecture that β_n cannot reach the usual root-n rate, which remains to be shown. On the other hand, one can easily derive the convergence rate of the nonparametric function estimate for the ordinary smooth error by directly copying the related procedure of Fan and Truong (1993); it is min{O_P(n^{−1/3}), O_P(n^{−k/(2k+2β+1)})}. The cases where the parametric part X of the model has measurement error and the nonparametric part T is measured exactly, and where the parametric part is measured exactly and the nonparametric part is measured with error, have now been discussed. In the more general situation, where X and T are both observed with measurement error, how to obtain a parametric-rate estimator of β is an interesting issue, which is still open.

Acknowledgements

The author would like to thank Professor Raymond J. Carroll and Mr. Volker Sommerfeld for their suggestions and valuable comments, which greatly improved the


presentation of this paper. He especially thanks one referee for pointing out an error in the proof of the Theorem in an earlier version of this paper.

Appendix

Lemma A.1 gives a rather general result on strong uniform convergence, which is applied in the proofs in the present context. Its proof is given by Liang (1999).

Lemma A.1. Let V_1, …, V_n be independent random variables with zero means and sup_j E|V_j|^r ≤ C < ∞ (r ≥ 2). Assume that {a_ki, k, i = 1, …, n} is a sequence of positive numbers such that sup_{i,k≤n} |a_ki| ≤ n^{−p_1} for some 0 < p_1 < 1, and Σ_{j=1}^n a_ji = O(n^{p_2}) for p_2 ≥ max(0, 2/r − p_1). Then

max_{1≤i≤n} |Σ_{k=1}^n a_ki V_k| = O(n^{−s} log n), s = (p_1 − p_2)/2, a.s.

Lemma A.2 provides bounds for γ_j(T_i) − Σ_{k=1}^n ω_nk(W_i) γ_j(T_k) and g(T_i) − Σ_{k=1}^n ω_nk(W_i) g(T_k). The proof is mainly based upon the conclusions of Fan and Truong (1993).

Lemma A.2. Suppose that Assumptions 1.1 and 1.4 hold.

(1) If U is an ordinary smooth error and we take 2k > 2β + 1, then

max_{1≤i≤n} |G_j(T_i) − Σ_{k=1}^n ω_nk(W_i) G_j(T_k)| = o(n^{−1/4}) for j = 0, …, p,

where G_0(·) = g(·) and G_l(·) = γ_l(·) for l = 1, …, p.

(2) Assume that U is a super smooth error and that X and T are independent. Then the conclusions of (1) for j = 1, …, p still hold, but

max_{1≤i≤n} |g(T_i) − Σ_{k=1}^n ω_nk(W_i) g(T_k)| = o(1).

Proof. We only prove the first part; the second part follows similarly. Theoretically, we can apply the results of Fan and Truong (1993) to the modified form Σ_{k=1}^n ω_nk(w)(Y_k − X_k^T β), by absorbing X^T β into Y for any given β. For the ordinary smooth error, one can find from the main results of Fan and Truong (1993) that

max_{1≤i≤n} |g(T_i) − Σ_{k=1}^n ω_nk(W_i){g(T_k) + ε_k}| = O(n^{−k/(2k+2β+1)}) = o(n^{−1/4}).


Recalling the definition of ω_nk(·), one can verify that max_k |ω_nk(t)| = O(n^{−2/3}) by taking h_n = n^{−1/3} (see Liang, 1994). This fact yields that

sup_{1≤i≤n} |Σ_{k=1}^n ω_nk(W_i) ε_k| = O(n^{−1/3} log n)

by taking V_i = ε_i, a_ki = ω_nk(W_i), p_1 = 2/3 and p_2 = 0 in Lemma A.1. These arguments imply that

max_{1≤i≤n} |g(T_i) − Σ_{k=1}^n ω_nk(W_i) g(T_k)| = O(n^{−k/(2k+2β+1)}) = o(n^{−1/4}).

The proofs for the γ_l(·) (l = 1, …, p) are similar to the proof of Lemma 2 of Fan and Truong (1993). More precisely, noting the Lipschitz continuity of γ_l(·), one obtains that for c_n = n^{−1/3} log n,

max_{1≤i≤n} |γ_l(T_i) − Σ_{k=1}^n ω_nk(W_i) γ_l(T_k) I(|T_i − T_k| ≤ c_n)| = O(c_n).

On the other hand,

Σ_{k=1}^n ω_nk(W_i) I(|T_i − T_k| > c_n) = Σ_{k=1}^n K_n((W_i − W_k)/h_n) I(|T_i − T_k| > c_n) / Σ_{j=1}^n K_n((W_i − W_j)/h_n).  (A.1)

Adopting the proof of Lemma 2 of Fan and Truong (1993), the orders of the numerator and the denominator of (A.1) are shown to be equal to the orders of nc_n h_n and nh_n, respectively. Lemma A.2 for the γ_l(·) follows.

Lemma A.3. Under the conditions of the Theorem,

lim_{n→∞} n^{−1} X̃^T X̃ = B.

Proof. Denote γ̃_ns(T_i) = γ_s(T_i) − Σ_{k=1}^n ω_nk(W_i) X_ks and γ_ns(T_i) = γ_s(T_i) − Σ_{k=1}^n ω_nk(W_i) γ_s(T_k). It follows from X_js = γ_s(T_j) + V_js that the (s, m)th element of X̃^T X̃ (s, m = 1, …, p) is

Σ_{j=1}^n X̃_js X̃_jm = Σ_{j=1}^n V_js V_jm + Σ_{j=1}^n γ̃_ns(T_j) V_jm + Σ_{j=1}^n γ̃_nm(T_j) V_js + Σ_{j=1}^n γ̃_ns(T_j) γ̃_nm(T_j) ≝ Σ_{j=1}^n V_js V_jm + Σ_{q=1}^3 R^{(q)}_{nsm}.

The strong law of large numbers implies that lim_{n→∞} (1/n) Σ_{i=1}^n V_i V_i^T = B. Notice that γ̃_ns(T_i) is just γ_ns(T_i) − Σ_{k=1}^n ω_nk(W_i) V_ks. In Lemma A.1, taking a_ik = ω_nk(W_i), V_k = V_ks, p_1 = 2/3 and p_2 = 0, one obtains that Σ_{k=1}^n ω_nk(W_i) V_ks = o(1). This and Lemma A.2 imply γ̃_ns(T_i) = o(1). Therefore R^{(3)}_{nsm} = o(n), which together with the Cauchy–Schwarz inequality shows that R^{(1)}_{nsm} = o(n) and R^{(2)}_{nsm} = o(n). This completes the proof of the lemma.

Proof of the Theorem. We first outline the proof of the theorem. We decompose √n(β_n − β) into three terms. Then we will calculate the tail probability of each


term. By the definition of β_n,

√n(β_n − β) = √n (X̃^T X̃)^{−1} [Σ_{i=1}^n X̃_i g̃_ni − Σ_{i=1}^n X̃_i {Σ_{j=1}^n ω_nj(W_i) ε_j} + Σ_{i=1}^n X̃_i ε_i]

≝ A(n) [(1/√n) Σ_{i=1}^n X̃_i g̃_ni − (1/√n) Σ_{i=1}^n X̃_i {Σ_{j=1}^n ω_nj(W_i) ε_j} + (1/√n) Σ_{i=1}^n X̃_i ε_i],  (A.2)

where A(n) = (n^{−1} X̃^T X̃)^{−1} and g̃_ni = g(T_i) − Σ_{k=1}^n ω_nk(W_i) g(T_k). Lemma A.3 means that A(n) converges to B^{−1}. Thus our problem is to prove that the first two terms in the brackets on the right-hand side of (A.2) converge in probability to zero, and that (1/√n) Σ_{i=1}^n X̃_i ε_i converges to a normal distribution with mean zero and covariance matrix σ² B. The latter half of the assertion can be shown by using a central limit theorem and Lemma A.3. Let us now verify the former assertion under the conditions of the Theorem. First, we state two facts, which will play critical roles in the proofs. Taking r = 3, V_k = ε_k or V_kl, a_ji = ω_nj(W_i), p_1 = 2/3 and p_2 = 0 in Lemma A.1, one obtains the following equations:

max_{i≤n} |Σ_{k=1}^n ω_nk(W_i) ε_k| = O(n^{−1/3} log n) a.s.  (A.3)

and

max_{i≤n} |Σ_{k=1}^n ω_nk(W_i) V_kl| = O(n^{−1/3} log n) for l = 1, …, p a.s.  (A.4)

Notice that the jth element of the first term in the brackets on the right-hand side of (A.2) is decomposed as

Σ_{i=1}^n X̃_ij g̃_ni = Σ_{i=1}^n V_ij g̃_ni + Σ_{i=1}^n γ_nj(T_i) g̃_ni − Σ_{i=1}^n Σ_{q=1}^n ω_nq(W_i) V_qj g̃_ni.

Consider the case where U is an ordinary smooth error. In Lemma A.1 we take r = 2, V_k = V_kl, a_ji = g̃_nj, p_1 = k/(2k + 2β + 1) (> 1/4) and p_2 = 1 − p_1. Then

|Σ_{i=1}^n V_ij g̃_ni| = O(n^{−(2p_1−1)/2}).

By Lemma A.2,

|Σ_{i=1}^n γ_nj(T_i) g̃_ni| ≤ n max_{i≤n} |g̃_ni| max_{i≤n} |γ_nj(T_i)| = o(n^{1/2}).

Using Abel's inequality and Eq. (A.4),

|Σ_{i=1}^n Σ_{q=1}^n ω_nq(W_i) V_qj g̃_ni| ≤ n max_{i≤n} |g̃_ni| max_{i≤n} |Σ_{q=1}^n ω_nq(W_i) V_qj| = o(n^{1/2}).


The above arguments imply that (1/√n) Σ_{i=1}^n X̃_i g̃_ni is o(1). Observe that

Σ_{i=1}^n {Σ_{k=1}^n X̃_kj ω_ni(W_k)} ε_i = Σ_{i=1}^n {Σ_{k=1}^n V_kj ω_ni(W_k)} ε_i + Σ_{i=1}^n {Σ_{k=1}^n γ_nj(T_k) ω_ni(W_k)} ε_i − Σ_{i=1}^n [Σ_{k=1}^n {Σ_{q=1}^n V_qj ω_nq(W_k)} ω_ni(W_k)] ε_i.

We shall prove that each of the above three terms is o(n^{1/2}). In Lemma A.1 we take r = 2, V_k = ε_k, a_li = Σ_{k=1}^n V_kj ω_ni(W_k), 1/4 < p_1 < 1/3 and p_2 = 1 − p_1. Then

|Σ_{i=1}^n {Σ_{k=1}^n V_kj ω_ni(W_k)} ε_i| = O(n^{−(2p_1−1)/2} log n).

By Lemma A.2 and Eq. (A.3), we get

|Σ_{i=1}^n {Σ_{k=1}^n γ_nj(T_k) ω_ni(W_k)} ε_i| ≤ n max_{k≤n} |Σ_{i=1}^n ω_ni(W_k) ε_i| max_{k≤n} |γ_nj(T_k)| = O(n^{2/3} c_n log n).

Using Abel's inequality and (A.3) and (A.4), we obtain

|Σ_{i=1}^n [Σ_{k=1}^n {Σ_{q=1}^n V_qj ω_nq(W_k)} ω_ni(W_k)] ε_i| ≤ n max_{k≤n} |Σ_{i=1}^n ω_ni(W_k) ε_i| max_{k≤n} |Σ_{q=1}^n V_qj ω_nq(W_k)| = O(n^{1/3} log² n) = o(n^{1/2}).

We have demonstrated the conclusion of the theorem for the ordinary smooth error. When U is a super smooth error, we only need to prove that

Σ_{i=1}^n X̃_ij g̃_ni = o(√n).

Since X and T are independent, by Chebyshev's inequality, and using the same arguments as in the proofs of Lemmas A.3 and A.2(2), we conclude that

P{|Σ_{i=1}^n X̃_ij g̃_ni| > √n δ} ≤ (1/(nδ²)) E(Σ_{i=1}^n X̃_ij g̃_ni)² → 0 for any given δ > 0.

This completes the proof of the Theorem.


References

Anderson, T.W., 1984. Estimating linear statistical relationships. Ann. Statist. 12, 1–45.
Carroll, R.J., Spiegelman, C.H., Lan, K.K.G., Bailey, K.T., Abbott, R.D., 1984. On errors-in-variables for binary regression models. Biometrika 71, 19–25.
Carroll, R.J., Ruppert, D., Stefanski, L.A., 1995. Measurement Error in Nonlinear Models. Chapman & Hall, London.
Chen, H., 1988. Convergence rates for parametric components in a partly linear model. Ann. Statist. 16, 136–146.
Cuzick, J., 1992a. Semiparametric additive regression. J. Roy. Statist. Soc. Ser. B 54, 831–843.
Cuzick, J., 1992b. Efficient estimates in semiparametric additive regression models with unknown error distribution. Ann. Statist. 20, 1129–1136.
Engle, R.F., Granger, C.W.J., Rice, J., Weiss, A., 1986. Semiparametric estimates of the relation between weather and electricity sales. J. Amer. Statist. Assoc. 81, 310–320.
Fan, J., Truong, Y.K., 1993. Nonparametric regression with errors in variables. Ann. Statist. 21, 1900–1925.
Fuller, W.A., 1987. Measurement Error Models. Wiley, New York.
Heckman, N.E., 1986. Spline smoothing in partly linear models. J. Roy. Statist. Soc. Ser. B 48, 244–248.
Liang, H., 1994. The Berry–Esseen bounds of error variance estimation in a semiparametric regression model. Comm. Statist. Theory Methods 23, 3439–3452.
Liang, H., 1999. An application of Bernstein's inequality. Econometric Theory, to appear.
Liang, H., Härdle, W., Carroll, R.J., 1997. Large sample theory in a semiparametric partially linear errors-in-variables model. Discussion Paper No. 27, Institut für Statistik und Ökonometrie, Humboldt-Universität zu Berlin.
Severini, T.A., Staniswalis, J.G., 1994. Quasilikelihood estimation in semiparametric models. J. Amer. Statist. Assoc. 89, 501–511.
Speckman, P., 1988. Kernel smoothing in partial linear models. J. Roy. Statist. Soc. Ser. B 50, 413–436.
Stefanski, L.A., 1985. The effects of measurement error on parameter estimation. Biometrika 72, 583–592.