Semiparametric estimation of censored transformation models


Journal of Nonparametric Statistics. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/gnst20

Semiparametric estimation of censored transformation models. Tue Gørgens, Economics RSSS, Australian National University, Canberra, ACT 0200, Australia. Published online: 27 Oct 2010.

To cite this article: Tue Gørgens (2003) Semiparametric estimation of censored transformation models, Journal of Nonparametric Statistics, 15:3, 377-393, DOI: 10.1080/1048525031000120224

To link to this article: http://dx.doi.org/10.1080/1048525031000120224


Nonparametric Statistics, Vol. 15(3), June 2003, pp. 377–393

SEMIPARAMETRIC ESTIMATION OF CENSORED TRANSFORMATION MODELS

TUE GØRGENS*

Economics RSSS, Australian National University, Canberra ACT 0200, Australia

(Received March 2002; Revised January 2003; In final form January 2003)

Many widely used models, including proportional hazards models with unobserved heterogeneity, can be written in the form $\Lambda(Y) = \min[\beta'X + U, C]$, where $\Lambda$ is an unknown increasing function, the error term $U$ has unknown distribution function $\Psi$ and is independent of $X$, $C$ is a random censoring threshold, and $U$ and $C$ are conditionally independent given $X$. This paper develops new $n^{1/2}$-consistent and asymptotically normal semiparametric estimators of $\Lambda$ and $\Psi$ which are easier to use than previous estimators. Moreover, Monte Carlo results suggest that the mean integrated squared error of predictions based on the new estimators is lower than for previous estimators.

Keywords: Semiparametric estimation; Kernel regression; Transformation model; Unobserved heterogeneity; Duration analysis; Censoring

JEL Classification: C13, C14, C24, C41

1 INTRODUCTION

Many widely used models, including for example loglinear models, accelerated failure-time models, Box-Cox models, proportional hazards models, and mixed proportional hazards models, can be written in the form $\Lambda(Y) = \beta'X + U$ or, for right-censored data,

$$\Lambda(Y) = \min[\beta'X + U, C], \tag{1}$$

where $\Lambda$ is an unknown strictly increasing real function, $X$ is a random vector of explanatory variables, $\beta$ is an unknown vector of parameters, $U$ is an unobserved stochastic disturbance term, and $C$ is a random censoring threshold. Throughout the paper the uncensored transformation model will be treated as a special case of (1) where $C = \infty$ with probability 1. The following assumptions are standard in the literature and maintained throughout the paper: $U$ is independent of $X$, $U$ and $C$ are conditionally independent given $X$, and the distribution of the index $\beta'X$ is absolutely continuous. Let $\Psi$ denote the distribution function of $U$.

Heckman and Singer (1984) proposed a semiparametric estimator of $\Psi$ assuming a parametric specification of $\Lambda$. Murphy (1994; 1995) and Nielsen et al. (1992) developed

* E-mail: [email protected]

ISSN 1048-5252 print; ISSN 1029-0311 online © 2003 Taylor & Francis Ltd. DOI: 10.1080/1048525031000120224


semiparametric estimators of $\Lambda$ assuming a parametric specification of $\Psi$. Semiparametric estimators of $\Lambda$ and $\Psi$, which do not assume that either $\Lambda$ or $\Psi$ belong to known finite-dimensional parametric families of functions, were first developed by Horowitz (1996) for uncensored data and extended to censored data by Gørgens and Horowitz (1999). These estimators are $n^{1/2}$-consistent and asymptotically normal, and they perform well in Monte Carlo experiments. However, their practical use is hampered by their complexity of implementation, as they require the researcher to choose weight functions, kernel functions, and bandwidths.

This paper proposes new semiparametric estimators of $\Lambda$ and $\Psi$, which are easier to implement because they require only one kernel and one bandwidth whereas the previous estimators require two of each. In applications, bandwidths are often determined by cross-validation methods or by visual inspection of the estimates. In either case, it is necessary to evaluate the estimator at many different bandwidths. Reducing the number of bandwidth parameters from two to one simplifies application substantially. Thus the paper takes a significant step towards making semiparametric estimation of transformation models practical in empirical applications. (GAUSS programs are available from the author upon request.) Moreover, the new estimators perform better than previous estimators in a small set of Monte Carlo experiments, in the sense that they lead to predictions of $Y$ which have lower mean integrated squared errors.

The censoring scheme assumed here is so-called random censorship, where $C$ is random and conditionally independent of $U$ given $X$. Random censoring is often encountered in practice. An example is duration studies in which spells begin at random times and the termination date of the study is predetermined. Other special cases of random censorship include type I or "time" censoring, where $C = \bar c$ for some constant $\bar c$, and the case of no censoring, where $C = \infty$. Examples of censoring schemes not included are type II or "order statistic" censoring, where sampling continues until a predetermined number of uncensored observations has been collected, and schemes where censoring depends on the value of $Y$. Note that some censoring schemes may be reduced to type I censoring by introducing additional (artificial) censoring.

Both the previous and the new estimators require that $\beta$ be estimated before $\Lambda$ and $\Psi$. No new estimator of $\beta$ is proposed in this paper. Model (1) is a single-index model if $C$ is independent of $X$, which includes the uncensored transformation case, or if $C$ depends on $X$ only through the index $\beta'X$. Methods for estimating $\beta$ (up to scale) in single-index models have been devised by Han (1987), Härdle and Stoker (1989), Horowitz and Härdle (1996), Ichimura (1993), Powell et al. (1989), Sherman (1993), and Ai (1997). An estimator of $\beta$ when $C$ may depend on $X$ other than through $\beta'X$ was proposed by Gørgens (1999). Currently this is the only estimator available for the general case.

The paper is organized as follows. The estimators are developed in Section 2, and their

limiting distributions are discussed in Section 3. Monte Carlo results are presented in

Section 4, and Section 5 concludes. Proofs of theorems are relegated to the Appendix.

2 ESTIMATORS

The data available for estimation are assumed to consist of independent observations of $X$, $Y$, and an indicator of no censoring,

$$M = \begin{cases} 1 & \text{if } X'\beta + U \le C \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$


Scale and location normalizations are needed to identify the unknowns $\beta$, $\Lambda$, and $\Psi$. Identification of $\beta$ (up to a scale parameter) requires that $X$ have at least one component which is continuously distributed conditional on the others and whose coefficient is nonzero (see for instance Ichimura, 1993). Let this be the first component of $X$. For scale normalization, let the first component of $\beta$ be either $1$ or $-1$, and normalize the location of the model by setting $\Lambda(t_0) = \Lambda_0$, where $t_0$ and $\Lambda_0$ are arbitrary constants ($t_0$ must be in the interval on which $\Lambda$ is estimated, see below).
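To make the data structure concrete, the following minimal sketch draws observations $(X, Y, M)$ from model (1) with the no-censoring indicator (2). The distributions of $X$, $U$ and $C$ and the function name `simulate` are illustrative assumptions, not taken from the paper; only the transformation $\Lambda(t) = 2\ln(t)$ mirrors the Weibull design of the Monte Carlo section.

```python
import math
import random


def simulate(n, beta, inv_transform, seed=0):
    """Draw (x, y, m) triples from Lambda(Y) = min(beta'X + U, C).

    Assumed for illustration only: X has independent standard normal
    components, U is standard logistic, C is normal with mean 2, variance 1.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in beta]
        # Standard logistic draw by inverting its distribution function.
        p = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        u = math.log(p / (1.0 - p))
        c = rng.gauss(2.0, 1.0)
        latent = sum(b * xi for b, xi in zip(beta, x)) + u  # Lambda(Y*) = beta'X + U
        m = 1 if latent <= c else 0                          # indicator (2)
        y = inv_transform(min(latent, c))                    # Y = Lambda^{-1}(min(...))
        data.append((x, y, m))
    return data


# Lambda(t) = 2 ln(t), hence Lambda^{-1}(v) = exp(v / 2); first coefficient normalized to 1.
sample = simulate(500, [1.0, 0.5], lambda v: math.exp(v / 2.0))
uncensored_share = sum(m for _, _, m in sample) / len(sample)
```

Under these assumed distributions roughly one observation in five is censored, which is in the neighbourhood of the censoring rate used in the paper's censored experiments.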

Similarly to Horowitz's (1996) and Gørgens and Horowitz's (1999) estimators, the new estimators are based on the analog principle. The first step in constructing the estimators is to express $\Lambda$ and $\Psi$ in terms of the density $\zeta$ of $Z = \beta'X$ and the conditional distribution function $F$ of the uncensored $Y$ given $Z$. The second step consists of replacing $\zeta$ and $F$ in these expressions by appropriate kernel estimators. What distinguishes the new from the previous estimators is the choice of estimating expressions in the first step: the new estimators are based on quantiles whereas the previous estimators are based on derivatives. This means that the estimator of $F(y \mid z)$ in the second step need not be differentiable with respect to $y$ (although differentiability with respect to $z$ is still required in order to eliminate asymptotic bias). Consequently, no smoothing in the $y$-direction is needed and therefore the new estimators require only one bandwidth and kernel function, whereas previous estimators require two of each.

Estimators are proposed for two cases. In case I the data are assumed to be either uncensored or type I censored, but the range of $Y$ is unrestricted. This is similar to Horowitz (1996). In case II $Y$ must be nonnegative, but a general random censoring mechanism is allowed. This case is relevant for many studies of duration data and was considered by Gørgens and Horowitz (1999). The proposed new estimators coincide with each other when the two cases overlap, that is, when the data are uncensored or type I censored and $Y$ is nonnegative.

The estimating expressions for $\Lambda$ and $\Psi$ are derived in Section 2.1; estimation of $\zeta$ and $F$ is discussed in Section 2.2 for case I and Section 2.3 for case II.

2.1 Estimating Equations

Define $Z = \beta'X$ and define the uncensored, latent variable $Y^*$ by $\Lambda(Y^*) = Z + U$. Let $\zeta$ denote the density of $Z$, and define $F(t \mid z) = \Pr(Y^* \le t \mid Z = z)$. In this section $\Lambda$ and $\Psi$ are expressed as functionals of $\zeta$ and $F$.

Let $I_Y$ be the support of $Y$. If the data are type I censored at $C = \bar c$, then $\bar y = \Lambda^{-1}(\bar c)$ is an upper bound of $I_Y$. If the data are uncensored, $I_Y$ may be all of $\mathbb{R}$. Assuming that $\Lambda$ is strictly increasing and that $U$ and $X$ are independent, then for $t$ in $I_Y$

$$F(t \mid z) = \Psi(\Lambda(t) - z), \tag{3}$$

where $\Psi$ is the distribution function of $U$. (All expressions conditional on $Z = z$ are valid for $z$ such that $\zeta(z) > 0$ unless otherwise stated.) Let $\Psi^{-1}(q) = \inf\{u \in \mathbb{R}: \Psi(u) \ge q\}$ be the $q$th quantile of $U$, and define the generalized inverse of $F(t \mid \cdot)$ by

$$\begin{aligned} G(t, q) &= \sup\{z \in \mathbb{R}: \zeta(z) > 0,\ F(t \mid z) \ge q\} \\ &= \sup\{z \in \mathbb{R}: \zeta(z) > 0,\ z \le \Lambda(t) - \Psi^{-1}(q)\}. \end{aligned} \tag{4}$$
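For readers who want the intermediate steps, the following worked derivation sketches why (3) holds and why the two lines of (4) agree. It writes $\Lambda$ for the transformation and $\Psi$ for the distribution function of $U$, and uses only the stated assumptions: $\Lambda$ is strictly increasing, $U$ is independent of $X$ (hence of $Z = \beta'X$), and $\Psi$ is nondecreasing and right-continuous, so that $\Psi(u) \ge q$ exactly when $u \ge \Psi^{-1}(q)$.

```latex
\begin{align*}
F(t \mid z) &= \Pr(Y^* \le t \mid Z = z)
             = \Pr(\Lambda(Y^*) \le \Lambda(t) \mid Z = z) \\
            &= \Pr(z + U \le \Lambda(t) \mid Z = z)
             = \Psi(\Lambda(t) - z), \\[4pt]
F(t \mid z) \ge q
  &\iff \Psi(\Lambda(t) - z) \ge q
   \iff \Lambda(t) - z \ge \Psi^{-1}(q)
   \iff z \le \Lambda(t) - \Psi^{-1}(q).
\end{align*}
```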

The set on the right-hand side of (4) may be empty, in which case $G(t, q)$ is infinite. The second line shows that $G(t, q)$ is well-defined and equals $\Lambda(t) - \Psi^{-1}(q)$ if $\zeta(\Lambda(t) - \Psi^{-1}(q)) > 0$. To write this compactly, for any $\epsilon \ge 0$ define the set

$$S_D(\epsilon) = \{(t, q) \in I_Y \times (0, 1): \zeta(\Lambda(t) - \Psi^{-1}(q)) > \epsilon\}. \tag{5}$$


Then $G(t, q) = \Lambda(t) - \Psi^{-1}(q)$ for $(t, q) \in S_D(0)$, and it follows that

$$\Lambda(t) = \Lambda_0 + G(t, q) - G(t_0, q), \qquad (t, q), (t_0, q) \in S_D(0). \tag{6}$$

Given $t_0$ and $q$, Eqs. (4) and (6) determine $\Lambda$ as a function of $\zeta$ and $F$. Essentially, the new estimator of $\Lambda$ is obtained by replacing $\zeta$ and $F$ in (4) by sample analogs and substituting the resulting estimator $G_n$ for $G$ in (6).

Estimation of $\zeta$ and $F$ is discussed in Sections 2.2 and 2.3. Given estimators $\zeta_n$ and $F_n$, estimate $G$ by

$$G_n(t, q) = \sup\{z \in \mathbb{R}: \zeta_n(z) > 0,\ F_n(t \mid z) \ge q\}. \tag{7}$$

Under fairly weak conditions these estimators converge to their population counterparts, and hence $\Lambda_0 + G_n(t, q) - G_n(t_0, q)$ is a consistent estimator of $\Lambda(t)$. However, the optimal rate of convergence of $\zeta_n$, $F_n$, and hence $G_n$, depends on the smoothness of $\Lambda$, $\Psi$, and $\zeta$, but $n^{1/2}$-convergence cannot be achieved (see for instance Stone, 1980; Härdle et al., 1988). Therefore, for any given value of $q$, simply replacing $G(t, q)$ by $G_n(t, q)$ in (6) will not result in a $n^{1/2}$-consistent estimator of $\Lambda(t)$. Fortunately, $n^{1/2}$-consistency can be obtained by averaging over a range of $q$-values. Since the estimators of $F$ involve random denominators (essentially $1/\zeta_n$), they converge uniformly only on sets of the form $\{(t, z) \in \mathbb{R}^2: \zeta(z) > c_z\}$, where $c_z > 0$ is a small constant. Therefore, $G_n$ is uniformly consistent only on $S_D(c_z)$, as opposed to $S_D(0)$. This implies that averaging should take place only over values of $q$ such that $\zeta(\Lambda(t) - \Psi^{-1}(q)) > c_z$.

Now to define the new estimator of $\Lambda$, choose an interval $T_\Lambda \subseteq I_Y$ on which $\Lambda$ is to be estimated, and let $W_\Lambda$ be a weight function such that $W_\Lambda(\cdot, t)$ is supported on $(0, 1)$ and integrates to 1 for all $t \in T_\Lambda$. Furthermore, the support of $W_\Lambda$ must be chosen to ensure that $(t, q)$ and $(t_0, q)$ are in $S_D(c_z)$ whenever $t \in T_\Lambda$ and $W_\Lambda(q, t) \ne 0$. In other words, $W_\Lambda(q, t) \ne 0$ should imply $\zeta(G(t, q)) > c_z$ and $\zeta(G(t_0, q)) > c_z$. An example of a weight function is given in the Monte Carlo section. Define the new estimator $\Lambda_n$ by

$$\Lambda_n(t) = \Lambda_0 + \int_0^1 W_\Lambda(q, t)\,(G_n(t, q) - G_n(t_0, q))\,dq. \tag{8}$$

Under assumptions stated in Section 3, $\Lambda_n$ is uniformly $n^{1/2}$-consistent on $T_\Lambda$ and asymptotically normally distributed.
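The construction in (7) and (8) can be sketched numerically as follows. This is an illustrative approximation, not the author's GAUSS implementation: the supremum in (7) is taken over a finite grid of $z$ values, the integral in (8) is replaced by a Riemann sum over an equally spaced grid of $q$ values, and the function names, grids and the demo design (identity transformation, logistic error, uniform index density) are all assumptions made for the example.

```python
import math


def g_hat(t, q, F_n, zeta_n, z_grid):
    """Grid version of (7): largest z with zeta_n(z) > 0 and F_n(t | z) >= q."""
    feasible = [z for z in z_grid if zeta_n(z) > 0.0 and F_n(t, z) >= q]
    return max(feasible) if feasible else float("-inf")


def lambda_hat(t, t0, lam0, weight, q_grid, F_n, zeta_n, z_grid):
    """Riemann-sum version of (8) on an equally spaced q_grid."""
    dq = q_grid[1] - q_grid[0]
    total = 0.0
    for q in q_grid:
        w = weight(q, t)
        if w != 0.0:  # only quantiles inside the support of the weight function
            diff = g_hat(t, q, F_n, zeta_n, z_grid) - g_hat(t0, q, F_n, zeta_n, z_grid)
            total += w * diff * dq
    return lam0 + total


# Demo with population quantities standing in for the estimators (assumed design):
# Lambda(t) = t and U standard logistic, so F(t | z) = 1 / (1 + exp(-(t - z))).
def F_pop(t, z):
    return 1.0 / (1.0 + math.exp(-(t - z)))


def zeta_pop(z):
    return 1.0 / 6.0 if -3.0 <= z <= 3.0 else 0.0  # uniform density on [-3, 3]


def uniform_weight(q, t):
    return 2.5 if 0.3 <= q <= 0.7 else 0.0  # integrates to 1 over q


z_grid = [-3.0 + 0.01 * k for k in range(601)]
q_grid = [0.3 + 0.01 * (k + 0.5) for k in range(40)]  # cell midpoints on [0.3, 0.7]
demo = lambda_hat(0.5, 0.0, 0.0, uniform_weight, q_grid, F_pop, zeta_pop, z_grid)
```

With $t_0 = 0$ and $\Lambda_0 = 0$, the demo value should be close to $\Lambda(0.5) = 0.5$, up to the resolution of the $z$ grid. In practice `F_n` and `zeta_n` would be the kernel estimators of Sections 2.2 and 2.3.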

Turning now to estimation of $\Psi$, it follows from Eq. (3) that

$$\Psi(u) = F(y \mid \Lambda(y) - u), \qquad (y, u): \zeta(\Lambda(y) - u) > 0. \tag{9}$$

Once $F$ and $\Lambda$ are estimated, Eq. (9) can be used to estimate $\Psi$. More specifically, to estimate $\Psi$, choose an interval $T_\Psi \subseteq \mathbb{R}$ on which $\Psi$ is to be estimated, and let $W_\Psi$ be a weight function such that $W_\Psi(\cdot, u)$ integrates to 1 for each $u \in T_\Psi$. In addition, since $\Lambda$ is only estimated on $T_\Lambda$ and $F_n(t \mid z)$ is consistent only if $\zeta(z) > c_z$, the support of $W_\Psi(\cdot, u)$ must be chosen such that $W_\Psi(y, u) \ne 0$ implies $y \in T_\Lambda$ and $\zeta(\Lambda(y) - u) > c_z$. Define the new estimator of $\Psi$ by

$$\Psi_n(u) = \int_{T_\Lambda} W_\Psi(y, u)\,F_n(y \mid \Lambda_n(y) - u)\,dy. \tag{10}$$

It is shown in Section 3 that $\Psi_n$ is $n^{1/2}$-consistent and asymptotically normal.

This completes the first step in developing the new estimators: Eqs. (7), (8) and (10) give estimators $\Lambda_n$ and $\Psi_n$ in terms of $\zeta_n$ and $F_n$. The second step is to estimate $\zeta$ and $F$, which is discussed in the next two subsections.


2.2 Estimating F: Case I

In the case of uncensored data or type I censored data, $\zeta$ and $F$ can be estimated by the Rosenblatt-Parzen and the Nadaraya-Watson kernel estimators. Let $\{(X_i, Y_i, M_i)\}_{i=1}^n$ be a random sample. Choose an estimator $\beta_n$, say, from the list given in the introduction. Choose also a bandwidth $k_{zn}$ and a kernel function $K_z$; that is, let $k_{zn}$ be a positive real number and let $K_z$ be a function which integrates to 1. To achieve the bias reduction necessary to attain $n^{1/2}$-convergence, the kernel function must be of fourth order and twice differentiable. Other (standard) regularity conditions that $\beta_n$, $k_{zn}$, and $K_z$ must satisfy are given in Section 3. Define

$$K_{ni}(z) = \frac{1}{k_{zn}} K_z\!\left(\frac{\beta_n'X_i - z}{k_{zn}}\right). \tag{11}$$

Let $1(A)$ be the indicator function of the event $A$. The Nadaraya-Watson estimator of $F(t \mid z)$ is

$$F_n^{I}(t \mid z) = \frac{A_{0n}(t, z)}{\zeta_n(z)}, \tag{12}$$

where

$$A_{0n}(t, z) = \frac{1}{n} \sum_{i=1}^{n} K_{ni}(z)\,1(Y_i \le t) \tag{13}$$

and $\zeta_n$ is the Rosenblatt-Parzen density estimator,

$$\zeta_n(z) = \frac{1}{n} \sum_{i=1}^{n} K_{ni}(z). \tag{14}$$

The function $F_n^{I}(\cdot \mid z)$ is a right-continuous step-function which jumps (up or down) by an amount proportional to $K_{ni}(z)$ at each $Y_i$, whereas $F_n^{I}(t \mid \cdot)$ is a continuous and differentiable function if $K_z$ is.
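The case I estimators (11) to (14) translate almost line by line into code. The sketch below is illustrative only: it uses plain Python lists, takes the index values $\beta_n'X_i$ as already computed, and plugs in the fourth-order kernel that the paper reports in its Monte Carlo section (Eq. (31)); the function names are hypothetical.

```python
def muller_kernel(v):
    """Fourth-order kernel of Eq. (31), supported on [-1, 1]."""
    if abs(v) > 1.0:
        return 0.0
    return 105.0 / 64.0 * (1.0 - 5.0 * v**2 + 7.0 * v**4 - 3.0 * v**6)


def kernel_weights(index, z, bandwidth, kernel=muller_kernel):
    """K_ni(z) of Eq. (11) for every observation; index[i] holds beta_n' X_i."""
    return [kernel((zi - z) / bandwidth) / bandwidth for zi in index]


def zeta_n(z, index, bandwidth):
    """Rosenblatt-Parzen density estimate, Eq. (14)."""
    return sum(kernel_weights(index, z, bandwidth)) / len(index)


def F_case1(t, z, index, y, bandwidth):
    """Nadaraya-Watson estimate of F(t | z), Eqs. (12)-(13).

    The caller should only evaluate this where zeta_n(z) is bounded away
    from zero, since the denominator is random and, with a higher-order
    kernel, individual weights can be negative.
    """
    k = kernel_weights(index, z, bandwidth)
    a0 = sum(ki for ki, yi in zip(k, y) if yi <= t) / len(index)
    dens = sum(k) / len(index)
    return a0 / dens
```

Because a fourth-order kernel takes negative values, the resulting step function in $t$ need not be monotone in small samples, which is consistent with the remark that it may jump up or down.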

2.3 Estimating F: Case II

Now consider the case of general random censoring, but assume that $Y$ is nonnegative. The Rosenblatt-Parzen estimator of the density $\zeta$ of $Z$ is unaffected by the censoring of $Y$, but with random censoring the Nadaraya-Watson estimator of $F$ is inconsistent. Fortunately another estimator is available, namely, the conditional product-limit estimator (also known as the conditional Kaplan-Meier estimator).

The following results are standard in duration analysis (see for instance Kalbfleisch and Prentice, 1980; Gill and Johansen, 1990). Define $F_1(t \mid z) = \Pr(Y \le t, M = 1 \mid Z = z)$ and $\bar F_2(t \mid z) = \Pr(Y \ge t \mid Z = z)$. Assuming that $C$ and $U$ are conditionally independent given $X$, the conditional integrated hazard function $H(\cdot \mid z)$ of $Y^*$ given $Z = z$ is (using the Stieltjes integral)

$$H(t \mid z) = \int_0^t \frac{F_1(dy \mid z)}{\bar F_2(y \mid z)}, \tag{15}$$

and the conditional distribution function of $Y^*$ given $Z = z$ is

$$F(t \mid z) = 1 - \exp(-H^c(t \mid z)) \prod_{y \in D_H} (1 - H^d(y \mid z))^{1(y \le t)}, \tag{16}$$

where $H^c(\cdot \mid z)$ is the continuous component of $H(\cdot \mid z)$, $H^d(y \mid z) = H(y \mid z) - H(y- \mid z)$, and $D_H$ denotes the set of discontinuity points of $H$.



Let $\{(X_i, Y_i, M_i)\}_{i=1}^n$, $\beta_n$, $k_{zn}$, and $K_z$ be as in Section 2.2. Define the Nadaraya-Watson type estimators $F_{1n}(t \mid z) = A_{1n}(t, z)/\zeta_n(z)$ and $\bar F_{2n}(t \mid z) = \bar A_{2n}(t, z)/\zeta_n(z)$, where

$$A_{1n}(t, z) = \frac{1}{n} \sum_{i=1}^{n} K_{ni}(z)\,1(Y_i \le t)\,M_i \tag{17}$$

and

$$\bar A_{2n}(t, z) = \frac{1}{n} \sum_{i=1}^{n} K_{ni}(z)\,1(Y_i \ge t). \tag{18}$$

Given $\zeta_n$, $A_{1n}$, and $\bar A_{2n}$, estimate $H$ by replacing $F_1$ and $\bar F_2$ in (15) by $F_{1n}$ and $\bar F_{2n}$. Since $F_{1n}(\cdot \mid z)$ is a right-continuous step function and the jump points occur at the uncensored $Y_i$s, the expression simplifies as follows (assuming no ties in the last line)

$$H_n(t \mid z) = \int_0^t \frac{F_{1n}(dy \mid z)}{\bar F_{2n}(y \mid z)} = \sum_{i=1}^{n} \frac{1(Y_i \le t)\,n^{-1} K_{ni}(z)\,M_i}{\bar A_{2n}(Y_i, z)}. \tag{19}$$

Note that in the unconditional case with no $Z$-variable, $H_n$ is the well-known Nelson-Aalen estimator. Now replace $H$ in (16) by $H_n$. Since $H_n(\cdot \mid z)$ is also a right-continuous step function with jump points at the uncensored $Y_i$s, the continuous component $H_n^c(\cdot \mid z)$ is 0, so the expression simplifies to (again assuming no ties)

$$\begin{aligned} F_n^{II}(t \mid z) &= 1 - \prod_{i=1}^{n} \left(1 - [H_n(Y_i \mid z) - H_n(Y_i- \mid z)]\right)^{1(Y_i \le t)} \\ &= 1 - \prod_{i=1}^{n} \left(1 - \frac{1(Y_i \le t)\,n^{-1} K_{ni}(z)\,M_i}{\bar A_{2n}(Y_i, z)}\right). \end{aligned} \tag{20}$$

As in the case of $F_n^{I}$, the function $F_n^{II}(\cdot \mid z)$ is a right-continuous step-function with jump points at discontinuity points of $A_{1n}$, namely, at the uncensored $Y_i$s, and $F_n^{II}(t \mid \cdot)$ is a continuous and differentiable function if $K_z$ is. In the unconditional case, $F_n^{II}$ is simply the Kaplan-Meier estimator. In case of uncensored or type I censored data, $F_n^{II}$ is identical to $F_n^{I}$, and therefore the case I and case II estimators of $\Lambda$ and $\Psi$ coincide.
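The second line of (20) can be coded directly as a product over uncensored observations. The sketch below is a minimal illustration under the paper's no-ties assumption; the function name, the list-based interface and the kernel passed in by the caller are assumptions of the example, not part of the paper.

```python
def F_case2(t, z, index, y, m, bandwidth, kernel):
    """Conditional product-limit estimate of F(t | z), second line of Eq. (20).

    index[i] holds beta_n' X_i, y[i] the (possibly censored) outcome and
    m[i] the no-censoring indicator. Assumes there are no ties among the y[i].
    """
    n = len(y)
    k = [kernel((zi - z) / bandwidth) / bandwidth for zi in index]  # K_ni(z), Eq. (11)
    survival = 1.0
    for i in range(n):
        if m[i] != 1 or y[i] > t or k[i] == 0.0:
            continue  # only uncensored observations with Y_i <= t contribute a factor
        # A-bar_2n(Y_i, z), Eq. (18): kernel-weighted share still at risk at Y_i.
        at_risk = sum(k[j] for j in range(n) if y[j] >= y[i]) / n
        survival *= 1.0 - (k[i] / n) / at_risk
    return 1.0 - survival


def uniform_kernel(v):
    """Uniform kernel on [-1, 1]; with equal weights (20) reduces to Kaplan-Meier."""
    return 0.5 if abs(v) <= 1.0 else 0.0
```

As a quick check, with four observations `y = [1, 2, 3, 4]`, indicators `m = [1, 0, 1, 1]`, a common index value and the uniform kernel, every observation receives the same weight, so the estimate at `t = 3` equals the ordinary Kaplan-Meier value $1 - (3/4)(1/2) = 0.625$.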

3 ASYMPTOTIC DISTRIBUTIONS

Under conditions given in the assumptions below, the new estimators are $n^{1/2}$-consistent and asymptotically normally distributed. The precise results are stated in Theorems 1 and 2 below. The assumptions and results are similar for cases I and II, and it is convenient to discuss them together.

Let $P$ denote the distribution of $(X', Y, M)'$, and let $P_n$ denote the empirical measure formed from the $n$ independent observations on $P$, that is, $P_n$ puts probability $1/n$ on each of the observations. Let $X_1$ and $\tilde X$ denote the first and the $(r - 1)$ last components of $X$. Similarly, let $\beta_1$ and $\beta_{-1}$ be the first and the $(r - 1)$ last components of $\beta$ and let $\beta_{n1}$ be the


first component of $\beta_n$ and $\beta_{n,-1}$ the vector of remaining components. (Throughout $\beta_1$ denotes the first component of $\beta$, not the first vector in the sequence $\{\beta_n\}$.)

The first assumption ensures identification.

ASSUMPTION 1 The real function $\Lambda$ is defined on a possibly unbounded interval $I_\Lambda$ and is strictly increasing. The random vector $(X', Y, U, C)'$ satisfies $\Lambda(Y) = \min[\beta'X + U, C]$, and $M = 1$ if $\beta'X + U \le C$ and $M = 0$ otherwise. Furthermore, $U$ and $X$ are independent, and $C$ and $U$ are conditionally independent given $X$. The distribution function $\Psi$ of $U$ is right-continuous, $X_1$ is absolutely continuous with respect to Lebesgue measure conditional on $\tilde X$, and $|\beta_1| = 1$.

In case II, it is also assumed that $Y \ge 0$.

When $X_1$ is absolutely continuous conditional on $\tilde X$, so is $Z$. Let $\xi(\cdot \mid \tilde x)$ denote the conditional density of $Z$ given $\tilde X = \tilde x$.

The second assumption characterizes the sample.

ASSUMPTION 2 The sequence $\{(X_i', Y_i, M_i)'\}_{i=1}^n$ is a random sample from $P$.

The next assumption concerns the estimator of b. All the estimators of b mentioned in the

introduction satisfy this assumption. If b is known, the assumption is not needed.

ASSUMPTION 3 $\beta_{n1} = \beta_1$. There is a function $\Omega: \mathbb{R}^{r+2} \to \mathbb{R}^{r-1}$ such that $P\Omega = 0$, the components of $P\Omega\Omega'$ are finite, and $n^{1/2}(\beta_{n,-1} - \beta_{-1}) = n^{1/2} P_n \Omega + o_p(1)$ as $n \to \infty$. The random vector $\tilde X$ is bounded with probability one.

Let $\Theta(\cdot \mid x)$ denote the conditional distribution function of $C$ given $X = x$, and let $\bar c$ be the largest number such that $\Theta(c \mid x)$ is continuous for all $c < \bar c$. Under type I censorship, $\Theta(c \mid x) = 0$ for $c < \bar c$ and $\Theta(c \mid x) = 1$ for $c \ge \bar c$. For uncensored data and data where $C$ is absolutely continuous, $\bar c = \infty$.

The derivation of the limiting distributions depends on applications of the mean value theorem and Taylor expansions. Hence, the underlying functions must be sufficiently smooth. The requirements are listed as Assumption 4. For simplicity of exposition, Assumption 4 requires that most derivatives exist everywhere on the domain of the original functions. The results of Theorems 1 and 2 may still hold even if a function is not differentiable everywhere, provided $T_\Lambda$, $T_\Psi$, $W_\Lambda$, and $W_\Psi$ are chosen to avoid "edge-effects" in the kernel smoothing. That is, if the kernel estimates involve smoothing over $Z_i = \beta'X_i$ near $z$ then $F(y \mid \cdot)$, $F_1(y \mid \cdot)$ and $\bar F_2(y \mid \cdot)$ must be smooth on $[z - k_{zn}, z + k_{zn}]$ for all $n$ large.

Here and throughout let $D_i^j f$ denote the $j$th order partial derivative with respect to the $i$th argument of $f$ (counting each component of vector arguments separately), and let $Df$, $D_i f$, and $D^j f$ be short for $D_1^1 f$, $D_i^1 f$, and $D_1^j f$, respectively.

ASSUMPTION 4 There is an integer $k_z \ge 4$ such that:

1. The derivatives $D\Lambda$ and $D^2\Lambda$ exist and are bounded and continuous on all compact subsets of $I_\Lambda$.

2. The derivatives $D\Psi, \ldots, D^{k_z+2}\Psi$ exist and are bounded and continuous on $\mathbb{R}$.

3. The density $\zeta$ is bounded, and the derivative $D\zeta$ exists and is bounded and continuous on $\mathbb{R}$.

4. The derivatives $D_1\xi(\cdot \mid \tilde X), \ldots, D_1^{k_z+1}\xi(\cdot \mid \tilde X)$ exist and are bounded and continuous on $\mathbb{R}$ for almost all $\tilde X$.

5. The derivatives $D_2\Theta, \ldots, D_2^{k_z+1}\Theta$ and $D_2 D_1\Theta, \ldots, D_2^{k_z+1} D_1\Theta$ exist and are bounded and continuous on $\{(c, x) \in \mathbb{R}^{1+r}: c < \bar c\}$.



It can be shown that the joint subdensity of $(Y, Z, M)$ conditional on $\tilde X$ is

$$p(y, z, m \mid \tilde x) = \begin{cases} D_1\Theta(\Lambda(y) \mid x(z, \tilde x))\,D\Lambda(y)\,[1 - \Psi(\Lambda(y) - z)]\,\xi(z \mid \tilde x) & \text{if } m = 0 \\ [1 - \Theta(\Lambda(y) \mid x(z, \tilde x))]\,D\Psi(\Lambda(y) - z)\,D\Lambda(y)\,\xi(z \mid \tilde x) & \text{if } m = 1, \end{cases} \tag{21}$$

where $x(z, \tilde x)$ denotes the vector $((z - \tilde\beta'\tilde x)/\beta_1, \tilde x')'$. Assumption 4 ensures among other things that $p$ has $k_z + 1$ bounded and continuous derivatives with respect to $z$. This is used to bound remainder terms in the asymptotic expansions of $n^{1/2}(\Lambda_n - \Lambda)$ and $n^{1/2}(\Psi_n - \Psi)$.

As mentioned earlier, $C$ is allowed to depend on $X$. However, to ensure that $p(y, z, m \mid \tilde x)$ is a smooth function of $z$, the relationship between $C$ and $X_1$ must be smooth. The precise condition is stated in Assumption 4.5. If $C$ does not depend on $X_1$, then 4.5 is automatically satisfied, and of course 4.5 is redundant for uncensored data since $\Theta = 0$.

A researcher who wishes to use the new estimators must choose a bandwidth, a kernel function and two weight functions. To establish consistency and asymptotic normality, it is necessary to restrict the choices. Sufficient conditions are given in the remaining assumptions.

ASSUMPTION 5 The bandwidth $k_{zn}$ is a sequence of positive real numbers converging to 0 such that $n^{1/2} k_{zn}^{k_z} \to 0$ and $n^{-1} k_{zn}^{-6} \to 0$, where $k_z$ is determined in Assumption 4.

Assumption 5 is satisfied, for example, if $k_z = 4$ and $k_{zn} \propto n^{-1/7}$.
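The example rate can be verified by substitution. The short calculation below is a worked check, writing $k_{zn} = c\,n^{-a}$ with an arbitrary constant $c > 0$ (the constant and the exponent $a$ are notation introduced here for illustration).

```latex
\begin{align*}
k_z = 4,\ a = \tfrac{1}{7}: \quad
n^{1/2} k_{zn}^{4} &= c^{4}\, n^{1/2 - 4/7} = c^{4}\, n^{-1/14} \to 0, \\
n^{-1} k_{zn}^{-6} &= c^{-6}\, n^{-1 + 6/7} = c^{-6}\, n^{-1/7} \to 0.
\end{align*}
% More generally, both conditions hold for k_{zn} = c n^{-a} whenever
% 1/(2 k_z) < a < 1/6, which for k_z = 4 is the interval (1/8, 1/6).
```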

ASSUMPTION 6 The kernel $K_z$ is a bounded, integrable real function on $\mathbb{R}$ which vanishes outside $[-1, 1]$, $\int_{-1}^{1} K_z(z)\,dz = 1$ and $\int_{-1}^{1} K_z(z)\,z^j\,dz = 0$ for $j = 1, \ldots, k_z - 1$, where $k_z$ is determined in Assumption 4. Furthermore, $K_z$ has a derivative $DK_z$ which satisfies the Lipschitz condition $|DK_z(z) - DK_z(z^*)| \le c_K |z - z^*|$ for some constant $c_K$ and all $z, z^* \in \mathbb{R}$.

Assumption 7 concerns the estimation set $T_\Lambda$ and the weight function $W_\Lambda$. Note that the normalization constant $t_0$ must be an element of $T_\Lambda$, whereas $\Lambda_0$ can be chosen freely. Choosing $\Lambda(t_0) = \Lambda_0$ is equivalent to setting the median of $U$ equal to $\Lambda_0 - G(t_0, 1/2)$. Recall that $I_Y$ denotes the support of $Y$.

ASSUMPTION 7

1. $T_\Lambda$ is a bounded interval in $I_Y$ and $t_0 \in T_\Lambda$.

2. $W_\Lambda$ is a real bounded function on $\mathbb{R}^2$, and $W_\Lambda(\cdot, t)$ integrates to 1 for all $t \in T_\Lambda$.

3. If $W_\Lambda(q, t) \ne 0$ then $\zeta(G(t, q)) > c_z$ and $\zeta(G(t_0, q)) > c_z$ for some $c_z > 0$.

4. $W_\Lambda(\cdot, t)$ vanishes outside $[q_0, q_1]$ with $0 < q_0 < q_1 < 1$ for all $t \in T_\Lambda$.

5. $W_\Lambda$ satisfies the Lipschitz conditions $|W_\Lambda(q, t) - W_\Lambda(q^*, t)| \le c_W |q - q^*|$ and $|W_\Lambda(q, t) - W_\Lambda(q, t^*)| \le c_W |t - t^*|$ for all $q, q^* \in [0, 1]$ and all $t, t^* \in T_\Lambda$, where $c_W$ is some constant.

Assumption 8 concerns the estimation set $T_\Psi$ and the weight function $W_\Psi$.

ASSUMPTION 8

1. $T_\Psi$ is a bounded interval in $\mathbb{R}$.

2. $W_\Psi$ is a real bounded function on $\mathbb{R}^2$, and $W_\Psi(\cdot, u)$ integrates to 1 for all $u \in T_\Psi$.


3. There is a constant $c_z > 0$ such that if $u \in T_\Psi$ and $W_\Psi(y, u) \ne 0$, then $y \in T_\Lambda$ and $\zeta(\Lambda(y) - u) > c_z$.

4. $W_\Psi$ satisfies the Lipschitz conditions $|W_\Psi(y, u) - W_\Psi(y^*, u)| \le c_W |y - y^*|$ and $|W_\Psi(y, u) - W_\Psi(y, u^*)| \le c_W |u - u^*|$ for all $y, y^* \in T_\Lambda$ and all $u, u^* \in T_\Psi$, where $c_W$ is some constant.

Additional notation is required before the theorem can be stated. Define

$$\varphi^{I}(t, z, y, m) = \frac{1(y \le t) - F(t \mid z)}{\zeta(z)} \tag{22}$$

and

$$\varphi^{II}(t, z, y, m) = \frac{1 - F(t \mid z)}{\zeta(z)} \left( \frac{1(y \le t)\,m}{\bar F_2(y \mid z)} - \int_0^t \frac{1(y \ge v)\,D_1 F_1(v \mid z)}{\bar F_2(v \mid z)^2}\,dv \right). \tag{23}$$

If the data are uncensored or type I censored and nonnegative, then $\varphi^{I} = \varphi^{II}$.

Let $\chi_\Lambda$ be the set of all bounded, real functions on $T_\Lambda$ and equip $\chi_\Lambda$ with the metric generated by the uniform norm and the $\sigma$-algebra generated by closed balls. Let $\Lambda_n^{I}$ and $\Lambda_n^{II}$ denote the new estimators for case I and case II.

THEOREM 1 For $j = I, II$ define

$$\begin{aligned} \Phi_\Lambda^{j}(t, x, y, m) ={} & W_\Lambda(F(t \mid \beta'x), t)\,\varphi^{j}(t, \beta'x, y, m) - W_\Lambda(F(t_0 \mid \beta'x), t)\,\varphi^{j}(t_0, \beta'x, y, m) \\ & - \Omega(x, y, m)' \int_0^1 W_\Lambda(q, t)\,\bigl(E(\tilde X \mid Z = G(t, q)) - E(\tilde X \mid Z = G(t_0, q))\bigr)\,dq. \end{aligned} \tag{24}$$

Under Assumptions 1–7,

(a) $\Lambda_n^{j}$ can be uniformly approximated by the empirical process $P_n \Phi_\Lambda^{j}$, i.e.,

$$\sup_{t \in T_\Lambda} |\Lambda_n^{j}(t) - \Lambda(t) - P_n \Phi_\Lambda^{j}(t)| = o_p(n^{-1/2}), \tag{25}$$

and $P\Phi_\Lambda^{j}(t) = 0$ for all $t \in T_\Lambda$.

(b) The sequence $\{\sup_{T_\Lambda} |\Lambda_n^{j} - \Lambda|\}$ converges to 0 in probability.

(c) The sequence $\{n^{1/2}(\Lambda_n^{j} - \Lambda)\}$ of random elements of $\chi_\Lambda$ converges in distribution to a Gaussian stochastic process on $T_\Lambda$. The mean of the limiting process is zero and the covariance function is $P\Phi_\Lambda^{j}(t)\Phi_\Lambda^{j}(t^*)$ for $t, t^* \in T_\Lambda$.

Given the first conclusion of the theorem, the second and the third follow from standard theorems on convergence of empirical processes. Specifically, Theorem 1(b) follows from theorem II.24 of Pollard (1984) and Theorem 1(c) from lemma 2.16 of Pakes and Pollard (1989) and theorem VII.21 of Pollard (1984). The proof of Theorem 1(a) employs empirical process and U-process theory and can be found in the Appendix.

A similar theorem holds for $\Psi_n$. Let $\chi_\Psi$ be the set of all bounded, real functions on $T_\Psi$ and equip $\chi_\Psi$ with the metric generated by the uniform norm and the $\sigma$-algebra generated by closed balls. Let $\Psi_n^{I}$ and $\Psi_n^{II}$ be the new estimators for case I and case II.



THEOREM 2 For $j = I, II$ define

$$\begin{aligned} \Phi_\Psi^{j}(t, x, y, m) ={} & W_\Psi(\Lambda^{-1}(\beta'x + t), t)\,\varphi^{j}(\Lambda^{-1}(\beta'x + t), \beta'x, y, m)\,D\Lambda^{-1}(\beta'x + t) \\ & - \Omega(x, y, m)' \int_{I_\Lambda} W_\Psi(v, t)\,D_2 F(v \mid \Lambda(v) - t)\,E(\tilde X \mid Z = \Lambda(v) - t)\,dv \\ & + \int_{I_\Lambda} W_\Psi(v, t)\,D_2 F(v \mid \Lambda(v) - t)\,\Phi_\Lambda^{j}(v, x, y, m)\,dv. \end{aligned} \tag{26}$$

Under Assumptions 1–8,

(a) $\Psi_n^{j}$ can be uniformly approximated by the empirical process $P_n \Phi_\Psi^{j}$, i.e.,

$$\sup_{t \in T_\Psi} |\Psi_n^{j}(t) - \Psi(t) - P_n \Phi_\Psi^{j}(t)| = o_p(n^{-1/2}), \tag{27}$$

and $P\Phi_\Psi^{j}(t) = 0$ for all $t \in T_\Psi$.

(b) The sequence $\{\sup_{T_\Psi} |\Psi_n^{j} - \Psi|\}$ converges to 0 in probability.

(c) The sequence $\{n^{1/2}(\Psi_n^{j} - \Psi)\}$ of random elements of $\chi_\Psi$ converges in distribution to a Gaussian stochastic process on $T_\Psi$. The mean of the limiting process is zero and the covariance function is $P\Phi_\Psi^{j}(t)\Phi_\Psi^{j}(t^*)$ for $t, t^* \in T_\Psi$.

The proof of Theorem 2(a) can be found in the Appendix. As for Theorem 1, Theorems 2(b) and 2(c) follow from Theorem 2(a) and the work of Pollard (1984) and Pakes and Pollard (1989).

Horowitz (1996) and Gørgens and Horowitz (1999) proved similar theorems under the same kind of assumptions, but since their estimators are based on derivatives of $F$, stronger smoothness assumptions are required. For example, Horowitz assumed existence and boundedness of the third order derivative of $\Lambda$, the seventh order derivatives of $\zeta$ and $\xi$, and the ninth order derivative of $\Psi$. In addition, his weight function $W_\Lambda^*$ must be seven times differentiable.

4 MONTE CARLO RESULTS

In this section the new estimators are compared to previous semiparametric estimators in a set of Monte Carlo experiments using the same designs as Horowitz (1996) for uncensored data and Gørgens and Horowitz (1999) for data with 20% random censoring. The estimators are evaluated by their ability to predict $Y^*$ given $X = x$. As pointed out by Horowitz, the usual predictor, the conditional expectation of $Y^*$ given $X = x$, is not available because $\Lambda$ and $\Psi$ are not estimated on their entire domains. Instead Horowitz suggested using the median, or some other quantile, of the conditional distribution of $Y^*$ given $X = x$. The conditional median of $Y^*$ given $X = x$ is

$$Q(x) = \Lambda^{-1}\!\left(\beta'x + \Psi^{-1}\!\left(\tfrac{1}{2}\right)\right). \tag{28}$$

Horowitz's estimator consists of replacing the unknown $\beta$, $\Lambda$, and $\Psi$ by their estimators, and he proved (under regularity conditions) that the resulting estimator is uniformly consistent and asymptotically normal for $x \in T_Q$, where $T_Q$ is a suitable bounded subset of $\mathbb{R}^r$. His theorem continues to hold if his estimators of $\Lambda$ and $\Psi$ are replaced with the estimators developed in this paper.
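The plug-in median predictor amounts to two numerical inversions: one of the estimated error distribution at $1/2$ and one of the estimated transformation. The sketch below illustrates this with grid searches; the function names and grids are hypothetical, and in the demo the true functions (a log transformation and a logistic error distribution, chosen purely for illustration) stand in for the estimates.

```python
import math


def invert_increasing(func, target, grid):
    """Smallest point of an increasing grid where a nondecreasing func reaches target."""
    for point in grid:
        if func(point) >= target:
            return point
    return None  # target is not reached on the grid


def median_predictor(x, beta, lam_hat, psi_hat, t_grid, u_grid):
    """Plug-in version of the conditional median Lambda^{-1}(beta'x + Psi^{-1}(1/2))."""
    u_med = invert_increasing(psi_hat, 0.5, u_grid)  # median of U
    if u_med is None:
        return None
    index = sum(b * xi for b, xi in zip(beta, x))
    return invert_increasing(lam_hat, index + u_med, t_grid)  # inverse transformation


def logistic_cdf(u):
    return 1.0 / (1.0 + math.exp(-u))


# Demo: Lambda(t) = ln(t) and logistic errors, so the median is exp(beta'x).
t_grid = [0.01 * k for k in range(1, 501)]
u_grid = [-3.0 + 0.01 * k for k in range(601)]
q_demo = median_predictor([0.5], [1.0], math.log, logistic_cdf, t_grid, u_grid)
```

Here `q_demo` should be close to $e^{0.5}$, up to the grid resolution. Because the estimated transformation and error distribution are only available on bounded intervals, the predictor is defined only for those $x$ at which both inversions succeed, which mirrors the role of the set of admissible $x$ values in the text.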

Details of the designs are reported in the first part of Table I. The results for the previous

estimators in the second part of Table I are quoted directly from the respective papers. Please

refer to the original sources for details concerning the specification of two weight functions,


two kernels, and two bandwidths. In all experiments the choice of weight functions and kernels was ad hoc, whereas the bandwidths were chosen to minimize the mean integrated squared error (MISE) of $Q_n(x)$ for $x \in T_Q$.

For the new estimators, $W_\Lambda$, $W_\Psi$, $K_\zeta$, and $\kappa_{\zeta n}$ are as follows. The support of $W_\Lambda(\cdot, t)$ is the interval $[a_t, b_t]$, where $a_t = \max(\Psi(\Lambda(t) - 3), \Psi(\Lambda(t_0) - 3))$ and $b_t = \min(\Psi(\Lambda(t) + 3), \Psi(\Lambda(t_0) + 3))$. This ensures that $W_\Lambda(F(t \mid z), t)$ and $W_\Lambda(F(t_0 \mid z), t)$ are both zero for $z$ outside $[-3, 3]$. The constant $t_0$ is 0 in the Linear and the Sinh experiments, 1 in the Log and the Weibull experiments, and 5 in the U-Shaped Hazard experiment. Given $a_t$ and $b_t$, $W_\Lambda(\cdot, t)$ is defined by

\[ W_\Lambda(q, t) = K\left( \frac{2q - (a_t + b_t)}{b_t - a_t} \right) \frac{2}{b_t - a_t}, \tag{29} \]

where $K$ is a uniform kernel on $[-1, 1]$ in the Linear, the Log, and the Sinh experiments, and $K$ is the second-order kernel

\[ K(v) = \frac{15}{16} (1 - v^2)^2 \, 1(-1 \le v \le 1) \tag{30} \]

in the Weibull and the U-Shaped Hazard experiments. Similarly, the support of $W_\Psi(\cdot, t)$ is $[a_t, b_t]$, where $a_t = \max(\Lambda^{-1}(-3 + t + \Lambda(t_0) - \Lambda_0), \inf T_\Lambda)$ and $b_t = \min(\Lambda^{-1}(3 + t + \Lambda(t_0) - \Lambda_0), \sup T_\Lambda)$, and on its support $W_\Psi(\cdot, t)$ is defined in the same manner as $W_\Lambda(\cdot, t)$. The kernel function is a fourth order kernel taken from Muller (1984),

\[ K_\zeta(z) = \frac{105}{64} (1 - 5z^2 + 7z^4 - 3z^6) \, 1(|z| \le 1). \tag{31} \]

Finally, the bandwidth is 1.5, since this value approximately minimizes the MISE of $Q_n(x)$ for $x \in T_Q$.
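The kernels in Eqs. (30) and (31) and the rescaling in Eq. (29) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the integration uses a simple midpoint rule, and the support endpoints 0.2 and 0.7 are arbitrary values chosen for the check rather than values from the experiments.

```python
def biweight(v):
    """Second-order kernel of Eq. (30)."""
    return 15.0 / 16.0 * (1.0 - v * v) ** 2 if -1.0 <= v <= 1.0 else 0.0


def muller4(z):
    """Fourth-order kernel of Eq. (31)."""
    if abs(z) > 1.0:
        return 0.0
    return 105.0 / 64.0 * (1.0 - 5.0 * z**2 + 7.0 * z**4 - 3.0 * z**6)


def weight(q, a, b, kernel=biweight):
    """Weight function of Eq. (29): the kernel rescaled from [-1, 1] to [a, b]."""
    return kernel((2.0 * q - (a + b)) / (b - a)) * 2.0 / (b - a)


def integrate(f, lo, hi, n=20000):
    """Midpoint-rule integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


mass_biweight = integrate(biweight, -1.0, 1.0)
mass_muller = integrate(muller4, -1.0, 1.0)
second_moment_muller = integrate(lambda z: z * z * muller4(z), -1.0, 1.0)
mass_weight = integrate(lambda q: weight(q, 0.2, 0.7), 0.2, 0.7)
```

Both kernels integrate to one, the second moment of $K_\zeta$ vanishes (which is what makes it a fourth order kernel), and the factor $2/(b_t - a_t)$ in Eq. (29) ensures that the weight function integrates to one over its support.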

The second part of Table I shows that the new estimators perform better than the previous ones, since the MISEs of the conditional median estimate are lower for the new estimators in all experiments. Of course, this does not prove that the new estimators will generate predictions with lower MISE in other settings, nor does it imply that they perform better than the previous estimators if other performance measures are used. However, it does make the new estimators an interesting and promising alternative to previous estimators.

TABLE I Comparison of Previous and New Estimators.

| | Linear | Log | Sinh | Weibull | U-shaped Hazard |
|---|---|---|---|---|---|
| Experiment taken from | Horowitz (1996) | Horowitz (1996) | Horowitz (1996) | Gørgens and Horowitz (1999) | Gørgens and Horowitz (1999) |
| **Design** | | | | | |
| $\Lambda(t)$ | $t + \mu_U$ | $\ln(t) + \mu_U$ | $(1/13)\sinh(2t) + \mu_U$ | $2\ln(t)$ | $\ln(0.6(t/5)^{13} + 0.4(t/5)^{6})$ |
| $\Psi(t)$ | $\Phi(t - \mu_U)$ | $\Phi(t - \mu_U)$ | $\Phi(t - \mu_U)$ | $1/(1 + e^{-t})$ | $1/(1 + e^{-t})$ |
| $\mu_U$ | 2 | 2 | 2.0992 | 0 | 0 |
| Censoring | 0% | 0% | 0% | 20% | 20% |
| $T_\Lambda$ | $[-2, 2]$ | $[e^{-2}, e^{2}]$ | $[-2, 2]$ | $[0.2, 2.4]$ | $[0.2, 8.0]$ |
| $T_\Psi$ | $[0, 4]$ | $[0, 4]$ | $[0, 4]$ | $[-3, 1]$ | $[-2, 1.5]$ |
| $T_Q$ | $[-2, 2]$ | $[-2, 2]$ | $[-2, 2]$ | $[-3, 1.5]$ | $[-3, 2.5]$ |
| Sample size | 100 | 100 | 100 | 400 | 400 |
| Samples | 100 | 100 | 100 | 500 | 500 |
| **MISE of $Q_n$ on $T_Q$** | | | | | |
| Previous | 0.065 | 0.517 | 0.131 | 0.012 | 0.233 |
| New | 0.054 | 0.369 | 0.077 | 0.008 | 0.169 |

Note: $\Phi$ denotes the standard normal distribution function. In all experiments, $X \sim N(0, 1)$, $U \sim N(\mu_U, 1)$, $C \sim N(\mu_C, 1)$, where $\mu_C$ is chosen to achieve the desired probability of censoring, and $\beta$ is assumed known and equal to 1. MISE: mean integrated squared error of estimates of the conditional median of $Y^*$ given $X = x$ over $x \in T_Q$ divided by the length of $T_Q$.
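To put the comparison in relative terms, the short sketch below computes the percentage reduction in MISE implied by the figures in the second part of Table I. The numbers are copied from the table; the variable names are illustrative.

```python
# MISE of Q_n on T_Q, as reported in the second part of Table I.
previous = {
    "Linear": 0.065,
    "Log": 0.517,
    "Sinh": 0.131,
    "Weibull": 0.012,
    "U-shaped Hazard": 0.233,
}
new = {
    "Linear": 0.054,
    "Log": 0.369,
    "Sinh": 0.077,
    "Weibull": 0.008,
    "U-shaped Hazard": 0.169,
}

# Percentage reduction in MISE when moving from the previous to the new estimators.
reduction = {
    name: 100.0 * (previous[name] - new[name]) / previous[name]
    for name in previous
}

for name, pct in reduction.items():
    print(f"{name:16s} {pct:5.1f}%")
```

The reductions range from roughly 17% (Linear) to roughly 41% (Sinh). These are descriptive ratios of the reported point values only; the table gives no measure of Monte Carlo sampling error.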

5 CONCLUSION

New semiparametric estimators have been proposed for estimating $\Lambda$ and $\Psi$ in a regression model with an unknown transformation of the dependent variable. Estimators have been proposed for both uncensored and right-censored data. The new estimators require only one kernel function and one bandwidth and are therefore simpler to implement than the semiparametric estimators proposed by Horowitz (1996) and by Gørgens and Horowitz (1999), which require two of each. In addition, the new estimators perform better than the previous semiparametric estimators in a small set of Monte Carlo experiments, where the mean integrated squared prediction errors were lower for the new estimators in all of five different designs. Further research is needed to determine in which situations the new estimators are preferable to previous estimators and vice versa.

Acknowledgements

I thank Catherine de Fontenay and David Prentice for helpful comments on an earlier version

of the paper.

References

Ai, C. (1997). A semiparametric maximum likelihood estimator. Econometrica, 65(4), 933–963.

Breslow, N. and Crowley, J. (1974). A large sample study of the life table and product limit estimates under random censorship. Annals of Statistics, 2(3), 437–453.

Gill, R. D. and Johansen, S. (1990). A survey of product-integration with a view towards application in survival analysis. Annals of Statistics, 18(4), 1501–1555.

Gørgens, T. (1999). Semiparametric estimation of single-index transition intensities. Discussion paper 99/25, Institute of Economics, University of Copenhagen.

Gørgens, T. and Horowitz, J. (1999). Semiparametric estimation of a censored regression model with an unknown transformation of the dependent variable. Journal of Econometrics, 90(2), 155–191.

Han, A. K. (1987). Non-parametric analysis of a generalized regression model. Journal of Econometrics, 35, 303–316.

Hardle, W., Janssen, P. and Serfling, R. (1988). Strong uniform consistency rates for estimators of conditional functionals. Annals of Statistics, 16(4), 1428–1449.

Hardle, W. and Stoker, T. M. (1989). Investigating smooth multiple regression by the method of average derivatives. Journal of the American Statistical Association, 84, 986–995.

Heckman, J. J. and Singer, B. (1984). A method for minimizing the impact of distributional assumptions in econometric models for duration data. Econometrica, 52, 271–320.

Horowitz, J. L. (1996). Semiparametric estimation of a regression model with an unknown transformation of the dependent variable. Econometrica, 64(1), 103–137.

Horowitz, J. L. and Hardle, W. (1996). Direct semiparametric estimation of single-index models with discrete covariates. Journal of the American Statistical Association, 91(436), 1632–1640.

Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single index models. Journal of Econometrics, 58, 71–120.

Kalbfleisch, J. D. and Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data. Wiley, New York.

Muller, H.-G. (1984). Smooth optimum kernel estimators of densities, regression curves and modes. Annals of Statistics, 12, 766–774.

Murphy, S. A. (1994). Consistency in a proportional hazards model incorporating a random effect. Annals of Statistics, 22, 712–731.

Murphy, S. A. (1995). Asymptotic theory for the frailty model. Annals of Statistics, 23, 182–198.

Nielsen, G. G., Gill, R. D., Andersen, P. K. and Sørensen, T. I. A. (1992). A counting process approach to maximum likelihood estimation in frailty models. Scandinavian Journal of Statistics, 19, 25–43.

Pakes, A. and Pollard, D. (1989). Simulation and the asymptotics of optimization estimators. Econometrica, 57, 1027–1057.

Pollard, D. (1984). Convergence of Stochastic Processes. Springer-Verlag, New York.

Powell, J. L., Stock, J. H. and Stoker, T. M. (1989). Semiparametric estimation of index coefficients. Econometrica, 57(6), 1403–1430.

Sherman, R. P. (1993). The limiting distribution of the maximum rank correlation estimator. Econometrica, 61(1), 123–137.

Stone, C. J. (1980). Optimal rates of convergence for nonparametric estimators. Annals of Statistics, 8(6), 1348–1360.

APPENDIX: PROOF OF THEOREMS

Theorems 1(a) and 2(a) can be proved using methods similar to those of Horowitz (1996) and Gørgens and Horowitz (1999). Essentially, the idea is to linearize (in $K_\zeta$) the estimators using the mean value theorem or a Taylor expansion, and then apply empirical process and U-process theory to show that the remainder terms vanish sufficiently quickly. This appendix contains outlines of the proofs of Theorems 1(a) and 2(a), concentrating on linearizing the estimators. Convergence of the remainder terms follows from the lemmas of Gørgens and Horowitz (1999) and Gørgens (1999), and references are given where appropriate.

Convergence of $F_n^{I}$, $H_n$, and $F_n^{II}$

In this section let $\{z_n\}$ represent a sequence converging to $z$ at the rate $n^{1/2}$. In the proof of Theorem 1 $z_n$ simply equals $z$, whereas in the proof of Theorem 2 $z$ and $z_n$ will be replaced by $\Lambda(v) - t$ and $\Lambda_n(v) - t$.

Define $S_1 = \{z \in \mathbb{R} : \zeta(z) > c_\zeta\}$ and $S_2 = T_\Lambda \times S_1$; then $S_1$ and $S_2$ are bounded and edge-effects in the kernel estimates disappear asymptotically on $S_1$ and $S_2$, because by Assumption 4 $\zeta(z + \varepsilon)$, $F(y \mid z + \varepsilon)$, $F_1(y \mid z + \varepsilon)$, and $\bar F_2(y \mid z + \varepsilon)$ are smooth for $n$ large whenever $z \in S_1$ or $(y, z) \in S_2$ and $\varepsilon < \kappa_{\zeta n}$. By lemma 5 of Gørgens and Horowitz (1999), the assumptions given in Section 3 imply

\[ \sup_{z \in S_1} |\zeta_n(z_n) - \zeta(z)| = o_p(n^{-1/4}) \tag{32} \]

and

\[ \sup_{(t,z) \in S_2} |A_{jn}(t, z_n) - A_j(t, z)| = o_p(n^{-1/4}), \quad j = 0, 1, \tag{33} \]

where $A_j(t, z) = F_j(t \mid z)\zeta(z)$. Uniform convergence of $\bar A_{2n}$ follows from uniform convergence of $A_{0n}$.

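As a matter of algebra,

\[ F_n^{I}(t \mid z_n) - F(t \mid z) = E_n^{I}(t, z, z_n) + r_n^{I}(t, z, z_n), \tag{34} \]

where

\[ E_n^{I}(t, z, z_n) = \frac{1}{n} \sum_{i=1}^{n} \varphi^{I}(t, z, Y_i, M_i) K_{ni}(z_n) \tag{35} \]

and

\[ r_n^{I}(t, z, z_n) = \frac{F(t \mid z)(\zeta_n(z_n) - \zeta(z))^2}{\zeta(z)\zeta_n(z_n)} - \frac{(A_{0n}(t, z_n) - A_0(t, z))(\zeta_n(z_n) - \zeta(z))}{\zeta(z)\zeta_n(z_n)}. \tag{36} \]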
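The algebra behind Eqs. (34) to (36) can be verified numerically. The sketch below is a check under two labeled assumptions that are not spelled out in this appendix: that the estimator has the ratio form $F_n^{I} = A_{0n}/\zeta_n$ (with $F = A_0/\zeta$), and that the linear term $E_n^{I}$ equals the first-order expansion $(A_{0n} - A_0)/\zeta - F(\zeta_n - \zeta)/\zeta$. The function name and the random inputs are hypothetical.

```python
import random


def ratio_expansion(zeta, zeta_n, a0, a0_n):
    """Return (F_n - F, linear term + remainder of Eq. (36)) for one point.

    Assumes F = a0 / zeta and F_n = a0_n / zeta_n (ratio-form estimator).
    """
    f = a0 / zeta
    f_n = a0_n / zeta_n
    # Assumed form of the linear term E_n^I (first-order expansion of a ratio).
    linear = (a0_n - a0) / zeta - f * (zeta_n - zeta) / zeta
    # Remainder exactly as written in Eq. (36).
    remainder = (
        f * (zeta_n - zeta) ** 2 / (zeta * zeta_n)
        - (a0_n - a0) * (zeta_n - zeta) / (zeta * zeta_n)
    )
    return f_n - f, linear + remainder


rng = random.Random(0)
max_gap = 0.0
for _ in range(1000):
    zeta = rng.uniform(0.5, 2.0)            # density bounded away from zero
    a0 = rng.uniform(0.1, 0.9) * zeta       # A_0 = F * zeta with F in (0.1, 0.9)
    zeta_n = zeta + rng.uniform(-0.1, 0.1)  # perturbed "estimates"
    a0_n = a0 + rng.uniform(-0.1, 0.1)
    lhs, rhs = ratio_expansion(zeta, zeta_n, a0, a0_n)
    max_gap = max(max_gap, abs(lhs - rhs))
```

The identity is exact, so `max_gap` is at the level of floating point rounding. Both terms of the remainder are products of two estimation errors, which is why rates of $o_p(n^{-1/4})$ for the individual errors translate into $o_p(n^{-1/2})$ for the remainder.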



Therefore,

\[ \sup_{(t,z) \in S_2} |F_n^{I}(t \mid z_n) - F(t \mid z)| = o_p(n^{-1/4}) \tag{37} \]

and

\[ \sup_{(t,z) \in S_2} |F_n^{I}(t \mid z_n) - F(t \mid z) - E_n^{I}(t, z, z_n)| = o_p(n^{-1/2}) \tag{38} \]

follow immediately from (32) and (33).

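Following Breslow and Crowley (1974, p. 447), $H_n - H$ has the expansion

\[ H_n(t \mid z_n) - H(t \mid z) = E_n^{H}(t, z, z_n) - r_{1n}^{H}(t, z, z_n) + r_{2n}^{H}(t, z, z_n), \tag{39} \]

where

\[ E_n^{H}(t, z, z_n) = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{1(Y_i \le t) M_i}{\bar A_2(Y_i, z)} - \int_0^t \frac{1(Y_i \ge v) \, A_1(dv, z)}{\bar A_2(v, z)^2} \right) K_{ni}(z_n), \tag{40} \]

\[ r_{1n}^{H}(t, z, z_n) = \int_0^t \frac{(\bar A_{2n}(v, z_n) - \bar A_2(v, z))(A_{1n}(dv, z_n) - A_1(dv, z))}{\bar A_2(v, z)^2} \tag{41} \]

and

\[ r_{2n}^{H}(t, z, z_n) = \int_0^t \frac{(\bar A_{2n}(v, z_n) - \bar A_2(v, z))^2 \, A_{1n}(dv, z_n)}{\bar A_{2n}(v, z_n) \bar A_2(v, z)^2}. \tag{42} \]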
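The expansion in Eqs. (39) to (42) rests on an exact pointwise identity for the ratio $dA_{1n}/\bar A_{2n}$. The sketch below checks it for discrete measures, under the assumption (not restated in this appendix) that $H = \int dA_1/\bar A_2$ and $H_n = \int dA_{1n}/\bar A_{2n}$, and with the linear term written in its integral form $\int d(A_{1n} - A_1)/\bar A_2 - \int (\bar A_{2n} - \bar A_2)\,dA_1/\bar A_2^2$ instead of the sample average in Eq. (40). All names and inputs are hypothetical.

```python
import random


def hazard_expansion(da_n, da, b_n, b):
    """Discrete analogue of Eqs. (39)-(42).

    da_n, da: increments of A_1n and A_1 on a common grid
    b_n, b:   values of the at-risk functions on the same grid
    Returns (H_n - H, linear term, r1, r2).
    """
    h_n = sum(x / y for x, y in zip(da_n, b_n))
    h = sum(x / y for x, y in zip(da, b))
    linear = sum(
        (xn - x) / y - (yn - y) * x / y**2
        for xn, x, yn, y in zip(da_n, da, b_n, b)
    )
    r1 = sum(
        (yn - y) * (xn - x) / y**2 for xn, x, yn, y in zip(da_n, da, b_n, b)
    )
    r2 = sum(
        (yn - y) ** 2 * xn / (yn * y**2)
        for xn, x, yn, y in zip(da_n, da, b_n, b)
    )
    return h_n - h, linear, r1, r2


rng = random.Random(1)
grid_size = 50
da = [rng.uniform(0.005, 0.02) for _ in range(grid_size)]
b = [rng.uniform(0.3, 1.0) for _ in range(grid_size)]
da_n = [x + rng.uniform(-0.002, 0.002) for x in da]
b_n = [y + rng.uniform(-0.05, 0.05) for y in b]

difference, linear, r1, r2 = hazard_expansion(da_n, da, b_n, b)
gap = abs(difference - (linear - r1 + r2))
```

The identity holds term by term, so `gap` is zero up to rounding. As in the ratio case, $r_{1n}^{H}$ and $r_{2n}^{H}$ are second order in the estimation errors.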

Under the assumptions given in Section 3, lemma 5 of Gørgens and Horowitz (1999) implies $\sup_{(t,z) \in S_2} |E_n^{H}(t, z, z_n)| = o_p(n^{-1/4})$, lemma 2 of Gørgens (1999) implies $\sup_{(t,z) \in S_2} |r_{1n}^{H}(t, z, z_n)| = o_p(n^{-1/2})$, and $\sup_{(t,z) \in S_2} |r_{2n}^{H}(t, z, z_n)| = o_p(n^{-1/2})$ follows from Eq. (33). It follows that

\[ \sup_{(t,z) \in S_2} |H_n(t \mid z_n) - H(t \mid z)| = o_p(n^{-1/4}) \tag{43} \]

and

\[ \sup_{(t,z) \in S_2} |H_n(t \mid z_n) - H(t \mid z) - E_n^{H}(t, z, z_n)| = o_p(n^{-1/2}). \tag{44} \]

Assumption 4 implies that $H$ is continuous, so $F = 1 - \exp(-H)$ by Eq. (16). Hence

\[ \begin{aligned} F_n^{II}(t \mid z_n) - F(t \mid z) &= -\exp(\ln(1 - F_n^{II}(t \mid z_n))) + \exp(-H(t \mid z)) \\ &= \exp(-H(t \mid z))(H_n(t \mid z_n) - H(t \mid z)) \\ &\quad - \tfrac{1}{2} \exp(-H_n^{*}(t \mid z_n))(H_n(t \mid z_n) - H(t \mid z))^2 \\ &\quad - \exp(-H_n^{**}(t \mid z_n))(\ln(1 - F_n^{II}(t \mid z_n)) + H_n(t \mid z_n)), \end{aligned} \tag{45} \]

where $H_n^{*}(t \mid z_n)$ is between $H_n(t \mid z_n)$ and $H(t \mid z)$, and $H_n^{**}(t \mid z_n)$ is between $-\ln(1 - F_n^{II}(t \mid z_n))$ and $H_n(t \mid z_n)$. An argument similar to the one given by Breslow and Crowley (1974) for their lemma 1 shows that

\[ \sup_{(t,z) \in S_2} |\ln(1 - F_n^{II}(t \mid z_n)) + H_n(t \mid z_n)| = o_p(n^{-1/2}). \tag{46} \]


It then follows from (43), (45), and (46) that

\[ \sup_{(t,z) \in S_2} |F_n^{II}(t \mid z_n) - F(t \mid z)| = o_p(n^{-1/4}). \tag{47} \]

Moreover,

\[ F_n^{II}(t \mid z_n) - F(t \mid z) = E_n^{II}(t, z, z_n) + r_n^{II}(t, z, z_n), \tag{48} \]

where

\[ E_n^{II}(t, z, z_n) = \frac{1}{n} \sum_{i=1}^{n} \varphi^{II}(t, z, Y_i, M_i) K_{ni}(z_n) \tag{49} \]

and

\[ \begin{aligned} r_n^{II}(t, z, z_n) &= -\exp(-H(t \mid z))(r_{1n}^{H}(t, z, z_n) - r_{2n}^{H}(t, z, z_n)) \\ &\quad - \tfrac{1}{2} \exp(-H_n^{*}(t \mid z_n))(H_n(t \mid z_n) - H(t \mid z))^2 \\ &\quad - \exp(-H_n^{**}(t \mid z_n))(\ln(1 - F_n^{II}(t \mid z_n)) + H_n(t \mid z_n)). \end{aligned} \tag{50} \]

Therefore,

\[ \sup_{(t,z) \in S_2} |F_n^{II}(t \mid z_n) - F(t \mid z) - E_n^{II}(t, z, z_n)| = o_p(n^{-1/2}) \tag{51} \]

follows from Eqs. (43), (44), and (46).
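The role of Eq. (46) can be illustrated in the simplest unweighted case. The sketch below is not the estimator of this paper: it drops the kernel weights $K_{ni}$ and assumes that $F_n^{II}$ is of product-limit form and $H_n$ of cumulative-hazard (sum of $1/\text{at-risk}$) form, in the spirit of Breslow and Crowley (1974). The simulated design, the seed, and the function name are hypothetical.

```python
import math
import random


def product_limit_and_hazard(times, uncensored, n_steps):
    """Unweighted product-limit estimate of F and cumulative hazard estimate.

    Processes the n_steps smallest observations; assumes no ties.
    """
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    survival = 1.0
    hazard = 0.0
    for rank, i in enumerate(order[:n_steps]):
        at_risk = n - rank
        if uncensored[i]:
            survival *= 1.0 - 1.0 / at_risk
            hazard += 1.0 / at_risk
    return 1.0 - survival, hazard


rng = random.Random(0)
n = 200
latent = [rng.expovariate(1.0) for _ in range(n)]
censor = [rng.expovariate(0.25) for _ in range(n)]
times = [min(y, c) for y, c in zip(latent, censor)]
uncensored = [y <= c for y, c in zip(latent, censor)]

# Use the first 150 order statistics, so at least 51 observations stay at risk.
f_pl, h_n = product_limit_and_hazard(times, uncensored, n_steps=150)
gap = -(math.log(1.0 - f_pl) + h_n)

# Each event contributes -ln(1 - 1/k) - 1/k <= 1 / (2k(k - 1)); summing over
# k = 51, ..., 200 telescopes to (1/50 - 1/200) / 2.
bound = 0.5 * (1.0 / 50.0 - 1.0 / 200.0)
```

The gap between $-\ln(1 - F_n)$ and $H_n$ is nonnegative and bounded by a sum of squared jump sizes, which is of smaller order than the estimation error in $H_n$ itself. This is the mechanism that makes the last term in Eqs. (45) and (50) negligible.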

Proof of Theorem 1(a)

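For $j = I, II$ define

\[ \varphi_\Lambda^{j}(t, z, y, m) = W_\Lambda(F(t \mid z), t)\,\varphi^{j}(t, z, y, m) - W_\Lambda(F(t_0 \mid z), t)\,\varphi^{j}(t_0, z, y, m). \tag{52} \]

In the following the superscript $j$ is suppressed. Under the assumptions given in Section 3, with a minor modification lemma 10 in appendix A.2 of Gørgens and Horowitz (1999) implies that

\[ \sup_{t \in T_\Lambda} \left| P_n \Phi_\Lambda(t) - \frac{1}{n} \sum_{i=1}^{n} \int_{-\infty}^{\infty} \varphi_\Lambda(t, z, Y_i, M_i) K_{ni}(z) \, dz \right| = o_p(n^{-1/2}). \tag{53} \]

To prove Theorem 1(a) it remains to be shown that

\[ \sup_{t \in T_\Lambda} \left| \Lambda_n(t) - \Lambda(t) - \frac{1}{n} \sum_{i=1}^{n} \int_{-\infty}^{\infty} \varphi_\Lambda(t, z, Y_i, M_i) K_{ni}(z) \, dz \right| = o_p(n^{-1/2}). \tag{54} \]

Define

\[ L_n(t, t^{*}) = \int_0^1 W_\Lambda(q, t)(G_n(t^{*}, q) - G(t^{*}, q)) \, dq, \tag{55} \]

then $\Lambda_n(t) - \Lambda(t) = L_n(t, t) - L_n(t, t_0)$. By definition of $\varphi_\Lambda$ and $E_n$,

\[ \frac{1}{n} \sum_{i=1}^{n} \varphi_\Lambda(t, z, Y_i, M_i) K_{ni}(z) = W_\Lambda(F(t \mid z), t) E_n(t, z, z) - W_\Lambda(F(t_0 \mid z), t) E_n(t_0, z, z). \tag{56} \]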



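Therefore, (54) follows if

\[ \sup_{(t,t^{*}) \in T_\Lambda \times T_\Lambda} \left| L_n(t, t^{*}) - \int_{-\infty}^{\infty} W_\Lambda(F(t^{*} \mid z), t) E_n(t^{*}, z, z) \, dz \right| = o_p(n^{-1/2}). \tag{57} \]

Define $J(q, t) = \int_0^q W_\Lambda(v, t) \, dv$. By integration by parts (using the assumption that $W_\Lambda(\cdot, t)$ vanishes outside $[q_0, q_1]$) and a change of variables,

\[ L_n(t, t^{*}) = \int_{-\infty}^{\infty} (J(F_n(t^{*} \mid z), t) - J(F(t^{*} \mid z), t)) \, dz. \tag{58} \]

(This result is similar to Hardle et al., 1988, p. 1438.) Now the mean value theorem implies

\[ L_n(t, t^{*}) = \int_{-\infty}^{\infty} W_\Lambda(F(t^{*} \mid z), t)(F_n(t^{*} \mid z) - F(t^{*} \mid z)) \, dz + R_{1n}(t, t^{*}), \tag{59} \]

where

\[ R_{1n}(t, t^{*}) = \int_{-\infty}^{\infty} (W_\Lambda(F_n^{*}(t^{*} \mid z), t) - W_\Lambda(F(t^{*} \mid z), t))(F_n(t^{*} \mid z) - F(t^{*} \mid z)) \, dz \tag{60} \]

and $F_n^{*}(t^{*} \mid z)$ is between $F_n(t^{*} \mid z)$ and $F(t^{*} \mid z)$. Since $D_1 W_\Lambda$ is bounded by $c_W$, say, and $W_\Lambda(\cdot, t)$ has bounded support,

\[ \int_{-\infty}^{\infty} |D_1 W_\Lambda(F_n^{*}(t^{*} \mid z), t)| \, dz \le c_W (G(t, q_1) - G(t, q_0) + 2\varepsilon) \tag{61} \]

for all sufficiently large $n$ and any $\varepsilon > 0$. By Lipschitz continuity of $W_\Lambda(\cdot, t)$ and Eq. (37) or (47), it therefore follows that $\sup_{(t,t^{*}) \in T_\Lambda \times T_\Lambda} |R_{1n}(t, t^{*})| = o_p(n^{-1/2})$. Substituting from (34) or (48) gives

\[ L_n(t, t^{*}) = \int_{-\infty}^{\infty} W_\Lambda(F(t^{*} \mid z), t) E_n(t^{*}, z, z) \, dz + R_{1n}(t, t^{*}) + R_{2n}(t, t^{*}), \tag{62} \]

where

\[ R_{2n}(t, t^{*}) = \int_{-\infty}^{\infty} W_\Lambda(F(t^{*} \mid z), t) \, r_n(t^{*}, z, z) \, dz. \tag{63} \]

Since $W_\Lambda$ is bounded, $W_\Lambda(\cdot, t)$ has bounded support, and since $W_\Lambda(F(t^{*} \mid z), t) \ne 0$ implies $\zeta(z) > c_\zeta$, it follows that $\sup_{(t,t^{*}) \in T_\Lambda \times T_\Lambda} |R_{2n}(t, t^{*})| = o_p(n^{-1/2})$, and Eq. (57) is proved.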

Proof of Theorem 2(a)

The proof of Theorem 2(a) is simpler than that of Theorem 1(a), because the weight function only appears once in the formula for $\Psi_n$ and because the dependence of $\Psi_n$ on $\Lambda_n$ is already taken into account in lemma 10 of Gørgens and Horowitz (1999). For $j = I, II$ define

\[ \varphi_\Psi^{j}(t, v, y, m) = W_\Psi(v, t)\,\varphi^{j}(v, \Lambda(v) - t, y, m). \tag{64} \]

Suppressing the superscript $j$ indicating estimator I or II, lemma 10 in appendix A.2 of Gørgens and Horowitz (1999) implies

\[ \sup_{t \in T_\Psi} \left| P_n \Phi_\Psi(t) - \frac{1}{n} \sum_{i=1}^{n} \int_{T_\Lambda} \varphi_\Psi(t, v, Y_i, M_i) K_{ni}(\Lambda_n(v) - t) \, dv \right| = o_p(n^{-1/2}). \tag{65} \]


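Therefore, it remains to be shown that

\[ \sup_{t \in T_\Psi} \left| \Psi_n(t) - \Psi(t) - \frac{1}{n} \sum_{i=1}^{n} \int_{T_\Lambda} \varphi_\Psi(t, v, Y_i, M_i) K_{ni}(\Lambda_n(v) - t) \, dv \right| = o_p(n^{-1/2}). \tag{66} \]

By the approximation (34) or (48),

\[ \begin{aligned} \Psi_n(t) - \Psi(t) &= \int_{T_\Lambda} W_\Psi(v, t)(F_n(v \mid \Lambda_n(v) - t) - F(v \mid \Lambda(v) - t)) \, dv \\ &= \frac{1}{n} \sum_{i=1}^{n} \int_{T_\Lambda} \varphi_\Psi(t, v, Y_i, M_i) K_{ni}(\Lambda_n(v) - t) \, dv + R_{3n}(t), \end{aligned} \tag{67} \]

where

\[ R_{3n}(t) = \int_{T_\Lambda} W_\Psi(v, t) \, r_n(v, \Lambda(v) - t, \Lambda_n(v) - t) \, dv. \tag{68} \]

Since the support of $W_\Psi(\cdot, t)$ is bounded, uniform convergence of $r_n$ (Eq. (38) or (51)) implies $\sup_{t \in T_\Psi} |R_{3n}(t)| = o_p(n^{-1/2})$. Eq. (66) and hence Theorem 2(a) follow.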

