This article was downloaded by: [The UC Irvine Libraries] On: 03 November 2014, At: 02:15. Publisher: Taylor & Francis. Informa Ltd Registered in England and Wales Registered Number: 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Journal of Nonparametric Statistics. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/gnst20
Semiparametric estimation of censored transformation models. Tue Gørgens, Economics RSSS, Australian National University, Canberra, ACT, 0200, Australia. Published online: 27 Oct 2010.
To cite this article: Tue Gørgens (2003) Semiparametric estimation of censored transformation models, Journal of Nonparametric Statistics, 15:3, 377-393, DOI: 10.1080/1048525031000120224
To link to this article: http://dx.doi.org/10.1080/1048525031000120224
Nonparametric Statistics, Vol. 15(3), June 2003, pp. 377–393
SEMIPARAMETRIC ESTIMATION OF CENSORED TRANSFORMATION MODELS
TUE GØRGENS*
Economics RSSS, Australian National University, Canberra ACT 0200, Australia
(Received March 2002; Revised January 2003; In final form January 2003)
Many widely used models, including proportional hazards models with unobserved heterogeneity, can be written in the form $\Lambda(Y) = \min[\beta'X + U, C]$, where $\Lambda$ is an unknown increasing function, the error term $U$ has unknown distribution function $\Psi$ and is independent of $X$, $C$ is a random censoring threshold, and $U$ and $C$ are conditionally independent given $X$. This paper develops new $n^{1/2}$-consistent and asymptotically normal semiparametric estimators of $\Lambda$ and $\Psi$ which are easier to use than previous estimators. Moreover, Monte Carlo results suggest that the mean integrated squared error of predictions based on the new estimators is lower than for previous estimators.
Keywords: Semiparametric estimation; Kernel regression; Transformation model; Unobserved heterogeneity; Duration analysis; Censoring
JEL Classification: C13, C14, C24, C41
1 INTRODUCTION
Many widely used models, including for example loglinear models, accelerated failure-time models, Box-Cox models, proportional hazards models, and mixed proportional hazards models, can be written in the form $\Lambda(Y) = \beta'X + U$ or, for right-censored data,
$$\Lambda(Y) = \min[\beta'X + U, C], \tag{1}$$
where $\Lambda$ is an unknown strictly increasing real function, $X$ is a random vector of explanatory variables, $\beta$ is an unknown vector of parameters, $U$ is an unobserved stochastic disturbance term, and $C$ is a random censoring threshold. Throughout the paper the uncensored transformation model will be treated as a special case of (1) where $C = \infty$ with probability 1. The following assumptions are standard in the literature and maintained throughout the paper: $U$ is independent of $X$, $U$ and $C$ are conditionally independent given $X$, and the distribution of the index $\beta'X$ is absolutely continuous. Let $\Psi$ denote the distribution function of $U$.
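To make model (1) concrete, the following is a minimal simulation sketch. It is not taken from the paper: the function name `simulate_censored_transformation`, the scalar regressor, and the normal distributions for $X$, $U$ and $C$ are illustrative assumptions chosen only so that the maintained independence conditions hold.

```python
import numpy as np

def simulate_censored_transformation(n, beta, lam_inv, mu_c=np.inf, seed=0):
    """Draw (X, Y, M) from Lambda(Y) = min(beta*X + U, C) with scalar X.

    Illustrative assumptions (not the paper's general setup):
    X ~ N(0, 1), U ~ N(0, 1), C ~ N(mu_c, 1) drawn independently of (X, U).
    mu_c = inf corresponds to C = infinity, i.e. uncensored data.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    u = rng.standard_normal(n)
    if np.isinf(mu_c):
        c = np.full(n, np.inf)
    else:
        c = mu_c + rng.standard_normal(n)
    latent = beta * x + u                 # Lambda(Y*) = beta*X + U
    m = (latent <= c).astype(int)         # M = 1 when the observation is uncensored
    y = lam_inv(np.minimum(latent, c))    # Y = Lambda^{-1}(min(beta*X + U, C))
    return x, y, m

# Lambda(t) = ln(t), so Lambda^{-1} = exp and Y is a positive duration.
x, y, m = simulate_censored_transformation(1000, 1.0, np.exp, mu_c=1.5)
x_u, y_u, m_u = simulate_censored_transformation(1000, 1.0, np.exp)
```

With these hypothetical settings roughly one observation in five is censored; setting `mu_c` to infinity reproduces the uncensored special case.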
Heckman and Singer (1984) proposed a semiparametric estimator of $\Psi$ assuming a parametric specification of $\Lambda$. Murphy (1994; 1995) and Nielsen et al. (1992) developed
* E-mail: [email protected]
ISSN 1048-5252 print; ISSN 1029-0311 online © 2003 Taylor & Francis Ltd. DOI: 10.1080/1048525031000120224
semiparametric estimators of $\Lambda$ assuming a parametric specification of $\Psi$. Semiparametric estimators of $\Lambda$ and $\Psi$, which do not assume that either $\Lambda$ or $\Psi$ belongs to a known finite-dimensional parametric family of functions, were first developed by Horowitz (1996) for uncensored data and extended to censored data by Gørgens and Horowitz (1999). These estimators are $n^{1/2}$-consistent and asymptotically normal, and they perform well in Monte Carlo experiments. However, their practical use is hampered by the complexity of their implementation, as they require the researcher to choose weight functions, kernel functions, and bandwidths.
This paper proposes new semiparametric estimators of $\Lambda$ and $\Psi$, which are easier to
implement because they require only one kernel and one bandwidth whereas the previous
estimators require two of each. In applications, bandwidths are often determined by cross-
validation methods or by visual inspection of the estimates. In either case, it is necessary to
evaluate the estimator at many different bandwidths. Reducing the number of bandwidth
parameters from two to one simplifies application substantially. Thus the paper takes a sig-
nificant step towards making semiparametric estimation of transformation models practical
in empirical applications. (GAUSS programs are available from the author upon request.)
Moreover, the new estimators perform better than previous estimators in a small set of
Monte Carlo experiments, in the sense that they lead to predictions of Y which have
lower mean integrated squared errors.
The censoring scheme assumed here is so-called random censorship, where C is random
and conditionally independent of U given X. Random censoring is often encountered in prac-
tice. An example is duration studies in which spells begin at random times and the termina-
tion date of the study is predetermined. Other special cases of random censorship include
type I or "time" censoring, where $C = \bar{c}$ for some constant $\bar{c}$, and the case of no censoring,
where $C = \infty$. Examples of censoring schemes not included are type II or "order statistic"
censoring, where sampling continues until a predetermined number of uncensored observa-
tions has been collected, and schemes where censoring depends on the value of Y. Note that
some censoring schemes may be reduced to type I censoring by introducing additional (arti-
ficial) censoring.
Both the previous and the new estimators require that $\beta$ be estimated before $\Lambda$ and $\Psi$. No new estimator of $\beta$ is proposed in this paper. Model (1) is a single-index model if $C$ is independent of $X$, which includes the uncensored transformation case, or if $C$ depends on $X$ only through the index $\beta'X$. Methods for estimating $\beta$ (up to scale) in single-index models have been devised by Han (1987), Härdle and Stoker (1989), Horowitz and Härdle (1996), Ichimura (1993), Powell et al. (1989), Sherman (1993), and Ai (1997). An estimator of $\beta$ when $C$ may depend on $X$ other than through $\beta'X$ was proposed by Gørgens (1999). Currently this is the only estimator available for the general case.
The paper is organized as follows. The estimators are developed in Section 2, and their
limiting distributions are discussed in Section 3. Monte Carlo results are presented in
Section 4, and Section 5 concludes. Proofs of theorems are relegated to the Appendix.
2 ESTIMATORS
The data available for estimation are assumed to consist of independent observations of $X$, $Y$, and an indicator of no censoring,
$$M = \begin{cases} 1 & \text{if } X'\beta + U \le C, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$
Scale and location normalizations are needed to identify the unknowns $\beta$, $\Lambda$, and $\Psi$. Identification of $\beta$ (up to a scale parameter) requires that $X$ have at least one component which is continuously distributed conditional on the others and whose coefficient is nonzero (see for instance Ichimura, 1993). Let this be the first component of $X$. For scale normalization, let the first component of $\beta$ be either $1$ or $-1$, and normalize the location of the model by setting $\Lambda(t_0) = \Lambda_0$, where $t_0$ and $\Lambda_0$ are arbitrary constants ($t_0$ must be in the interval on which $\Lambda$ is estimated, see below).
Similarly to the estimators of Horowitz (1996) and Gørgens and Horowitz (1999), the new estimators are based on the analog principle. The first step in constructing the estimators is to express $\Lambda$ and $\Psi$ in terms of the density $\zeta$ of $Z = \beta'X$ and the conditional distribution function $F$ of the uncensored $Y$ given $Z$. The second step consists of replacing $\zeta$ and $F$ in these expressions by appropriate kernel estimators. What distinguishes the new from the previous estimators is the choice of estimating expressions in the first step: the new estimators are based on quantiles whereas the previous estimators are based on derivatives. This means that the estimator of $F(y \mid z)$ in the second step need not be differentiable with respect to $y$ (although differentiability with respect to $z$ is still required in order to eliminate asymptotic bias). Consequently, no smoothing in the $y$-direction is needed and therefore the new estimators require only one bandwidth and kernel function, whereas previous estimators require two of each.
Estimators are proposed for two cases. In case I the data are assumed to be either uncen-
sored or type I censored, but the range of Y is unrestricted. This is similar to Horowitz (1996).
In case II Y must be nonnegative, but a general random censoring mechanism is allowed. This
case is relevant for many studies of duration data and was considered by Gørgens and
Horowitz (1999). The proposed new estimators coincide with each other when the two
cases overlap, that is, when the data are uncensored or type I censored and Y is nonnegative.
The estimating expressions for $\Lambda$ and $\Psi$ are derived in Section 2.1; estimation of $\zeta$ and $F$ is
discussed in Section 2.2 for case I and in Section 2.3 for case II.
2.1 Estimating Equations
Define $Z = \beta'X$ and define the uncensored, latent variable $Y^*$ by $\Lambda(Y^*) = Z + U$. Let $\zeta$ denote the density of $Z$, and define $F(t \mid z) = \Pr(Y^* \le t \mid Z = z)$. In this section $\Lambda$ and $\Psi$ are expressed as functionals of $\zeta$ and $F$.
Let $I_Y$ be the support of $Y$. If the data are type I censored at $C = \bar{c}$, then $\bar{y} = \Lambda^{-1}(\bar{c})$ is an upper bound of $I_Y$. If the data are uncensored, $I_Y$ may be all of $\mathbb{R}$. Assuming that $\Lambda$ is strictly increasing and that $U$ and $X$ are independent, then for $t$ in $I_Y$
$$F(t \mid z) = \Psi(\Lambda(t) - z), \tag{3}$$
where $\Psi$ is the distribution function of $U$. (All expressions conditional on $Z = z$ are valid for $z$ such that $\zeta(z) > 0$ unless otherwise stated.) Let $\Psi^{-1}(q) = \inf\{u \in \mathbb{R}: \Psi(u) \ge q\}$ be the $q$th quantile of $U$, and define the generalized inverse of $F(t \mid \cdot)$ by
$$G(t, q) = \sup\{z \in \mathbb{R}: \zeta(z) > 0,\ F(t \mid z) \ge q\} = \sup\{z \in \mathbb{R}: \zeta(z) > 0,\ z \le \Lambda(t) - \Psi^{-1}(q)\}. \tag{4}$$
The set on the right-hand side of (4) may be empty, in which case $G(t, q)$ is infinite. The second line shows that $G(t, q)$ is well-defined and equals $\Lambda(t) - \Psi^{-1}(q)$ if $\zeta(\Lambda(t) - \Psi^{-1}(q)) > 0$. To write this compactly, for any $\epsilon \ge 0$ define the set
$$S_D(\epsilon) = \{(t, q) \in I_Y \times (0, 1): \zeta(\Lambda(t) - \Psi^{-1}(q)) > \epsilon\}. \tag{5}$$
Then $G(t, q) = \Lambda(t) - \Psi^{-1}(q)$ for $(t, q) \in S_D(0)$, and it follows that
$$\Lambda(t) = \Lambda_0 + G(t, q) - G(t_0, q), \qquad (t, q), (t_0, q) \in S_D(0). \tag{6}$$
Given $t_0$ and $q$, Eqs. (4) and (6) determine $\Lambda$ as a function of $\zeta$ and $F$. Essentially, the new estimator of $\Lambda$ is obtained by replacing $\zeta$ and $F$ in (4) by sample analogs and substituting the resulting estimator $G_n$ for $G$ in (6).
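The step from (4) to (6) can be spelled out in one line. The display below is only a worked restatement of the identities already given, using the normalization $\Lambda(t_0) = \Lambda_0$; it introduces no new assumptions.

```latex
% For (t, q) and (t_0, q) in S_D(0), the quantile term cancels:
\begin{aligned}
G(t, q) - G(t_0, q)
  &= \bigl[\Lambda(t) - \Psi^{-1}(q)\bigr] - \bigl[\Lambda(t_0) - \Psi^{-1}(q)\bigr] \\
  &= \Lambda(t) - \Lambda(t_0) = \Lambda(t) - \Lambda_0 .
\end{aligned}
```

Because $\Psi^{-1}(q)$ drops out, the same value of $\Lambda(t)$ is obtained for every admissible $q$, which is what makes averaging over $q$ possible.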
Estimation of $\zeta$ and $F$ is discussed in Sections 2.2 and 2.3. Given estimators $\zeta_n$ and $F_n$, estimate $G$ by
$$G_n(t, q) = \sup\{z \in \mathbb{R}: \zeta_n(z) > 0,\ F_n(t \mid z) \ge q\}. \tag{7}$$
Under fairly weak conditions these estimators converge to their population counterparts, and hence $\Lambda_0 + G_n(t, q) - G_n(t_0, q)$ is a consistent estimator of $\Lambda(t)$. However, the optimal rate of convergence of $\zeta_n$, $F_n$, and hence $G_n$, depends on the smoothness of $\Lambda$, $\Psi$, and $\zeta$, and $n^{1/2}$-convergence cannot be achieved (see for instance Stone, 1980; Härdle et al., 1988). Therefore, for any given value of $q$, simply replacing $G(t, q)$ by $G_n(t, q)$ in (6) will not result in an $n^{1/2}$-consistent estimator of $\Lambda(t)$. Fortunately, $n^{1/2}$-consistency can be obtained by averaging over a range of $q$-values. Since the estimators of $F$ involve random denominators (essentially $1/\zeta_n$), they converge uniformly only on sets of the form $\{(t, z) \in \mathbb{R}^2: \zeta(z) > c_z\}$, where $c_z > 0$ is a small constant. Therefore, $G_n$ is uniformly consistent only on $S_D(c_z)$, as opposed to $S_D(0)$. This implies that averaging should take place only over values of $q$ such that $\zeta(\Lambda(t) - \Psi^{-1}(q)) > c_z$.
Now to define the new estimator of $\Lambda$, choose an interval $T_\Lambda \subseteq I_Y$ on which $\Lambda$ is to be estimated, and let $W_\Lambda$ be a weight function such that $W_\Lambda(\cdot, t)$ is supported on $(0, 1)$ and integrates to 1 for all $t \in T_\Lambda$. Furthermore, the support of $W_\Lambda$ must be chosen to ensure that $(t, q)$ and $(t_0, q)$ are in $S_D(c_z)$ whenever $t \in T_\Lambda$ and $W_\Lambda(q, t) \ne 0$. In other words, $W_\Lambda(q, t) \ne 0$ should imply $\zeta(G(t, q)) > c_z$ and $\zeta(G(t_0, q)) > c_z$. An example of a weight function is given in the Monte Carlo section. Define the new estimator $\Lambda_n$ by
$$\Lambda_n(t) = \Lambda_0 + \int_0^1 W_\Lambda(q, t)\,\bigl(G_n(t, q) - G_n(t_0, q)\bigr)\, dq. \tag{8}$$
Under assumptions stated in Section 3, $\Lambda_n$ is uniformly $n^{1/2}$-consistent on $T_\Lambda$ and asymptotically normally distributed.
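A rough numerical sketch of Eqs. (7) and (8) is given below. It is an illustration under simplifying assumptions rather than the author's GAUSS implementation: the supremum in (7) is taken over a finite grid of $z$ values, $F_n$ is a Nadaraya-Watson estimate with a nonnegative biweight kernel instead of a fourth-order kernel, the weight $W_\Lambda$ is uniform over a grid of $q$ values, and the names `g_n`, `lambda_n`, `z_grid`, `q_grid` and the trimming constant `c_z = 0.05` are hypothetical.

```python
import numpy as np

def biweight(v):
    """Second-order kernel supported on [-1, 1] (illustrative choice)."""
    v = np.asarray(v, dtype=float)
    return np.where(np.abs(v) <= 1, 15.0 / 16.0 * (1 - v**2) ** 2, 0.0)

def g_n(t, q, index, y, bandwidth, z_grid, c_z=0.05):
    """Grid version of Eq. (7), trimmed to grid points with zeta_n(z) > c_z."""
    w = biweight((index[None, :] - z_grid[:, None]) / bandwidth) / bandwidth
    dens = w.mean(axis=1)                      # zeta_n(z) at each grid point
    ok = dens > c_z
    cdf = np.full(z_grid.shape, -np.inf)       # -inf marks trimmed grid points
    cdf[ok] = (w[ok] * (y[None, :] <= t)).mean(axis=1) / dens[ok]
    hits = np.flatnonzero(cdf >= q)
    # z_grid is increasing, so the last hit is the largest admissible z.
    return z_grid[hits[-1]] if hits.size else -np.inf

def lambda_n(t, t0, lam0, index, y, bandwidth, z_grid, q_grid):
    """Eq. (8) with a weight that is uniform over the points of q_grid."""
    diffs = [g_n(t, q, index, y, bandwidth, z_grid)
             - g_n(t0, q, index, y, bandwidth, z_grid) for q in q_grid]
    return lam0 + float(np.mean(diffs))

# Uncensored toy design: Lambda(t) = t, beta = 1, U ~ N(0, 1), Lambda(0) = 0.
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
y = x + rng.standard_normal(2000)
z_grid = np.linspace(-3.0, 3.0, 241)
q_grid = np.linspace(0.4, 0.6, 21)
lam_hat = lambda_n(1.0, 0.0, 0.0, x, y, 0.5, z_grid, q_grid)
```

In this toy design the true value is $\Lambda(1) = 1$, so `lam_hat` should land near 1. Restricting `q_grid` to central quantiles is the grid analog of requiring $(t, q)$ and $(t_0, q)$ to lie in $S_D(c_z)$.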
Turning now to estimation of $\Psi$, it follows from Eq. (3) that
$$\Psi(u) = F(y \mid \Lambda(y) - u), \qquad (y, u): \zeta(\Lambda(y) - u) > 0. \tag{9}$$
Once $F$ and $\Lambda$ are estimated, Eq. (9) can be used to estimate $\Psi$. More specifically, to estimate $\Psi$, choose an interval $T_\Psi \subseteq \mathbb{R}$ on which $\Psi$ is to be estimated, and let $W_\Psi$ be a weight function such that $W_\Psi(\cdot, u)$ integrates to 1 for each $u \in T_\Psi$. In addition, since $\Lambda$ is only estimated on $T_\Lambda$ and $F_n(t \mid z)$ is consistent only if $\zeta(z) > c_z$, the support of $W_\Psi(\cdot, u)$ must be chosen such that $W_\Psi(y, u) \ne 0$ implies $y \in T_\Lambda$ and $\zeta(\Lambda(y) - u) > c_z$. Define the new estimator of $\Psi$ by
$$\Psi_n(u) = \int_{T_\Lambda} W_\Psi(y, u)\, F_n(y \mid \Lambda_n(y) - u)\, dy. \tag{10}$$
It is shown in Section 3 that $\Psi_n$ is $n^{1/2}$-consistent and asymptotically normal.
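The averaging in Eq. (10) can be sketched as follows. To keep the example short it plugs in population objects for $F_n$ and $\Lambda_n$ (a lognormal-type design chosen purely for illustration) and uses a uniform weight over a grid of $y$ values; the names `psi_n`, `cdf_hat`, `lam_hat` and `y_grid` are hypothetical.

```python
import math
import numpy as np

def std_normal_cdf(v):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def psi_n(u, cdf_hat, lam_hat, y_grid):
    """Eq. (10) with a weight that is uniform over the points of y_grid.

    cdf_hat(t, z) plays the role of F_n(t | z) and lam_hat(y) that of Lambda_n(y).
    """
    vals = [cdf_hat(yv, lam_hat(yv) - u) for yv in y_grid]
    return float(np.mean(vals))

# Illustrative population objects: Lambda(t) = ln(t) and U ~ N(0, 1),
# so that F(t | z) = Phi(ln(t) - z) by Eq. (3).
true_cdf = lambda t, z: std_normal_cdf(math.log(t) - z)
value = psi_n(0.3, true_cdf, math.log, np.linspace(0.5, 2.0, 16))
```

With exact inputs every term in the average equals $\Psi(0.3)$, which is just Eq. (9); with estimated inputs the average smooths out the pointwise estimation error.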
This completes the first step in developing the new estimators: Eqs. (7), (8) and (10) give estimators $\Lambda_n$ and $\Psi_n$ in terms of $\zeta_n$ and $F_n$. The second step is to estimate $\zeta$ and $F$, which is discussed in the next two subsections.
2.2 Estimating F: Case I
In the case of uncensored data or type I censored data, $\zeta$ and $F$ can be estimated by the Rosenblatt-Parzen and the Nadaraya-Watson kernel estimators. Let $\{(X_i, Y_i, M_i)\}_{i=1}^n$ be a random sample. Choose an estimator $\beta_n$, say, from the list given in the introduction. Choose also a bandwidth $k_{zn}$ and a kernel function $K_z$; that is, let $k_{zn}$ be a positive real number and let $K_z$ be a function which integrates to 1. To achieve the bias reduction necessary to attain $n^{1/2}$-convergence, the kernel function must be of fourth order and twice differentiable. Other (standard) regularity conditions that $\beta_n$, $k_{zn}$, and $K_z$ must satisfy are given in Section 3. Define
$$K_{ni}(z) = \frac{1}{k_{zn}} K_z\!\left(\frac{\beta_n'X_i - z}{k_{zn}}\right). \tag{11}$$
Let $1(A)$ be the indicator function of the event $A$. The Nadaraya-Watson estimator of $F(t \mid z)$ is
$$F_n^{I}(t \mid z) = \frac{A_{0n}(t, z)}{\zeta_n(z)}, \tag{12}$$
where
$$A_{0n}(t, z) = \frac{1}{n} \sum_{i=1}^n K_{ni}(z)\, 1(Y_i \le t) \tag{13}$$
and $\zeta_n$ is the Rosenblatt-Parzen density estimator,
$$\zeta_n(z) = \frac{1}{n} \sum_{i=1}^n K_{ni}(z). \tag{14}$$
The function $F_n^{I}(\cdot \mid z)$ is a right-continuous step-function which jumps (up or down) by $K_{ni}$ at each $Y_i$, whereas $F_n^{I}(t \mid \cdot)$ is a continuous and differentiable function if $K_z$ is.
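Eqs. (11) to (14) translate almost directly into code. The sketch below assumes a scalar index that is already computed (the array `index` stands for the values $\beta_n'X_i$) and uses the fourth-order kernel that the paper later gives as Eq. (31); the function names and the toy data are illustrative assumptions.

```python
import numpy as np

def muller_kernel(v):
    """Fourth-order kernel on [-1, 1], as in Eq. (31) of the Monte Carlo section."""
    v = np.asarray(v, dtype=float)
    poly = 1 - 5 * v**2 + 7 * v**4 - 3 * v**6
    return np.where(np.abs(v) <= 1, 105.0 / 64.0 * poly, 0.0)

def kernel_weights(index, z, bandwidth):
    """K_ni(z) of Eq. (11); index holds the values beta_n' X_i."""
    return muller_kernel((index - z) / bandwidth) / bandwidth

def density_estimate(index, z, bandwidth):
    """Rosenblatt-Parzen estimate zeta_n(z), Eq. (14)."""
    return float(kernel_weights(index, z, bandwidth).mean())

def cdf_case1(t, z, index, y, bandwidth):
    """Nadaraya-Watson estimate F_n^I(t | z), Eqs. (12) and (13)."""
    w = kernel_weights(index, z, bandwidth)
    return float(np.sum(w * (y <= t)) / np.sum(w))

# Toy uncensored data with Lambda(t) = t, beta = 1 and U ~ N(0, 1),
# so that F(0 | 0) = 0.5 and zeta(0) is the standard normal density at 0.
rng = np.random.default_rng(1)
x = rng.standard_normal(2000)
y = x + rng.standard_normal(2000)
f_hat = cdf_case1(0.0, 0.0, x, y, bandwidth=1.0)
zeta_hat = density_estimate(x, 0.0, bandwidth=1.0)
```

Because a fourth-order kernel takes negative values, individual weights can be negative, which is why the step function may jump down as well as up.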
2.3 Estimating F: Case II
Now consider the case of general random censoring, but assume that $Y$ is nonnegative. The Rosenblatt-Parzen estimator of the density $\zeta$ of $Z$ is unaffected by the censoring of $Y$, but with random censoring the Nadaraya-Watson estimator of $F$ is inconsistent. Fortunately another estimator is available, namely, the conditional product-limit estimator (also known as the conditional Kaplan-Meier estimator).
The following results are standard in duration analysis (see for instance Kalbfleisch and Prentice, 1980; Gill and Johansen, 1990). Define $F_1(t \mid z) = \Pr(Y \le t, M = 1 \mid Z = z)$ and $\bar{F}_2(t \mid z) = \Pr(Y \ge t \mid Z = z)$. Assuming that $C$ and $U$ are conditionally independent given $X$, the conditional integrated hazard function $H(\cdot \mid z)$ of $Y^*$ given $Z = z$ is (using the Stieltjes integral)
$$H(t \mid z) = \int_0^t \frac{F_1(dy \mid z)}{\bar{F}_2(y \mid z)}, \tag{15}$$
and the conditional distribution function of $Y^*$ given $Z = z$ is
$$F(t \mid z) = 1 - \exp(-H^c(t \mid z)) \prod_{y \in D_H} \bigl(1 - H^d(y \mid z)\bigr)^{1(y \le t)}, \tag{16}$$
where $H^c(\cdot \mid z)$ is the continuous component of $H(\cdot \mid z)$, $H^d(y \mid z) = H(y \mid z) - H(y- \mid z)$, and $D_H$ denotes the set of discontinuity points of $H$.
Let $\{(X_i, Y_i, M_i)\}_{i=1}^n$, $\beta_n$, $k_{zn}$, and $K_z$ be as in Section 2.2. Define the Nadaraya-Watson type estimators $F_{1n}(t \mid z) = A_{1n}(t, z)/\zeta_n(z)$ and $\bar{F}_{2n}(t \mid z) = \bar{A}_{2n}(t, z)/\zeta_n(z)$, where
$$A_{1n}(t, z) = \frac{1}{n} \sum_{i=1}^n K_{ni}(z)\, 1(Y_i \le t)\, M_i \tag{17}$$
and
$$\bar{A}_{2n}(t, z) = \frac{1}{n} \sum_{i=1}^n K_{ni}(z)\, 1(Y_i \ge t). \tag{18}$$
Given $\zeta_n$, $A_{1n}$, and $\bar{A}_{2n}$, estimate $H$ by replacing $F_1$ and $\bar{F}_2$ in (15) by $F_{1n}$ and $\bar{F}_{2n}$. Since $F_{1n}(\cdot \mid z)$ is a right-continuous step function and the jump points occur at the uncensored $Y_i$s, the expression simplifies as follows (assuming no ties in the last line):
$$H_n(t \mid z) = \int_0^t \frac{F_{1n}(dy \mid z)}{\bar{F}_{2n}(y \mid z)} = \sum_{i=1}^n \frac{1(Y_i \le t)\, n^{-1} K_{ni}(z)\, M_i}{\bar{A}_{2n}(Y_i, z)}. \tag{19}$$
Note that in the unconditional case with no $Z$-variable, $H_n$ is the well-known Nelson-Aalen estimator. Now replace $H$ in (16) by $H_n$. Since $H_n(\cdot \mid z)$ is also a right-continuous step function with jump points at the uncensored $Y_i$s, the continuous component $H_n^c(\cdot \mid z)$ is 0, so the expression simplifies to (again assuming no ties)
$$F_n^{II}(t \mid z) = 1 - \prod_{i=1}^n \bigl(1 - [H_n(Y_i \mid z) - H_n(Y_i- \mid z)]\bigr)^{1(Y_i \le t)} = 1 - \prod_{i=1}^n \left(1 - \frac{1(Y_i \le t)\, n^{-1} K_{ni}(z)\, M_i}{\bar{A}_{2n}(Y_i, z)}\right). \tag{20}$$
As in the case of $F_n^{I}$, the function $F_n^{II}(\cdot \mid z)$ is a right-continuous step-function with jump points at discontinuity points of $A_{1n}$, namely, at the uncensored $Y_i$s, and $F_n^{II}(t \mid \cdot)$ is a continuous and differentiable function if $K_z$ is. In the unconditional case, $F_n^{II}$ is simply the Kaplan-Meier estimator. In case of uncensored or type I censored data, $F_n^{II}$ is identical to $F_n^{I}$, and therefore the case I and case II estimators of $\Lambda$ and $\Psi$ coincide.
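The conditional product-limit estimator in Eq. (20) can be sketched as below. The sketch assumes no ties among the $Y_i$ and, purely to keep all weights nonnegative in the illustration, uses a second-order biweight kernel by default; the asymptotic theory in the paper calls for a fourth-order kernel. The function name `cdf_case2` and the simulated design are hypothetical.

```python
import numpy as np

def biweight(v):
    """Nonnegative second-order kernel on [-1, 1] (illustrative default)."""
    v = np.asarray(v, dtype=float)
    return np.where(np.abs(v) <= 1, 15.0 / 16.0 * (1 - v**2) ** 2, 0.0)

def cdf_case2(t, z, index, y, m, bandwidth, kernel=biweight):
    """Conditional product-limit estimate F_n^II(t | z), Eq. (20), no ties."""
    n = len(y)
    w = kernel((index - z) / bandwidth) / (bandwidth * n)    # n^{-1} K_ni(z)
    order = np.argsort(y)
    w_s, y_s, m_s = w[order], y[order], m[order]
    # A2bar_n(Y_i, z): kernel mass of observations with Y_j >= Y_i.
    at_risk = np.cumsum(w_s[::-1])[::-1]
    ratio = np.divide(w_s, at_risk, out=np.zeros_like(w_s), where=at_risk != 0)
    jumps = np.where((y_s <= t) & (m_s == 1), ratio, 0.0)     # jumps of H_n, Eq. (19)
    return float(1.0 - np.prod(1.0 - jumps))

# Toy censored durations: Lambda(t) = ln(t), beta = 1, U ~ N(0, 1), C ~ N(1.5, 1).
rng = np.random.default_rng(2)
n = 3000
x = rng.standard_normal(n)
latent = x + rng.standard_normal(n)
c = 1.5 + rng.standard_normal(n)
y = np.exp(np.minimum(latent, c))
m = (latent <= c).astype(int)
f2_hat = cdf_case2(1.0, 0.0, x, y, m, 0.5)   # true value F(1 | 0) = 0.5
```

Setting every `m` to 1 makes the product telescope, so the function then returns the Nadaraya-Watson ratio of Eq. (12), in line with the remark that the two estimators coincide without random censoring.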
3 ASYMPTOTIC DISTRIBUTIONS
Under conditions given in the assumptions below, the new estimators are $n^{1/2}$-consistent and
asymptotically normally distributed. The precise results are stated in Theorems 1 and 2
below. The assumptions and results are similar for cases I and II, and it is convenient to
discuss them together.
Let $P$ denote the distribution of $(X', Y, M)'$, and let $P_n$ denote the empirical measure formed from the $n$ independent observations on $P$, that is, $P_n$ puts probability $1/n$ on each of the observations. Let $X_1$ and $\tilde{X}$ denote the first and the $(r - 1)$ last components of $X$. Similarly, let $\beta_1$ and $\beta_{-1}$ denote the first and the $(r - 1)$ last components of $\beta$ and let $\beta_{n1}$ be the
first component of $\beta_n$ and $\beta_{n,-1}$ the vector of remaining components. (Throughout, $\beta_1$ denotes the first component of $\beta$, not the first vector in the sequence $\{\beta_n\}$.)
The first assumption ensures identification.
ASSUMPTION 1 The real function $\Lambda$ is defined on a possibly unbounded interval $I_\Lambda$ and is strictly increasing. The random vector $(X', Y, U, C)'$ satisfies $\Lambda(Y) = \min[\beta'X + U, C]$, and $M = 1$ if $\beta'X + U \le C$ and $M = 0$ otherwise. Furthermore, $U$ and $X$ are independent, and $C$ and $U$ are conditionally independent given $X$. The distribution function $\Psi$ of $U$ is right-continuous, $X_1$ is absolutely continuous with respect to Lebesgue measure conditional on $\tilde{X}$, and $|\beta_1| = 1$.
In case II, it is also assumed that $Y \ge 0$.
When $X_1$ is absolutely continuous conditional on $\tilde{X}$, so is $Z$. Let $\xi(\cdot \mid \tilde{x})$ denote the conditional density of $Z$ given $\tilde{X} = \tilde{x}$.
The second assumption characterizes the sample.
ASSUMPTION 2 The sequence $\{(X_i', Y_i, M_i)'\}_{i=1}^n$ is a random sample from $P$.
The next assumption concerns the estimator of $\beta$. All the estimators of $\beta$ mentioned in the introduction satisfy this assumption. If $\beta$ is known, the assumption is not needed.
ASSUMPTION 3 $\beta_{n1} = \beta_1$. There is a function $\Omega: \mathbb{R}^{r+2} \to \mathbb{R}^{r-1}$ such that $P\Omega = 0$, the components of $P\Omega\Omega'$ are finite, and $n^{1/2}(\beta_{n,-1} - \beta_{-1}) = n^{1/2} P_n \Omega + o_p(1)$ as $n \to \infty$. The random vector $\tilde{X}$ is bounded with probability one.
Let $\Theta(\cdot \mid x)$ denote the conditional distribution function of $C$ given $X = x$, and let $\bar{c}$ be the largest number such that $\Theta(c \mid x)$ is continuous for all $c < \bar{c}$. Under type I censorship, $\Theta(c \mid x) = 0$ for $c < \bar{c}$ and $\Theta(c \mid x) = 1$ for $c \ge \bar{c}$. For uncensored data and data where $C$ is absolutely continuous, $\bar{c} = \infty$.
The derivation of the limiting distributions depends on applications of the mean value the-
orem and Taylor expansions. Hence, the underlying functions must be sufficiently smooth.
The requirements are listed as Assumption 4. For simplicity of exposition, Assumption 4
requires that most derivatives exist everywhere on the domain of the original functions.
The results of Theorems 1 and 2 may still hold even if a function is not differentiable everywhere, provided $T_\Lambda$, $T_\Psi$, $W_\Lambda$, and $W_\Psi$ are chosen to avoid "edge-effects" in the kernel smoothing. That is, if the kernel estimates involve smoothing over $Z_i = \beta'X_i$ near $z$ then $F(y \mid \cdot)$, $F_1(y \mid \cdot)$ and $\bar{F}_2(y \mid \cdot)$ must be smooth on $[z - k_{zn}, z + k_{zn}]$ for all $n$ large.
Here and throughout let $D_i^j f$ denote the $j$th order partial derivative with respect to the $i$th argument of $f$ (counting each component of vector arguments separately), and let $Df$, $D_i f$, and $D^j f$ be short for $D_1^1 f$, $D_i^1 f$, and $D_1^j f$, respectively.
ASSUMPTION 4 There is an integer $k_z \ge 4$ such that:
1. The derivatives $D\Lambda$ and $D^2\Lambda$ exist and are bounded and continuous on all compact subsets of $I_\Lambda$.
2. The derivatives $D\Psi, \ldots, D^{k_z+2}\Psi$ exist and are bounded and continuous on $\mathbb{R}$.
3. The density $\zeta$ is bounded, and the derivative $D\zeta$ exists and is bounded and continuous on $\mathbb{R}$.
4. The derivatives $D_1\xi(\cdot \mid \tilde{X}), \ldots, D_1^{k_z+1}\xi(\cdot \mid \tilde{X})$ exist and are bounded and continuous on $\mathbb{R}$ for almost all $\tilde{X}$.
5. The derivatives $D_2\Theta, \ldots, D_2^{k_z+1}\Theta$ and $D_2 D_1\Theta, \ldots, D_2^{k_z+1} D_1\Theta$ exist and are bounded and continuous on $\{(c, x) \in \mathbb{R}^{1+r}: c < \bar{c}\}$.
It can be shown that the joint subdensity of $(Y, Z, M)$ conditional on $\tilde{X}$ is
$$p(y, z, m \mid \tilde{x}) = \begin{cases} D_1\Theta(\Lambda(y) \mid x(z, \tilde{x}))\, D\Lambda(y)\, [1 - \Psi(\Lambda(y) - z)]\, \xi(z \mid \tilde{x}) & \text{if } m = 0, \\ [1 - \Theta(\Lambda(y) \mid x(z, \tilde{x}))]\, D\Psi(\Lambda(y) - z)\, D\Lambda(y)\, \xi(z \mid \tilde{x}) & \text{if } m = 1, \end{cases} \tag{21}$$
where $x(z, \tilde{x})$ denotes the vector $((z - \tilde{\beta}'\tilde{x})/\beta_1, \tilde{x}')'$. Assumption 4 ensures among other things that $p$ has $k_z + 1$ bounded and continuous derivatives with respect to $z$. This is used to bound remainder terms in the asymptotic expansions of $n^{1/2}(\Lambda_n - \Lambda)$ and $n^{1/2}(\Psi_n - \Psi)$.
As mentioned earlier, $C$ is allowed to depend on $X$. However, to ensure that $p(y, z, m \mid \tilde{x})$ is a smooth function of $z$, the relationship between $C$ and $X_1$ must be smooth. The precise condition is stated in Assumption 4.5. If $C$ does not depend on $X_1$, then 4.5 is automatically satisfied, and of course 4.5 is redundant for uncensored data since $\Theta = 0$.
A researcher who wishes to use the new estimators must choose a bandwidth, a kernel
function and two weight functions. To establish consistency and asymptotic normality, it
is necessary to restrict the choices. Sufficient conditions are given in the remaining
assumptions.
ASSUMPTION 5 The bandwidth $k_{zn}$ is a sequence of positive real numbers converging to 0 such that $n^{1/2} k_{zn}^{k_z} \to 0$ and $n^{-1} k_{zn}^{-6} \to 0$, where $k_z$ is determined in Assumption 4.
Assumption 5 is satisfied, for example, if $k_z = 4$ and $k_{zn} \propto n^{-1/7}$.
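The example rate can be checked by direct substitution. The display below is a worked verification of the two conditions in Assumption 5 for $k_z = 4$ and $k_{zn} = c\,n^{-1/7}$, where the constant $c > 0$ is introduced here only for the calculation.

```latex
\begin{aligned}
n^{1/2} k_{zn}^{4} &= c^{4}\, n^{1/2 - 4/7} = c^{4}\, n^{-1/14} \to 0, \\
n^{-1} k_{zn}^{-6} &= c^{-6}\, n^{-1 + 6/7} = c^{-6}\, n^{-1/7} \to 0 .
\end{aligned}
```

The first condition controls the smoothing bias and the second the variance of the derivative of the kernel estimate, so the admissible bandwidths lie between the rates $n^{-1/6}$ and $n^{-1/8}$ when $k_z = 4$.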
ASSUMPTION 6 The kernel $K_z$ is a bounded, integrable real function on $\mathbb{R}$ which vanishes outside $[-1, 1]$, $\int_{-1}^{1} K_z(z)\, dz = 1$ and $\int_{-1}^{1} K_z(z)\, z^j\, dz = 0$ for $j = 1, \ldots, k_z - 1$, where $k_z$ is determined in Assumption 4. Furthermore, $K_z$ has a derivative $DK_z$ which satisfies the Lipschitz condition $|DK_z(z) - DK_z(z^*)| \le c_K |z - z^*|$ for some constant $c_K$ and all $z, z^* \in \mathbb{R}$.
Assumption 7 concerns the estimation set $T_\Lambda$ and the weight function $W_\Lambda$. Note that the normalization constant $t_0$ must be an element of $T_\Lambda$, whereas $\Lambda_0$ can be chosen freely. Choosing $\Lambda(t_0) = \Lambda_0$ is equivalent to setting the median of $U$ equal to $\Lambda_0 - G(t_0, 1/2)$. Recall that $I_Y$ denotes the support of $Y$.
ASSUMPTION 7
1. $T_\Lambda$ is a bounded interval in $I_Y$ and $t_0 \in T_\Lambda$.
2. $W_\Lambda$ is a real bounded function on $\mathbb{R}^2$, and $W_\Lambda(\cdot, t)$ integrates to 1 for all $t \in T_\Lambda$.
3. If $W_\Lambda(q, t) \ne 0$ then $\zeta(G(t, q)) > c_z$ and $\zeta(G(t_0, q)) > c_z$ for some $c_z > 0$.
4. $W_\Lambda(\cdot, t)$ vanishes outside $[q_0, q_1]$ with $0 < q_0 < q_1 < 1$ for all $t \in T_\Lambda$.
5. $W_\Lambda$ satisfies the Lipschitz conditions $|W_\Lambda(q, t) - W_\Lambda(q^*, t)| \le c_W |q - q^*|$ and $|W_\Lambda(q, t) - W_\Lambda(q, t^*)| \le c_W |t - t^*|$ for all $q, q^* \in [0, 1]$ and all $t, t^* \in T_\Lambda$, where $c_W$ is some constant.
Assumption 8 concerns the estimation set $T_\Psi$ and the weight function $W_\Psi$.
ASSUMPTION 8
1. $T_\Psi$ is a bounded interval in $\mathbb{R}$.
2. $W_\Psi$ is a real bounded function on $\mathbb{R}^2$, and $W_\Psi(\cdot, u)$ integrates to 1 for all $u \in T_\Psi$.
3. There is a constant $c_z > 0$ such that if $u \in T_\Psi$ and $W_\Psi(y, u) \ne 0$, then $y \in T_\Lambda$ and $\zeta(\Lambda(y) - u) > c_z$.
4. $W_\Psi$ satisfies the Lipschitz conditions $|W_\Psi(y, u) - W_\Psi(y^*, u)| \le c_W |y - y^*|$ and $|W_\Psi(y, u) - W_\Psi(y, u^*)| \le c_W |u - u^*|$ for all $y, y^* \in T_\Lambda$ and all $u, u^* \in T_\Psi$, where $c_W$ is some constant.
Additional notation is required before the theorem can be stated. Define
$$\phi^{I}(t, z, y, m) = \frac{1(y \le t) - F(t \mid z)}{\zeta(z)} \tag{22}$$
and
$$\phi^{II}(t, z, y, m) = \frac{1 - F(t \mid z)}{\zeta(z)} \left( \frac{1(y \le t)\, m}{\bar{F}_2(y \mid z)} - \int_0^t \frac{1(y \ge v)\, D_1 F_1(v \mid z)}{\bar{F}_2(v \mid z)^2}\, dv \right). \tag{23}$$
If the data are uncensored or type I censored and nonnegative, then $\phi^{I} = \phi^{II}$.
Let $\chi_\Lambda$ be the set of all bounded, real functions on $T_\Lambda$ and equip $\chi_\Lambda$ with the metric generated by the uniform norm and the $\sigma$-algebra generated by closed balls. Let $\Lambda_n^{I}$ and $\Lambda_n^{II}$ denote the new estimators for case I and case II.
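THEOREM 1 For $j = I, II$ define
$$\begin{aligned} \Phi_\Lambda^{j}(t, x, y, m) = {} & W_\Lambda(F(t \mid \beta'x), t)\, \phi^{j}(t, \beta'x, y, m) - W_\Lambda(F(t_0 \mid \beta'x), t)\, \phi^{j}(t_0, \beta'x, y, m) \\ & - \Omega(x, y, m)' \int_0^1 W_\Lambda(q, t) \bigl( E(\tilde{X} \mid Z = G(t, q)) - E(\tilde{X} \mid Z = G(t_0, q)) \bigr)\, dq. \end{aligned} \tag{24}$$
Under Assumptions 1–7,
(a) $\Lambda_n^{j}$ can be uniformly approximated by the empirical process $P_n \Phi_\Lambda^{j}$, i.e.,
$$\sup_{t \in T_\Lambda} \bigl| \Lambda_n^{j}(t) - \Lambda(t) - P_n \Phi_\Lambda^{j}(t) \bigr| = o_p(n^{-1/2}), \tag{25}$$
and $P\Phi_\Lambda^{j}(t) = 0$ for all $t \in T_\Lambda$.
(b) The sequence $\{\sup_{T_\Lambda} |\Lambda_n^{j} - \Lambda|\}$ converges to 0 in probability.
(c) The sequence $\{n^{1/2}(\Lambda_n^{j} - \Lambda)\}$ of random elements of $\chi_\Lambda$ converges in distribution to a Gaussian stochastic process on $T_\Lambda$. The mean of the limiting process is zero and the covariance function is $P\Phi_\Lambda^{j}(t)\Phi_\Lambda^{j}(t^*)$ for $t, t^* \in T_\Lambda$.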
Given the first conclusion of the theorem, the second and the third follow from standard
theorems on convergence of empirical processes. Specifically, Theorem 1(b) follows from
theorem II.24 of Pollard (1984) and Theorem 1(c) from lemma 2.16 of Pakes and Pollard
(1989) and theorem VII.21 of Pollard (1984). The proof of Theorem 1(a) employs empirical
process and U-process theory and can be found in the Appendix.
A similar theorem holds for $\Psi_n$. Let $\chi_\Psi$ be the set of all bounded, real functions on $T_\Psi$ and equip $\chi_\Psi$ with the metric generated by the uniform norm and the $\sigma$-algebra generated by closed balls. Let $\Psi_n^{I}$ and $\Psi_n^{II}$ be the new estimators for case I and case II.
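THEOREM 2 For $j = I, II$ define
$$\begin{aligned} \Phi_\Psi^{j}(t, x, y, m) = {} & W_\Psi(\Lambda^{-1}(\beta'x + t), t)\, \phi^{j}(\Lambda^{-1}(\beta'x + t), \beta'x, y, m)\, D\Lambda^{-1}(\beta'x + t) \\ & - \Omega(x, y, m)' \int_{I_\Lambda} W_\Psi(v, t)\, D_2 F(v \mid \Lambda(v) - t)\, E(\tilde{X} \mid Z = \Lambda(v) - t)\, dv \\ & + \int_{I_\Lambda} W_\Psi(v, t)\, D_2 F(v \mid \Lambda(v) - t)\, \Phi_\Lambda^{j}(v, x, y, m)\, dv. \end{aligned} \tag{26}$$
Under Assumptions 1–8,
(a) $\Psi_n^{j}$ can be uniformly approximated by the empirical process $P_n \Phi_\Psi^{j}$, i.e.,
$$\sup_{t \in T_\Psi} \bigl| \Psi_n^{j}(t) - \Psi(t) - P_n \Phi_\Psi^{j}(t) \bigr| = o_p(n^{-1/2}), \tag{27}$$
and $P\Phi_\Psi^{j}(t) = 0$ for all $t \in T_\Psi$.
(b) The sequence $\{\sup_{T_\Psi} |\Psi_n^{j} - \Psi|\}$ converges to 0 in probability.
(c) The sequence $\{n^{1/2}(\Psi_n^{j} - \Psi)\}$ of random elements of $\chi_\Psi$ converges in distribution to a Gaussian stochastic process on $T_\Psi$. The mean of the limiting process is zero and the covariance function is $P\Phi_\Psi^{j}(t)\Phi_\Psi^{j}(t^*)$ for $t, t^* \in T_\Psi$.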
The proof of Theorem 2(a) can be found in the Appendix. As for Theorem 1, Theorems 2(b)
and 2(c) follow from Theorem 2(a) and the work of Pollard (1984) and Pakes and Pollard
(1989).
Horowitz (1996) and Gørgens and Horowitz (1999) proved similar theorems under the same kind of assumptions, but since their estimators are based on derivatives of $F$, stronger smoothness assumptions are required. For example, Horowitz assumed existence and boundedness of the third order derivative of $\Lambda$, the seventh order derivatives of $\zeta$ and $\xi$, and the ninth order derivative of $\Psi$. In addition, his weight function for estimating $\Lambda$ must be seven times differentiable.
4 MONTE CARLO RESULTS
In this section the new estimators are compared to previous semiparametric estimators in a set of Monte Carlo experiments using the same designs as Horowitz (1996) for uncensored data and Gørgens and Horowitz (1999) for data with 20% random censoring. The estimators are evaluated by their ability to predict $Y^*$ given $X = x$. As pointed out by Horowitz, the usual predictor, the conditional expectation of $Y^*$ given $X = x$, is not available because $\Lambda$ and $\Psi$ are not estimated on their entire domains. Instead Horowitz suggested using the median, or some other quantile, of the conditional distribution of $Y^*$ given $X = x$. The conditional median of $Y^*$ given $X = x$ is
$$Q(x) = \Lambda^{-1}\!\left(\beta'x + \Psi^{-1}\!\left(\tfrac{1}{2}\right)\right). \tag{28}$$
Horowitz's estimator consists of replacing the unknown $\beta$, $\Lambda$, and $\Psi$ by their estimators, and he proved (under regularity conditions) that the resulting estimator is uniformly consistent and asymptotically normal for $x \in T_Q$, where $T_Q$ is a suitable bounded subset of $\mathbb{R}^r$. His theorem continues to hold if his estimators of $\Lambda$ and $\Psi$ are replaced with the estimators developed in this paper.
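As a small worked instance of Eq. (28), the sketch below evaluates the median predictor with known population objects from the Log design of Table I, where $\Lambda(t) = \ln(t) + \mu_U$, $\Psi(u) = \Phi(u - \mu_U)$ and $\mu_U = 2$. The function name `conditional_median` is hypothetical, and in practice the estimated $\beta_n$, $\Lambda_n^{-1}$ and $\Psi_n^{-1}$ would be plugged in instead.

```python
import math
from statistics import NormalDist

def conditional_median(x, beta, lam_inv, psi_inv):
    """Q(x) = Lambda^{-1}(beta * x + Psi^{-1}(1/2)), Eq. (28), for scalar x."""
    return lam_inv(beta * x + psi_inv(0.5))

mu_u = 2.0
lam_inv = lambda s: math.exp(s - mu_u)              # inverse of ln(t) + mu_u
psi_inv = NormalDist(mu=mu_u, sigma=1.0).inv_cdf    # inverse of Phi(u - mu_u)
q_half = conditional_median(0.5, 1.0, lam_inv, psi_inv)
```

In this design the location shift $\mu_U$ cancels, so the conditional median of $Y^*$ given $X = x$ is $e^{x}$.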
Details of the designs are reported in the first part of Table I. The results for the previous
estimators in the second part of Table I are quoted directly from the respective papers. Please
refer to the original sources for details concerning the specification of two weight functions,
two kernels, and two bandwidths. In all experiments the choices of weight functions and kernels were ad hoc, whereas the bandwidths were chosen to minimize the mean integrated squared error (MISE) of $Q_n(x)$ for $x \in T_Q$.
For the new estimators WL, WC, Kz, and kzn are as follows. The support of WL( � , t) is the
interval [at, bt] where at ¼ max (C(L(t) � 3), C(L(t0) � 3)) and bt ¼ min (C(L(t) � 3),
C(L(t0) � 3)). This ensures that WL(F(tjz), t) and WL(F(t0jz), t) are both zero for z outside
[�3, 3]. The constant t0 is 0 in the Linear and the Sinh experiments, 1 in the Log and the
Weibull experiments, and 5 in the U-Shaped Hazard experiment. Given at and bt,
WL( � , t) is defined by
WL(q, t) ¼ K2q� (at þ bt)
bt � at
� �2
bt � at, (29)
where K is a uniform kernel on [ � 1, 1] in the Linear, the Log, and Sinh experiments, and K
is the second-order kernel
K(v) ¼15
16(1 � v2)21( � 1 � v � 1) (30)
in the Weibull and the U-Shaped Hazard experiments. Similarly, the support of $W_\Psi(\cdot, t)$ is
$[a_t, b_t]$, where $a_t = \max(\Lambda^{-1}(-3 + t + \Lambda(t_0) - \Lambda_0), \inf T_\Lambda)$ and $b_t = \min(\Lambda^{-1}(3 + t + \Lambda(t_0) - \Lambda_0), \sup T_\Lambda)$,
and on its support $W_\Psi(\cdot, t)$ is defined in the same manner as
$W_\Lambda(\cdot, t)$. The kernel function is a fourth-order kernel taken from Muller (1984),
$$K_\zeta(z) = \frac{105}{64}(1 - 5z^2 + 7z^4 - 3z^6) \, 1(|z| \le 1). \tag{31}$$
Finally, the bandwidth is 1.5, since this value approximately minimizes the MISE of $Q_n(x)$ for
$x \in T_Q$.
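The kernel orders quoted above can be verified by exact integration of the polynomials in Eqs. (30) and (31). The sketch below is only a check under the stated formulas; the helper names `even_poly_moment` and `weight` are introduced here for illustration, and `weight` implements the rescaling in Eq. (29) for an arbitrary kernel on $[-1, 1]$.

```python
from fractions import Fraction


def even_poly_moment(coeffs, scale, power):
    """Exactly integrate scale * v**power * sum_k coeffs[k] * v**(2k) over [-1, 1]."""
    total = Fraction(0)
    for k, c in enumerate(coeffs):
        degree = 2 * k + power
        if degree % 2 == 0:  # odd powers integrate to zero on [-1, 1]
            total += Fraction(c) * Fraction(2, degree + 1)
    return scale * total


# Eq. (30): (15/16)(1 - v^2)^2 = (15/16)(1 - 2 v^2 + v^4)
biweight = ([1, -2, 1], Fraction(15, 16))
# Eq. (31): (105/64)(1 - 5 z^2 + 7 z^4 - 3 z^6)
fourth_order = ([1, -5, 7, -3], Fraction(105, 64))

biweight_moments = [even_poly_moment(*biweight, p) for p in (0, 1, 2)]
fourth_order_moments = [even_poly_moment(*fourth_order, p) for p in (0, 1, 2, 3, 4)]


def uniform_kernel(v):
    """Uniform kernel on [-1, 1]."""
    return Fraction(1, 2) if -1 <= v <= 1 else Fraction(0)


def weight(q, a, b, kernel):
    """Eq. (29): rescale a kernel on [-1, 1] to a weight function on [a, b]."""
    return kernel((2 * q - (a + b)) / (b - a)) * 2 / (b - a)


uniform_weight = weight(Fraction(3, 10), Fraction(1, 5), Fraction(3, 5), uniform_kernel)
```

Both kernels integrate to one. The kernel in Eq. (30) has a nonzero second moment (second order), whereas the kernel in Eq. (31) has vanishing first, second, and third moments and a nonzero fourth moment (fourth order). With the uniform kernel, the weight function is the uniform density on $[a_t, b_t]$.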
TABLE I  Comparison of Previous and New Estimators.

                Experiment (Horowitz (1996))                                        Experiment (Gørgens and Horowitz (1999))
                Linear              Log                 Sinh                        Weibull             U-shaped Hazard
Design
  $\Lambda(t)$  $t + \mu_U$         $\ln(t) + \mu_U$    $(1/13)\sinh(2t) + \mu_U$   $2\ln(t)$           $\ln(0.6(t/5)^{13} + 0.4(t/5)^{6})$
  $\Psi(t)$     $\Phi(t - \mu_U)$   $\Phi(t - \mu_U)$   $\Phi(t - \mu_U)$           $1/(1 + e^{-t})$    $1/(1 + e^{-t})$
  $\mu_U$       2                   2                   2.0992                      0                   0
  Censoring     0%                  0%                  0%                          20%                 20%
  $T_\Lambda$   $[-2, 2]$           $[e^{-2}, e^{2}]$   $[-2, 2]$                   $[0.2, 2.4]$        $[0.2, 8.0]$
  $T_\Psi$      $[0, 4]$            $[0, 4]$            $[0, 4]$                    $[-3, 1]$           $[-2, 1.5]$
  $T_Q$         $[-2, 2]$           $[-2, 2]$           $[-2, 2]$                   $[-3, 1.5]$         $[-3, 2.5]$
  Sample size   100                 100                 100                         400                 400
  Samples       100                 100                 100                         500                 500
MISE of $Q_n$ on $T_Q$
  Previous      0.065               0.517               0.131                       0.012               0.233
  New           0.054               0.369               0.077                       0.008               0.169

Note: $\Phi$ denotes the standard normal distribution function. In all experiments, $X \sim N(0, 1)$, $U \sim N(\mu_U, 1)$, $C \sim N(\mu_C, 1)$, where $\mu_C$ is chosen to achieve the desired probability of censoring, and $\beta$ is assumed known and equal to 1. MISE: mean integrated squared error of estimates of the conditional median of $Y^*$ given $X = x$ over $x \in T_Q$, divided by the length of $T_Q$.

The second part of Table I shows that the new estimators perform better than the previous estimators,
since the MISEs of the conditional median estimate are lower for the new estimators in all
experiments. Of course, this does not prove that the new estimators will generate
predictions with lower MISE in other settings, nor does it imply that they perform better
than the previous estimators if other performance measures are used. However, it does make the new
estimators an interesting and promising alternative to previous estimators.
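The MISE criterion described in the note to Table I can be computed from simulated estimates along the following lines. This is a sketch only: the grid, the trapezoid rule, and the function name `mise` are assumptions of this example and are not taken from the original experiments.

```python
def mise(estimates, truth, grid):
    """Mean over samples of the integrated squared error of Q_n on the grid,
    divided by the length of the interval covered by the grid."""
    length = grid[-1] - grid[0]
    total = 0.0
    for est in estimates:
        sq_err = [(e - q) ** 2 for e, q in zip(est, truth)]
        # Trapezoid rule for the integral of the squared error over the grid.
        integral = sum(
            (grid[i + 1] - grid[i]) * (sq_err[i] + sq_err[i + 1]) / 2.0
            for i in range(len(grid) - 1)
        )
        total += integral / length
    return total / len(estimates)


# Toy check on T_Q = [-2, 2] with true median Q(x) = x (Linear design, beta = 1).
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
truth = grid[:]
estimates = [
    [q + 0.1 for q in truth],  # sample 1: constant error of +0.1
    [q - 0.3 for q in truth],  # sample 2: constant error of -0.3
]
toy_mise = mise(estimates, truth, grid)
```

With constant errors of 0.1 and 0.3, the normalized integrated squared errors are 0.01 and 0.09, so the toy MISE is their average, 0.05.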
5 CONCLUSION
New semiparametric estimators have been proposed for estimating $\Lambda$ and $\Psi$ in a regression
model with an unknown transformation of the dependent variable. Estimators have been
proposed for both uncensored and right-censored data. The new estimators require only one
kernel function and one bandwidth and are therefore simpler to implement than the
semiparametric estimators proposed by Horowitz (1996) and by Gørgens and Horowitz
(1999), which require two of each. In addition, the new estimators perform better than the
previous semiparametric estimators in a small set of Monte Carlo experiments, where the
mean integrated squared prediction errors were lower for the new estimators in all five
designs. Further research is needed to determine in which situations the new estimators
are preferable to previous estimators and vice versa.
Acknowledgements
I thank Catherine de Fontenay and David Prentice for helpful comments on an earlier version
of the paper.
References
Ai, C. (1997). A semiparametric maximum likelihood estimator. Econometrica, 65(4), 933–963.
Breslow, N. and Crowley, J. (1974). A large sample study of the life table and product limit estimates under random censorship. Annals of Statistics, 2(3), 437–453.
Gill, R. D. and Johansen, S. (1990). A survey of product-integration with a view towards application in survival analysis. Annals of Statistics, 18(4), 1501–1555.
Gørgens, T. (1999). Semiparametric estimation of single-index transition intensities. Discussion paper 99/25, Institute of Economics, University of Copenhagen.
Gørgens, T. and Horowitz, J. (1999). Semiparametric estimation of a censored regression model with an unknown transformation of the dependent variable. Journal of Econometrics, 90(2), 155–191.
Han, A. K. (1987). Non-parametric analysis of a generalized regression model. Journal of Econometrics, 35, 303–316.
Hardle, W., Janssen, P. and Serfling, R. (1988). Strong uniform consistency rates for estimators of conditional functionals. Annals of Statistics, 16(4), 1428–1449.
Hardle, W. and Stoker, T. M. (1989). Investigating smooth multiple regression by the method of average derivatives. Journal of the American Statistical Association, 84, 986–995.
Heckman, J. J. and Singer, B. (1984). A method for minimizing the impact of distributional assumptions in econometric models for duration data. Econometrica, 52, 271–320.
Horowitz, J. L. (1996). Semiparametric estimation of a regression model with an unknown transformation of the dependent variable. Econometrica, 64(1), 103–137.
Horowitz, J. L. and Hardle, W. (1996). Direct semiparametric estimation of single-index models with discrete covariates. Journal of the American Statistical Association, 91(436), 1632–1640.
Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single index models. Journal of Econometrics, 58, 71–120.
Kalbfleisch, J. D. and Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data. Wiley, New York.
Muller, H.-G. (1984). Smooth optimum kernel estimators of densities, regression curves and modes. Annals of Statistics, 12, 766–774.
Murphy, S. A. (1994). Asymptotic theory for the frailty model. Annals of Statistics, 23, 182–198.
Murphy, S. A. (1995). Consistency in a proportional hazards model incorporating a random effect. Annals of Statistics, 22, 712–731.
Nielsen, G. G., Gill, R. D., Andersen, P. K. and Sørensen, T. I. A. (1992). A counting process approach to maximum likelihood estimation in frailty models. Scandinavian Journal of Statistics, 19, 25–43.
Pakes, A. and Pollard, D. (1989). Simulation and the asymptotics of optimization estimators. Econometrica, 57, 1027–1057.
Pollard, D. (1984). Convergence of Stochastic Processes. Springer-Verlag, New York.
Powell, J. L., Stock, J. H. and Stoker, T. M. (1989). Semiparametric estimation of index coefficients. Econometrica, 57(6), 1403–1430.
Sherman, R. P. (1993). The limiting distribution of the maximum rank correlation estimator. Econometrica, 61(1), 123–137.
Stone, C. J. (1980). Optimal rates of convergence for nonparametric estimators. Annals of Statistics, 8(6), 1348–1360.
APPENDIX: PROOF OF THEOREMS
Theorems 1(a) and 2(a) can be proved using methods similar to those of Horowitz (1996) and Gørgens
and Horowitz (1999). Essentially, the idea is to linearize (in $K_\zeta$) the estimators using the mean
value theorem or a Taylor expansion, and then apply empirical process and U-process theory
to show that the remainder terms vanish sufficiently quickly. This appendix contains outlines
of the proofs of Theorems 1(a) and 2(a), concentrating on linearizing the estimators.
Convergence of the remainder terms follows from the lemmas of Gørgens and Horowitz
(1999) and Gørgens (1999), and references are given where appropriate.
Convergence of $F^{I}_n$, $H_n$, and $F^{II}_n$
In this section let $\{z_n\}$ represent a sequence converging to $z$ at the rate $n^{1/2}$. In the proof of
Theorem 1, $z_n$ simply equals $z$, whereas in the proof of Theorem 2, $z$ and $z_n$ will be replaced by
$\Lambda(v) - t$ and $\Lambda_n(v) - t$.
Define $S_1 = \{z \in \mathbb{R} : \zeta(z) > c_\zeta\}$ and $S_2 = T_\Lambda \times S_1$. Then $S_1$ and $S_2$ are bounded, and
edge-effects in the kernel estimates disappear asymptotically on $S_1$ and $S_2$, because by Assumption
4 $\zeta(z + \varepsilon)$, $F(y \mid z + \varepsilon)$, $F_1(y \mid z + \varepsilon)$, and $\bar{F}_2(y \mid z + \varepsilon)$ are smooth for $n$ large whenever
$z \in S_1$ or $(y, z) \in S_2$ and $\varepsilon < \kappa_{\zeta n}$. By lemma 5 of Gørgens and Horowitz (1999), the assumptions
given in Section 3 imply
$$\sup_{z \in S_1} |\zeta_n(z_n) - \zeta(z)| = o_p(n^{-1/4}) \tag{32}$$
and
$$\sup_{(t,z) \in S_2} |A_{jn}(t, z_n) - A_j(t, z)| = o_p(n^{-1/4}), \quad j = 0, 1, \tag{33}$$
where $A_j(t, z) = F_j(t \mid z)\zeta(z)$. Uniform convergence of $\bar{A}_{2n}$ follows from uniform convergence
of $A_{0n}$.
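As a matter of algebra,
$$F^{I}_n(t \mid z_n) - F(t \mid z) = E^{I}_n(t, z, z_n) + r^{I}_n(t, z, z_n), \tag{34}$$
where
$$E^{I}_n(t, z, z_n) = \frac{1}{n} \sum_{i=1}^{n} \phi^{I}(t, z, Y_i, M_i) K_{ni}(z_n) \tag{35}$$
and
$$r^{I}_n(t, z, z_n) = \frac{F(t \mid z)(\zeta_n(z_n) - \zeta(z))^2}{\zeta(z)\zeta_n(z_n)} - \frac{(A_{0n}(t, z_n) - A_0(t, z))(\zeta_n(z_n) - \zeta(z))}{\zeta(z)\zeta_n(z_n)}. \tag{36}$$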
Therefore,
$$\sup_{(t,z) \in S_2} |F^{I}_n(t \mid z_n) - F(t \mid z)| = o_p(n^{-1/4}) \tag{37}$$
and
$$\sup_{(t,z) \in S_2} |F^{I}_n(t \mid z_n) - F(t \mid z) - E^{I}_n(t, z, z_n)| = o_p(n^{-1/2}) \tag{38}$$
follow immediately from (32) and (33).
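The algebra behind Eqs. (34) and (36) can be checked for scalars. The sketch below assumes that the estimator is a ratio of the form $A_{0n}/\zeta_n$ estimating $F = A_0/\zeta$, and that the leading term is the usual first-order expansion of a ratio; both are assumptions made for this illustration, and the function name `ratio_expansion` is hypothetical. The remainder is coded exactly as in Eq. (36).

```python
from fractions import Fraction


def ratio_expansion(a0, zeta, a0n, zetan):
    """Split a0n/zetan - a0/zeta into a first-order term and the remainder of Eq. (36)."""
    f = a0 / zeta
    difference = a0n / zetan - f
    # First-order (linear) term of the expansion of a ratio around (a0, zeta).
    linear = ((a0n - a0) - f * (zetan - zeta)) / zeta
    # Remainder, written term by term as in Eq. (36).
    remainder = (
        f * (zetan - zeta) ** 2 / (zeta * zetan)
        - (a0n - a0) * (zetan - zeta) / (zeta * zetan)
    )
    return difference, linear, remainder


difference, linear, remainder = ratio_expansion(
    Fraction(3, 10), Fraction(4, 5), Fraction(1, 3), Fraction(9, 10)
)
```

Because the remainder contains only products of two estimation errors, errors of order $o_p(n^{-1/4})$ as in (32) and (33) yield a remainder of order $o_p(n^{-1/2})$.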
Following Breslow and Crowley (1974, p. 447), $H_n - H$ has the expansion
$$H_n(t \mid z_n) - H(t \mid z) = E^{H}_n(t, z, z_n) - r^{H}_{1n}(t, z, z_n) + r^{H}_{2n}(t, z, z_n), \tag{39}$$
where
$$E^{H}_n(t, z, z_n) = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{1(Y_i \le t) M_i}{\bar{A}_2(Y_i, z)} - \int_0^t \frac{1(Y_i \ge v) A_1(dv, z)}{\bar{A}_2(v, z)^2} \right) K_{ni}(z_n), \tag{40}$$
$$r^{H}_{1n}(t, z, z_n) = \int_0^t \frac{(\bar{A}_{2n}(v, z_n) - \bar{A}_2(v, z))(A_{1n}(dv, z_n) - A_1(dv, z))}{\bar{A}_2(v, z)^2} \tag{41}$$
and
$$r^{H}_{2n}(t, z, z_n) = \int_0^t \frac{(\bar{A}_{2n}(v, z_n) - \bar{A}_2(v, z))^2 A_{1n}(dv, z_n)}{\bar{A}_{2n}(v, z_n) \bar{A}_2(v, z)^2}. \tag{42}$$
Under the assumptions given in Section 3, lemma 5 of Gørgens and Horowitz (1999)
implies $\sup_{(t,z) \in S_2} |E^{H}_n(t, z, z_n)| = o_p(n^{-1/4})$, lemma 2 of Gørgens (1999) implies
$\sup_{(t,z) \in S_2} |r^{H}_{1n}(t, z, z_n)| = o_p(n^{-1/2})$, and $\sup_{(t,z) \in S_2} |r^{H}_{2n}(t, z, z_n)| = o_p(n^{-1/2})$ follows from
Eq. (33). It follows that
$$\sup_{(t,z) \in S_2} |H_n(t \mid z_n) - H(t \mid z)| = o_p(n^{-1/4}) \tag{43}$$
and
$$\sup_{(t,z) \in S_2} |H_n(t \mid z_n) - H(t \mid z) - E^{H}_n(t, z, z_n)| = o_p(n^{-1/2}). \tag{44}$$
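The integrands of the remainder terms in Eqs. (41) and (42) have the form obtained from a second-order expansion of a reciprocal, with $\bar{A}_2$ in the role of the denominator. This reading is an interpretation rather than a statement from the text; the following sketch checks only the underlying scalar identity, with the hypothetical name `reciprocal_expansion`.

```python
from fractions import Fraction


def reciprocal_expansion(a, a_n):
    """Return 1/a_n, its first-order expansion around a, and the exact remainder."""
    d = a_n - a
    exact = 1 / a_n
    first_order = 1 / a - d / a**2
    # Exact second-order remainder; compare the denominator in Eq. (42).
    remainder = d**2 / (a_n * a**2)
    return exact, first_order, remainder


exact, first_order, remainder = reciprocal_expansion(Fraction(2, 5), Fraction(1, 2))
```

The remainder is quadratic in the estimation error of the denominator, which is why uniform convergence at rate $o_p(n^{-1/4})$ suffices for a remainder of order $o_p(n^{-1/2})$.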
Assumption 4 implies that $H$ is continuous, so $F = 1 - \exp(-H)$ by Eq. (16). Hence
$$\begin{aligned}
F^{II}_n(t \mid z_n) - F(t \mid z) &= -\exp(\ln(1 - F^{II}_n(t \mid z_n))) + \exp(-H(t \mid z)) \\
&= \exp(-H(t \mid z))(H_n(t \mid z_n) - H(t \mid z)) \\
&\quad - \tfrac{1}{2} \exp(-H^{*}_n(t \mid z_n))(H_n(t \mid z_n) - H(t \mid z))^2 \\
&\quad - \exp(-H^{**}_n(t \mid z_n))(\ln(1 - F^{II}_n(t \mid z_n)) + H_n(t \mid z_n)),
\end{aligned} \tag{45}$$
where $H^{*}_n(t \mid z_n)$ is between $H_n(t \mid z_n)$ and $H(t \mid z)$, and $H^{**}_n(t \mid z_n)$ is between
$-\ln(1 - F^{II}_n(t \mid z_n))$ and $H_n(t \mid z_n)$. An argument similar to the one given by Breslow
and Crowley (1974) for their lemma 1 shows that
$$\sup_{(t,z) \in S_2} |\ln(1 - F^{II}_n(t \mid z_n)) + H_n(t \mid z_n)| = o_p(n^{-1/2}). \tag{46}$$
It then follows from (43), (45), and (46) that
$$\sup_{(t,z) \in S_2} |F^{II}_n(t \mid z_n) - F(t \mid z)| = o_p(n^{-1/4}). \tag{47}$$
Moreover,
$$F^{II}_n(t \mid z_n) - F(t \mid z) = E^{II}_n(t, z, z_n) + r^{II}_n(t, z, z_n), \tag{48}$$
where
$$E^{II}_n(t, z, z_n) = \frac{1}{n} \sum_{i=1}^{n} \phi^{II}(t, z, Y_i, M_i) K_{ni}(z_n) \tag{49}$$
and
$$\begin{aligned}
r^{II}_n(t, z, z_n) &= -\exp(-H(t \mid z))(r^{H}_{1n}(t, z, z_n) - r^{H}_{2n}(t, z, z_n)) \\
&\quad - \tfrac{1}{2} \exp(-H^{*}_n(t \mid z_n))(H_n(t \mid z_n) - H(t \mid z))^2 \\
&\quad - \exp(-H^{**}_n(t \mid z_n))(\ln(1 - F^{II}_n(t \mid z_n)) + H_n(t \mid z_n)).
\end{aligned} \tag{50}$$
Therefore,
$$\sup_{(t,z) \in S_2} |F^{II}_n(t \mid z_n) - F(t \mid z) - E^{II}_n(t, z, z_n)| = o_p(n^{-1/2}) \tag{51}$$
follows from Eqs. (43), (44), and (46).
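The role of Eq. (46) can be illustrated in the simplest unconditional setting, where the product-limit estimator and the cumulative hazard estimator are computed without kernel weights. This is a simplification of the conditional estimators in this appendix, made only for illustration; the function names and the toy data (distinct, uncensored observation times) are assumptions of this sketch.

```python
import math


def cum_hazard_and_product_limit(times, events, t):
    """Return (H_n(t), F_n(t)) for distinct observed times; events[i] = 1 if uncensored."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    cum_hazard, survival = 0.0, 1.0
    for i in order:
        if times[i] > t:
            break
        if events[i]:
            cum_hazard += 1.0 / at_risk        # increment of the cumulative hazard
            survival *= 1.0 - 1.0 / at_risk    # factor of the product-limit estimator
        at_risk -= 1
    return cum_hazard, 1.0 - survival


def gap(n):
    """-ln(1 - F_n(t)) - H_n(t) at the sample median for n uncensored observations."""
    times = list(range(1, n + 1))
    events = [1] * n
    h, f = cum_hazard_and_product_limit(times, events, n // 2)
    return -math.log(1.0 - f) - h


gaps = [gap(n) for n in (10, 100, 1000)]
```

In this toy setting the gap between $-\ln(1 - F_n)$ and $H_n$ is positive and shrinks roughly in proportion to $1/n$, which is faster than $n^{-1/2}$.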
Proof of Theorem 1(a)
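For $j = I, II$ define
$$\phi^{j}_\Lambda(t, z, y, m) = W_\Lambda(F(t \mid z), t)\phi^{j}(t, z, y, m) - W_\Lambda(F(t_0 \mid z), t)\phi^{j}(t_0, z, y, m). \tag{52}$$
In the following the superscript $j$ is suppressed. Under the assumptions given in Section 3,
with a minor modification lemma 10 in appendix A.2 of Gørgens and Horowitz (1999)
implies that
$$\sup_{t \in T_\Lambda} \left| P_n \Phi_\Lambda(t) - \frac{1}{n} \sum_{i=1}^{n} \int_{-\infty}^{\infty} \phi_\Lambda(t, z, Y_i, M_i) K_{ni}(z) \, dz \right| = o_p(n^{-1/2}). \tag{53}$$
To prove Theorem 1(a) it remains to be shown that
$$\sup_{t \in T_\Lambda} \left| \Lambda_n(t) - \Lambda(t) - \frac{1}{n} \sum_{i=1}^{n} \int_{-\infty}^{\infty} \phi_\Lambda(t, z, Y_i, M_i) K_{ni}(z) \, dz \right| = o_p(n^{-1/2}). \tag{54}$$
Define
$$\Lambda_n(t, t^*) = \int_0^1 W_\Lambda(q, t)(G_n(t^*, q) - G(t^*, q)) \, dq, \tag{55}$$
then $\Lambda_n(t) - \Lambda(t) = \Lambda_n(t, t) - \Lambda_n(t, t_0)$. By definition of $\phi_\Lambda$ and $E_n$,
$$\frac{1}{n} \sum_{i=1}^{n} \phi_\Lambda(t, z, Y_i, M_i) K_{ni}(z) = W_\Lambda(F(t \mid z), t) E_n(t, z, z) - W_\Lambda(F(t_0 \mid z), t) E_n(t_0, z, z). \tag{56}$$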
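Therefore, (54) follows if
$$\sup_{(t,t^*) \in T_\Lambda \times T_\Lambda} \left| \Lambda_n(t, t^*) - \int_{-\infty}^{\infty} W_\Lambda(F(t^* \mid z), t) E_n(t^*, z, z) \, dz \right| = o_p(n^{-1/2}). \tag{57}$$
Define $J(q, t) = \int_0^q W_\Lambda(v, t) \, dv$. By integration by parts (using the assumption that $W_\Lambda(\cdot, t)$
vanishes outside $[q_0, q_1]$) and a change of variables,
$$\Lambda_n(t, t^*) = \int_{-\infty}^{\infty} (J(F_n(t^* \mid z), t) - J(F(t^* \mid z), t)) \, dz. \tag{58}$$
(This result is similar to Hardle et al., 1988, p. 1438.) Now the mean value theorem implies
$$\Lambda_n(t, t^*) = \int_{-\infty}^{\infty} W_\Lambda(F(t^* \mid z), t)(F_n(t^* \mid z) - F(t^* \mid z)) \, dz + R_{1n}(t, t^*), \tag{59}$$
where
$$R_{1n}(t, t^*) = \int_{-\infty}^{\infty} (W_\Lambda(F^{*}_n(t^* \mid z), t) - W_\Lambda(F(t^* \mid z), t))(F_n(t^* \mid z) - F(t^* \mid z)) \, dz \tag{60}$$
and $F^{*}_n(t^* \mid z)$ is between $F_n(t^* \mid z)$ and $F(t^* \mid z)$. Since $D_2 W_\Lambda$ is bounded by $c_W$, say, and
$W_\Lambda(\cdot, t)$ has bounded support,
$$\int_{-\infty}^{\infty} |D_1 W_\Lambda(F^{*}_n(t^* \mid z), t)| \, dz \le c_W (G(t, q_1) - G(t, q_0) + 2\varepsilon) \tag{61}$$
for all sufficiently large $n$ and any $\varepsilon > 0$. By Lipschitz continuity of $W_\Lambda(\cdot, t)$ and Eq. (37) or
(47), it therefore follows that $\sup_{(t,t^*) \in T_\Lambda \times T_\Lambda} |R_{1n}(t, t^*)| = o_p(n^{-1/2})$. Substituting from (34) or
(48) gives
$$\Lambda_n(t, t^*) = \int_{-\infty}^{\infty} W_\Lambda(F(t^* \mid z), t) E_n(t^*, z, z) \, dz + R_{1n}(t, t^*) + R_{2n}(t, t^*), \tag{62}$$
where
$$R_{2n}(t, t^*) = \int_{-\infty}^{\infty} W_\Lambda(F(t^* \mid z), t) r_n(t, z, z) \, dz. \tag{63}$$
Since $W_\Lambda$ is bounded, $W_\Lambda(\cdot, t)$ has bounded support, and since $W_\Lambda(F(t^* \mid z), t) \ne 0$ implies
$\zeta(z) > c_\zeta$, it follows that $\sup_{(t,t^*) \in T_\Lambda \times T_\Lambda} |R_{2n}(t, t^*)| = o_p(n^{-1/2})$, and Eq. (57) is proved.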
Proof of Theorem 2(a)
The proof of Theorem 2(a) is simpler than that of Theorem 1(a), because the weight function only
appears once in the formula for $\Psi_n$ and because the dependence of $\Psi_n$ on $\Lambda_n$ is already
taken into account in lemma 10 of Gørgens and Horowitz (1999). For $j = I, II$ define
$$\phi^{j}_\Psi(t, v, y, m) = W_\Psi(v, t)\phi^{j}(v, \Lambda(v) - t, y, m). \tag{64}$$
Suppressing the superscript $j$ indicating estimator I or II, lemma 10 in appendix A.2 of
Gørgens and Horowitz (1999) implies
$$\sup_{t \in T_\Psi} \left| P_n \Phi_\Psi(t) - \frac{1}{n} \sum_{i=1}^{n} \int_{T_\Lambda} \phi_\Psi(t, v, Y_i, M_i) K_{ni}(\Lambda_n(v) - t) \, dv \right| = o_p(n^{-1/2}). \tag{65}$$
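Therefore, it remains to be shown that
$$\sup_{t \in T_\Psi} \left| \Psi_n(t) - \Psi(t) - \frac{1}{n} \sum_{i=1}^{n} \int_{T_\Lambda} \phi_\Psi(t, v, Y_i, M_i) K_{ni}(\Lambda_n(v) - t) \, dv \right| = o_p(n^{-1/2}). \tag{66}$$
By the approximation (34) or (48),
$$\begin{aligned}
\Psi_n(t) - \Psi(t) &= \int_{T_\Lambda} W_\Psi(v, t)(F_n(v \mid \Lambda_n(v) - t) - F(v \mid \Lambda(v) - t)) \, dv \\
&= \frac{1}{n} \sum_{i=1}^{n} \int_{T_\Lambda} \phi_\Psi(t, v, Y_i, M_i) K_{ni}(\Lambda_n(v) - t) \, dv + R_{3n}(t),
\end{aligned} \tag{67}$$
where
$$R_{3n}(t) = \int_{T_\Lambda} W_\Psi(v, t) r_n(v, \Lambda(v) - t, \Lambda_n(v) - t) \, dv. \tag{68}$$
Since the support of $W_\Psi(\cdot, t)$ is bounded, uniform convergence of $r_n$ (Eq. (38) or (51))
implies $\sup_{t \in T_\Psi} |R_{3n}(t)| = o_p(n^{-1/2})$. Eq. (66) and hence Theorem 2(a) follow.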