Journal of Statistical Planning and Inference 107 (2002) 123–131
www.elsevier.com/locate/jspi

Bootstrap adjusted estimators in a restricted setting

Cristina Rueda, José A. Menéndez ∗, Bonifacio Salvador
Departamento de Estadística e I.O., Facultad de Ciencias, Universidad de Valladolid, 47071 Valladolid, Spain
Abstract
In the context of a normal model, where the mean is constrained to a polyhedral convex cone, a new methodology has been developed for estimating a linear combination of the mean components. The method is based on an application of adapted parametric bootstrap procedures to reduce the bias of the maximum likelihood estimator. The proposed method is likely to lead to estimators with low mean squared error. Simulation results which support this argument are included. © 2002 Elsevier Science B.V. All rights reserved.
MSC: 62F30; 62F10; 62G09
Keywords: Bootstrap; Order restrictions; Orthant restrictions; Maximum likelihood estimation; Mean squared error
1. Introduction
We consider a restricted normal model where $X=(X_1,\dots,X_k)' \sim N_k(\mu, I)$ and $\mu$ is the unknown parameter vector constrained to belong to a polyhedral convex cone $C$ in $\mathbb{R}^k$.

Some cones considered in this paper are the simple order cone $C_s=\{\mu\in\mathbb{R}^k : \mu_1\le\cdots\le\mu_k\}$, the simple tree order cone $C_{st}=\{\mu\in\mathbb{R}^k : \mu_1\le\mu_j,\ j=2,\dots,k\}$ and the positive orthant cone $O_k^+=\{\mu\in\mathbb{R}^k : \mu_i\ge 0,\ i=1,\dots,k\}$. These have been widely studied because they appear in applications.
The main problem studied in this paper is the estimation of $d'\mu$ for a fixed vector $d$ in $\mathbb{R}^k$. The maximum likelihood estimator (MLE) of $d'\mu$ is $d'X^*$, where $X^*$ is the MLE of $\mu$. It is well known (see Lee, 1988) that $d'X^*$ does not always perform well since the
∗ Corresponding author. Tel.: +34-83-423000x4169; fax: +34-83-423013. E-mail address: [email protected] (J.A. Menéndez).
inequality $E_\mu(d'X^*-d'\mu)^2 \le E_\mu(d'X-d'\mu)^2$ does not hold for some $\mu\in C$ and $d\in\mathbb{R}^k$. Several authors have dealt with this problem; see, for example, Cohen and Sackrowitz (1970), Lee (1981), Kelly (1989), Fernández et al. (2000). For the special case $C=O_k^+$, Fernández et al. (2000) showed that if the inequality $E_\mu(d'X^*-d'\mu)^2 \le E_\mu(d'X-d'\mu)^2$ holds when $\mu=0$ and $d$ is the central direction of $O_k^+$, then it holds for any $d\in\mathbb{R}^k$ and $\mu\in O_k^+$. A related result is obtained by Fernández et al. (1999) for circular cones.

Therefore, in this paper we will focus on the estimation of $c'\mu$, where $c$ is the central direction of $C$. We will devote a major part of this paper to the simple case, $C=O_k^+$.
An alternative estimator of $c'\mu$ is $Z_0=\max(c'X,0)$, which universally dominates $c'X$ (see Rueda et al., 1997). This leads us to consider the MSE of $Z_0$ as a reference value against which to compare other estimators. Notice, however, that $Z_0$ uses only a part of the information contained in the cone of restrictions.
The aim of the paper is to propose some restricted estimators of $c'\mu$ with low MSE. To define these estimators a new methodology is presented, which is based on an application of parametric bootstrap procedures to reduce the large bias of the MLE.
The usual parametric bootstrap procedures need to be adapted to our restricted setting, principally because the constraints on the parameters usually make the bootstrap inconsistent. Fortunately, despite this inconsistency the bootstrap can work with special modifications (Geyer, 1991; Shaw and Geyer, 1997). Geyer (1995) proposed two methods which adjust the parametric bootstrap successfully.
In this article, we consider one of the methods proposed by Geyer, called the Adjusted Active Set Bootstrap, to estimate the bias, and we also present another procedure based on the application of successive modified bootstraps.
We note that the procedures to be introduced are quite general, and therefore could be used for estimating any $d'\mu$ when $\mu$ belongs to any polyhedral convex cone $C\subset\mathbb{R}^k$ and $d\in -C^p\cap L_S^\perp(C)$, where $L_S^\perp(C)$ is the orthogonal subspace to the linearity of $C$.
The bias and the MSE of $c'X^*$ are given in Section 2 for the positive orthant. The results point out that very large values of the MSE of $c'X^*$ appear when the bias is large, which can occur when $\mu$ is close to the boundary of $C$. The new restricted estimators based on the modified bootstrap are also defined in that section.
We present the results of a simulation study for $C=O_{10}^+$ in Section 3 and for the cones $C_s$ and $C_{st}$, also with $k=10$, in Section 4. The behaviour of the MSE of the bias-reduced estimators is quite similar for the three cones. Compared to the new estimators proposed here, the MLE performed poorly for some parameter values, especially for values of $\mu$ close to the boundary of the constraint cone.
Many other values of $k$ and $c$ were considered in our simulations. The results obtained for these were quite similar to the ones shown in Sections 3 and 4 and are, therefore, omitted.
2. Alternative bootstrap estimators
In this section, we present two different ways of reducing the bias of the MLE, $c'X^*$, of $c'\mu$, for $\mu\in O_k^+$ and $c=(1,\dots,1)'$ the central direction of $O_k^+$. The corresponding
restricted estimators are based on different resampling methods which are modifications of the standard parametric bootstrap methods.
Next we compute the bias and the MSE of $c'X^*$ and $Z_0$. These are helpful for motivating our alternative estimators.
Let $b(\mu,\sigma)=\sigma\varphi(\mu/\sigma)+\mu\,\Phi(\mu/\sigma)-\mu$, and $b(\mu)=b(\mu,1)$, where $\varphi$ and $\Phi$ are the density and the distribution function of an $N(0,1)$ distribution, respectively.
Lemma 2.1. Let $V$ be a random variable with an $N(\mu,\sigma^2)$ distribution and let $V^*=V\,I_{(V>0)}$. Then

(a) $E_{(\mu,\sigma)}(V^*-\mu)=b(\mu,\sigma)$,
(b) $E_{(\mu,\sigma)}(V^*-\mu)^2=\sigma^2\,\Phi(\mu/\sigma)-\mu\,b(\mu,\sigma)$.
Proof.
$$E_{(\mu,\sigma)}(V^*)=\int_0^\infty v\,\frac{1}{\sigma}\,\varphi\!\left(\frac{v-\mu}{\sigma}\right)dv
=\sigma\int_{-\mu/\sigma}^\infty u\,\varphi(u)\,du+\mu\,\Phi\!\left(\frac{\mu}{\sigma}\right)
=\sigma\,\varphi\!\left(\frac{\mu}{\sigma}\right)+\mu\,\Phi\!\left(\frac{\mu}{\sigma}\right).$$

Therefore (a) follows.

$$E_{(\mu,\sigma)}(V^*-\mu)^2=\int_0^\infty (v-\mu)^2\,\frac{1}{\sigma}\,\varphi\!\left(\frac{v-\mu}{\sigma}\right)dv+\mu^2 P_{(\mu,\sigma)}(V\le 0)$$
$$=\sigma^2\int_{-\mu/\sigma}^\infty u^2\,\varphi(u)\,du+\mu^2\,\Phi\!\left(-\frac{\mu}{\sigma}\right)
=\sigma^2\left[-\frac{\mu}{\sigma}\,\varphi\!\left(-\frac{\mu}{\sigma}\right)+\Phi\!\left(\frac{\mu}{\sigma}\right)\right]+\mu^2\,\Phi\!\left(-\frac{\mu}{\sigma}\right),$$

where the second equality follows from the change of variable $u=\sigma^{-1}(v-\mu)$ and the third equality by applying integration by parts.
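Lemma 2.1 is easy to check numerically. The following sketch is our illustration, not code from the paper; the function names are ours. It compares the closed forms in (a) and (b) with Monte Carlo estimates:

```python
import math, random

def phi(x):   # standard normal density
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):   # standard normal distribution function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def b(mu, sigma=1.0):
    """b(mu, sigma) of Lemma 2.1: the bias E(V* - mu)."""
    return sigma * phi(mu / sigma) + mu * Phi(mu / sigma) - mu

random.seed(1)
mu, sigma, n = 0.5, 2.0, 200_000
v = [random.gauss(mu, sigma) for _ in range(n)]
v_star = [max(t, 0.0) for t in v]        # V* = V·1(V > 0)
mc_bias = sum(t - mu for t in v_star) / n
mc_mse = sum((t - mu) ** 2 for t in v_star) / n
print(abs(mc_bias - b(mu, sigma)) < 0.05)                                  # (a)
print(abs(mc_mse - (sigma**2 * Phi(mu / sigma) - mu * b(mu, sigma))) < 0.1)  # (b)
```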
Theorem 2.2. For any $\mu\in O_k^+$ we have

(a) $E_\mu(c'X^*-c'\mu)=\sum_{i=1}^k b(\mu_i)$;
(b) $E_\mu(c'X^*-c'\mu)^2=\sum_{i=1}^k\,[\Phi(\mu_i)-\mu_i\,b(\mu_i)]+2\sum_{i<j} b(\mu_i)\,b(\mu_j)$.

Further, for any $\mu\in\mathbb{R}^k$ we have:

(c) $E_\mu(Z_0-c'\mu)=b(c'\mu,\sqrt{k})$,
(d) $E_\mu(Z_0-c'\mu)^2=k\,\Phi(c'\mu/\sqrt{k})-c'\mu\,b(c'\mu,\sqrt{k})$.
Fig. 1. (i), (ii) and (iii): Contribution of any component of $X^*$ to the quantities in Theorem 2.2(a) and Theorem 2.3(a) and (b), respectively (plotted against the mean value).
Proof. The results are immediate consequences of Lemma 2.1.
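As a quick numerical illustration (again ours, not the paper's), Theorem 2.2(a) can be checked by simulation, using the fact that on the orthant the MLE is the componentwise positive part, $X_i^*=\max(X_i,0)$:

```python
import math, random

def b(mu):    # bias term of Lemma 2.1 with sigma = 1
    phi = math.exp(-mu * mu / 2) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(mu / math.sqrt(2)))
    return phi + mu * Phi - mu

random.seed(0)
mu = [0.0, 0.3, 1.0, 2.0]                # a point in the orthant O_4^+
n = 100_000
bias_mc = 0.0
for _ in range(n):
    x_star = [max(random.gauss(v, 1.0), 0.0) for v in mu]  # MLE over O_4^+
    bias_mc += sum(x_star) - sum(mu)
bias_mc /= n
print(abs(bias_mc - sum(b(v) for v in mu)) < 0.05)   # Theorem 2.2(a)
```

Note how the component bias $b(\mu_i)$ is largest near the boundary value $\mu_i=0$ and vanishes as $\mu_i$ grows, which is curve (i) of Fig. 1.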
Likewise, formulae to compute the bias and the MSE of $d'X^*$ can be derived for any $d\in O_k^+$. Curve (i) in Fig. 1 shows the bias of $X_i^*$ for $\mu_i\ge 0$.

We now give two alternative methods for estimating $c'\mu$. Both methods involve the specific form of the cone $C$. Let us suppose that $C=\{\mu\in\mathbb{R}^k : a_i'\mu\le 0,\ i\in I\}$, define $A_\varepsilon=\{i\in I : |a_i'X^*|\le\varepsilon\}$ for some constant $\varepsilon$ and consider $C_\varepsilon=\{\mu\in\mathbb{R}^k : a_i'\mu=0,\ i\in A_\varepsilon \text{ and } a_i'\mu\le 0,\ i\in I\setminus A_\varepsilon\}$. Geyer's "Adjusted Active Set Bootstrap" samples from the distribution indexed by $\bar X_\varepsilon$ (the MLE of $\mu$ under $C_\varepsilon$) and finds $X^*_{\varepsilon B}$ by maximizing the bootstrap likelihood over $C$ (see Geyer (1995) for other details on this procedure). The fact that $c'X^*$ has a positive bias (Theorem 2.2(a)) leads us to define a first bias estimator in our setting as
$$B_G=\max(0,\ c'X^*_{\varepsilon B}-c'\bar X_\varepsilon),$$

which we use to define the following bias-reduced estimator of $c'\mu$:

$$E_{BG}=c'X^*\,I_{(c'X^*=c'X)}+\max(0,\ c'X,\ c'X^*-B_G)\,I_{(c'X^*\ne c'X)}.$$
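For the positive orthant the whole $E_{BG}$ construction is short enough to sketch in code. The sketch below is our reading of the procedure, not the authors' implementation: the constraints are $-\mu_i\le 0$, so $A_\varepsilon$ consists of the components with $X_i^*\le\varepsilon$, and $c'X^*_{\varepsilon B}$ is approximated by the bootstrap mean of the projected draws (the function name and the defaults are our assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def ebg_orthant(x, eps=0.6, n_boot=100):
    """Sketch of the E_BG estimator of c'mu on the positive orthant,
    c = (1,...,1)'.  One reading of Geyer's Adjusted Active Set Bootstrap;
    eps is the tuning constant (the paper uses eps_0 = 0.6)."""
    x = np.asarray(x, dtype=float)
    x_star = np.maximum(x, 0.0)              # MLE of mu over the orthant
    cx, cx_star = x.sum(), x_star.sum()      # c'X and c'X*
    if cx_star == cx:                        # no constraint is active
        return cx_star
    active = x_star <= eps                   # approximately active constraints
    x_eps = np.where(active, 0.0, x_star)    # MLE of mu under C_eps
    # bootstrap from N(x_eps, I); each draw is projected back onto the orthant
    draws = np.maximum(rng.normal(x_eps, 1.0, size=(n_boot, x.size)), 0.0)
    bg = max(0.0, draws.sum(axis=1).mean() - x_eps.sum())   # bias estimate B_G
    return max(0.0, cx, cx_star - bg)
```

By construction the returned value lies between $c'X$ (truncated at 0) and $c'X^*$, matching the note below.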
Note that this estimator is greater than $c'X$ but smaller than $c'X^*$.

There is some arbitrariness in this procedure since it depends on $\varepsilon$. We chose the value $\varepsilon_0=0.6$ for the simulations in the next section. This value was chosen because trials with other values between 0 and 2.5 did not give better results in the standard situation $\{k=10,\ \mu\in O_k^+,\ \mu=\delta c,\ \delta\ge 0\}$.

Now, consider the active constraints in $C$ for $X^*$, $A_X=\{i\in I : a_i'X^*=0\}$, and denote
by $C_X=\{\mu\in\mathbb{R}^k : a_i'\mu\le 0,\ i\in A_X\}$. Let $X^*_{B1}$ denote the bootstrap MLE over $C_X$, that is, the value at which the maximum of the bootstrap likelihood over $C_X$ is attained, after taking a bootstrap sample from an $N(X^*,I)$. Further, let $X^*_{B2}$ denote a second bootstrap MLE over $C_X$,
after bootstrapping from an $N(X^*_{B1},I)$. This idea can be extended to define a third bootstrap MLE $X^*_{B3}$, etc.
Theorem 2.3. For any $\mu\in O_k^+$ we have

(a) $E_\mu(c'X^*_{B1}-c'X^*)=b(0)\sum_{i=1}^k\Phi(-\mu_i)$

and, as the bootstrap sample size $m$ tends to infinity,

(b) $E_\mu(c'X^*_{B2}-c'X^*)\to\{b(b(0))+b(0)\}\sum_{i=1}^k\Phi(-\mu_i)$.
Proof. Consider only the first components, $X_1$ and $X_1^*$, of $X$ and $X^*$, respectively. Given $X_1^*$ we take a random sample from an $N(X_1^*,1)$ population, denoted by $X^j_{B1}$, $j=1,\dots,m$. Let $X^{j*}_{B1}=X^j_{B1}I_{(X_1>0)}+\max(0,X^j_{B1})I_{(X_1\le 0)}$. The first component of $X^*_{B1}$ is given by $X^*_{B11}=m^{-1}\sum_{j=1}^m X^{j*}_{B1}$.

(a) $E_\mu(X^*_{B11})=\int_{-\infty}^0 E_\mu(X^*_{B11}\,|\,X_1=x)\,\varphi(x-\mu_1)\,dx+\int_0^\infty E_\mu(X^*_{B11}\,|\,X_1=x)\,\varphi(x-\mu_1)\,dx$.

From Lemma 2.1, the conditional mean in the first term is $b(0)$, while the second term equals $b(\mu_1)+\mu_1$. Then from Lemma 2.1(a) we have $E(X^*_{B11}-X^*_1)=b(0)\,\Phi(-\mu_1)$. Part (a) follows by applying the same result to the other components of $X^*_{B1}$.

(b) Given $X^*_{B11}$, consider a random variable $X_{B21}\sim N(X^*_{B11},1)$. The first component of $X^*_{B2}-X^*$ is $X^*_{B21}-X^*_1=(X_{B21}-X_1)I_{(X_1>0)}+\max(0,X_{B21})I_{(X_1\le 0)}$.

For $X_1=x>0$, the mean of $X_{B21}$, namely $X^*_{B11}$, tends to $x$ as $m$ goes to infinity. Therefore $\lim_{m\to\infty}E((X^*_{B21}-X^*_1)I_{(X_1>0)})=0$.

For $X_1=x\le 0$, the mean of $X_{B21}$, namely $X^*_{B11}$, tends to $E_0(X^*_{B11})=b(0)$ as $m$ goes to infinity. Then from Lemma 2.1(a) and since $P_\mu(X_1\le 0)=\Phi(-\mu_1)$, we obtain $\lim_{m\to\infty}E(\max(0,X_{B21})I_{(X_1\le 0)})=\{b(b(0))+b(0)\}\,\Phi(-\mu_1)$, and therefore part (b) follows.
Fig. 1(ii) and (iii) show the contribution of each component of $X^*$ to the quantities in Theorem 2.3(a) and (b), respectively.
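Theorem 2.3(a) can also be verified by simulation. The sketch below (ours, not the paper's code) reproduces the construction in the proof, projecting a bootstrap draw only in the components where the constraint is active:

```python
import math, random

random.seed(2)
Phi = lambda t: 0.5 * (1 + math.erf(t / math.sqrt(2)))
b0 = 1 / math.sqrt(2 * math.pi)          # b(0) = phi(0)

mu, m, n = [0.0, 0.5, 1.5], 50, 10_000
lhs = 0.0
for _ in range(n):
    x = [random.gauss(v, 1.0) for v in mu]
    xs = [max(v, 0.0) for v in x]        # restricted MLE X*
    # first bootstrap MLE over C_X: project a draw only where X_i <= 0
    xb = [sum((max(random.gauss(s, 1.0), 0.0) if orig <= 0
               else random.gauss(s, 1.0)) for _ in range(m)) / m
          for s, orig in zip(xs, x)]
    lhs += sum(xb) - sum(xs)
lhs /= n
rhs = b0 * sum(Phi(-v) for v in mu)      # Theorem 2.3(a)
print(abs(lhs - rhs) < 0.05)
```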
Now, consider $B_1$ and $B_2$ defined as follows:

$$B_1=\max(0,\ c'X^*_{B1}-c'X^*)\quad\text{and}\quad B_2=\max(0,\ c'X^*_{B2}-c'X^*).$$

In a similar way, another statistic based on a third bootstrap can be defined as $B_3=\max(0,\ c'X^*_{B3}-c'X^*)$.

The observed value of $c'X^*-c'X$ is used to define a mixed estimator of the bias of $c'X^*$, $B_M$, as follows:
$$B_M=\begin{cases}B_3 & \text{if } B_3\le c'X^*-c'X,\\ B_2 & \text{if } B_2\le c'X^*-c'X<B_3,\\ B_1 & \text{if } c'X^*-c'X<B_2.\end{cases}$$
The choice of $B_M$ as a bias estimator is justified by profiles (ii) and (iii) in Fig. 1 and will be supported by the simulations below. The statistic $B_3$ will be used only to improve the estimation of the bias for values of $\mu$ close to the origin.
As with the above definition of the $B_G$-estimator, we define the $B_M$-estimator as follows:

$$E_{BM}=c'X^*\,I_{(c'X^*=c'X)}+\max(0,\ c'X,\ c'X^*-B_M)\,I_{(c'X^*\ne c'X)}.$$

Note that the $B_M$-estimator is more complex than the $B_G$-estimator, but it does not suffer from arbitrariness.
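The successive bootstrap and the resulting $E_{BM}$ can be sketched for the positive orthant as follows; this is our reading of the construction (function names and defaults are assumptions, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(1)

def successive_bias(x, n_boot=100):
    """B1, B2, B3 from three successive bootstraps (our sketch)."""
    x_star = np.maximum(np.asarray(x, dtype=float), 0.0)   # restricted MLE
    active = x_star == 0.0          # active constraints A_X: components at 0
    centre, biases = x_star, []
    for _ in range(3):              # yields B1, B2, B3 in turn
        draws = rng.normal(centre, 1.0, size=(n_boot, x_star.size))
        # bootstrap MLE over C_X: project only the active components
        boot = np.where(active, np.maximum(draws, 0.0), draws)
        centre = boot.mean(axis=0)  # centre of the next bootstrap
        biases.append(max(0.0, centre.sum() - x_star.sum()))
    return biases

def ebm_orthant(x, n_boot=100):
    """Sketch of the mixed bias-reduced estimator E_BM on the orthant."""
    x = np.asarray(x, dtype=float)
    cx, cx_star = x.sum(), np.maximum(x, 0.0).sum()
    if cx_star == cx:
        return cx_star
    b1, b2, b3 = successive_bias(x, n_boot)
    gap = cx_star - cx
    # B_M = B3 if B3 <= gap; B2 if B2 <= gap < B3; B1 otherwise
    bm = b3 if b3 <= gap else (b2 if b2 <= gap else b1)
    return max(0.0, cx, cx_star - bm)
```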
3. Simulation results: the case $O_k^+$
In this section we present some results of the simulations performed to evaluate the bias, the variance and, principally, the MSE of the $B_M$- and $B_G$-estimators defined in Section 2. Resampling methods are based on 10 000 replicates with bootstrap samples of size $m=100$. Although Theorem 2.2 provides closed forms to compute the exact values of the MSEs of $c'X$, $c'X^*$ and $Z_0$, their estimates obtained in the simulations are also shown in the figures below for comparison with the MSEs of the $B_M$- and $B_G$-estimators.

Fig. 2 represents the MSE values for $k=10$ and for values of $\mu$ in the central direction. Fig. 3 shows the MSE when $\mu$ lies on a 5-dimensional face of $O_k^+$.
Fig. 2. MSE under the central direction (k = 10).
Fig. 3. MSE under the central direction of a 5-dimensional face (k = 10).
Similar displays would be obtained if other faces of $O_{10}^+$ and values of $\mu$ on them were considered.

The most important findings can be summed up as follows:

1. No estimator is dominated by any other throughout the study.
2. Overall, the estimator $Z_0$ performs well, but it is outperformed by the estimators $E_{BG}$ and $E_{BM}$ when the norm of the parameter is large. This fact is most noticeable in Fig. 3.
3. The MLE has a very large MSE on the boundary of the constraints, which is a consequence of its large bias. For values of $\mu$ in the central direction, far from zero, the bias of the MLE is low and its MSE is the smallest of all the estimators considered.
4. The bootstrap methodology enables us to reduce the bias of the MLE on the boundary of the constraints. Overall, the proposed estimators, $E_{BG}$ and $E_{BM}$, perform well.
4. Application to order restrictions
In this section we present some simulation results for the case in which the mean $\mu$ is constrained to a simple order and to a simple tree order. We consider the bootstrap-based estimators given in Section 2, now defined for these order cones.
Fig. 4. MSE under the central direction of the simple order (k = 10).
Fig. 5. MSE under the central direction of the tree order (k = 10).
In these situations, the MLE of $\mu$, $X^*$, is obtained as the projection of $X$ onto the corresponding cone, using the minimum lower sets algorithm. The central direction $c$ is defined in Abelson and Tukey (1963) and Robertson et al. (1988).
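For readers who want to reproduce the order-restricted case, the projection onto the simple order cone can be computed with the pool-adjacent-violators algorithm, which for equal weights yields the same solution as the minimum lower sets algorithm. A minimal sketch (ours):

```python
def isotonic_projection(x):
    """Project x onto the simple order cone {mu_1 <= ... <= mu_k} by
    pool-adjacent-violators; with equal weights this coincides with the
    minimum-lower-sets solution."""
    merged = []                     # stack of [block mean, block size]
    for v in x:
        merged.append([float(v), 1])
        # pool adjacent blocks while they violate the ordering
        while len(merged) > 1 and merged[-2][0] > merged[-1][0]:
            m2, n2 = merged.pop()
            m1, n1 = merged.pop()
            merged.append([(m1 * n1 + m2 * n2) / (n1 + n2), n1 + n2])
    out = []
    for m, n in merged:
        out.extend([m] * n)
    return out
```

For example, the violating observation vector $(3,1,2)$ is pooled into the constant vector $(2,2,2)$, while an already ordered vector is returned unchanged.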
The bias of $c'X^*$ for the order cones considered is smaller than the bias of $c'X^*$ for the orthant, as simulation trials confirm. Therefore, in the definition of $E_{BM}$ the third bootstrap is omitted, as $B_3$ contributes virtually nothing to the definition of $B_M$; besides, the programming task is simplified. The estimator $E_{BG}$ is defined in the same way as for the orthant.
Figs. 4 and 5 show the MSE of the proposed estimators when $\mu$ is in the central direction and $k=10$, for the case of the simple order and the simple tree order, respectively. The MSE profiles are comparable to those in Fig. 2 of Section 3, and the comments in that section are also relevant here. We detected only a minor anomaly with the estimator $E_{BG}$ in the simple order case, as its MSE is slightly greater than $k$ for some values of $\mu$ far from the origin.
Acknowledgements
We are indebted to a Guest Editor for the time spent on this paper and for useful remarks and suggestions for improving its presentation. Further thanks are due to the referees for their comments and valuable suggestions. Research supported by Spanish DGES Grant PB97-0475 and PAPIJCL Grant VA26/99.
References
Abelson, R.P., Tukey, J.W., 1963. Efficient utilization of non-numerical information in quantitative analysis: general theory and the case of the simple order. Ann. Math. Statist. 34, 1347–1369.
Cohen, A., Sackrowitz, H., 1970. Estimation of the last mean of a monotone sequence. Ann. Math. Statist. 41, 2021–2034.
Fernández, M.A., Rueda, C., Salvador, B., 1999. The loss of efficiency estimating linear functions under restrictions. Scand. J. Statist. 26, 579–592.
Fernández, M.A., Rueda, C., Salvador, B., 2000. Parameter estimation under orthant restrictions. Canad. J. Statist. 28, 171–181.
Geyer, C.J., 1991. Constrained maximum likelihood exemplified by isotonic convex logistic regression. J. Amer. Statist. Assoc. 86, 717–724.
Geyer, C.J., 1995. Likelihood ratio tests and inequality constraints. Technical Report No. 610, University of Minnesota.
Kelly, R.E., 1989. Stochastic reduction of loss in estimating normal means by isotonic regression. Ann. Statist. 17, 937–940.
Lee, C.C., 1981. The quadratic loss of isotonic regression under normality. Ann. Statist. 9, 686–688.
Lee, C.C., 1988. The quadratic loss of order restricted estimators for treatment means with a control. Ann. Statist. 16, 751–758.
Robertson, T., Wright, F.T., Dykstra, R.L., 1988. Order Restricted Statistical Inference. Wiley, New York.
Rueda, C., Salvador, B., Fernández, M.A., 1997. Simultaneous estimation in a restricted linear model. J. Multivariate Anal. 61 (1), 61–66.
Shaw, F.H., Geyer, C.J., 1997. Estimation and testing in constrained covariance component models. Biometrika 84 (1), 95–102.