Detecting Weak Identification by Bootstrap∗
Zhaoguo Zhan†
June 25, 2013
Abstract
This paper proposes using bootstrap resampling as a diagnostic tool to detect
weak instruments in the instrumental variable regression. When instruments are
not weak, the bootstrap distribution of the two-stage-least-squares estimator is
close to the normal distribution. As a result, substantial difference between
these two distributions indicates the strength of identification is weak. We use
the Kolmogorov-Smirnov statistic to measure the difference, and suggest a cutoff
for detecting weak instruments. Monte Carlo simulation results show that the
proposed method is comparable to the popular F > 10 rule of thumb. Since
the bootstrap is applicable in the instrumental variable regression as well as in
the generalized method of moments, the proposed method can be viewed as a
universal graphic tool for detecting weak identification.
JEL Classification: C18, C26
Keywords: Weak instruments, weak identification, bootstrap
1 Introduction
The Instrumental Variable (IV) regression and Generalized Method of Moments (GMM)
are becoming the standard toolkit for empirical economists, and it is now well known
∗I am very thankful to Frank Kleibergen, Sophocles Mavroeidis, Blaise Melly and participants in
seminar talks at Brown University for their valuable comments. The Monte Carlo study in this paper
was supported by Brown University through the use of the facilities of its Center for Computation
and Visualization.
†Mail: School of Economics and Management, Tsinghua University, Beijing, China, 100084. Email: [email protected]. Phone: (+86)10-62789422. Fax: (+86)10-62785562.
that both IV and GMM applications may suffer from the problem of weak identifi-
cation. A leading example that has received sizable attention is the linear IV regres-
sion with weak instruments studied in Nelson and Startz (1990), Bound et al. (1995),
Staiger and Stock (1997), Stock and Yogo (2005), etc. When instruments are weak,
the finite sample distribution of the commonly used two-stage-least-squares (2SLS) es-
timator tends to be poorly approximated by the limiting normal distribution based on
the first-order asymptotic theory, which further induces substantial size distortion of
the conventional t/Wald test that relies on the property of asymptotic normality. See
the article by Stock et al. (2002) for a survey.
Two approaches co-exist in the literature to handle weak identification: the first ap-
proach is to use the robust methods in Anderson and Rubin (1949), Stock and Wright
(2000), Kleibergen (2002)(2005), Moreira (2003), etc., which can produce confidence
intervals/sets with the correct coverage for the parameters of interest, regardless of
the strength of identification; the second approach is to rule out weak identification in
IV/GMM applications by pretesting, see, e.g., the F test in Staiger and Stock (1997)
for detecting weak instruments in the linear IV model, and the various methods in
Wright (2003), Bravo et al. (2009), Inoue and Rossi (2011) for detecting weak identi-
fication in the GMM framework.
Although robust methods are naturally appealing, it is still common practice that
empirical economists adopt pretests to exclude weak identification in IV/GMM appli-
cations, and then proceed to apply the conventional inference methods like the 2SLS
estimator and t/Wald test. This is partly because policy makers usually prefer point estimates of the parameters of interest over confidence intervals/sets, while conventional point estimators are generally not meaningful unless weak identification is excluded. In addition, if identification is not
weak, then the rich set of conventional methods is applicable, making statistical infer-
ence and economic decisions much easier. As a result, excluding weak identification by
pretesting still has its practical importance, and a group of tests have been proposed
by the growing literature on identification, which this paper builds upon. In the linear
IV regression model, Hahn and Hausman (2002) and Stock and Yogo (2005) provide
tests for the null hypothesis of strong instruments and weak instruments respectively.
The F test with the F > 10 rule of thumb proposed in Staiger and Stock (1997) is
widely used in empirical studies. In Hansen (1982)'s GMM framework, which nests the linear IV model, recent work includes, e.g., Wright (2002)(2003), Kleibergen and Paap (2006), Bravo et al. (2009), Inoue and Rossi (2011).
This paper proposes a simple method based on bootstrap resampling to detect
whether weak identification exists in IV/GMM applications. Unlike the existing detec-
tion methods, which mainly provide a statistic to measure the strength of identification
in some specialized IV/GMM setup, the proposed bootstrap approach in this paper
has the unique feature of providing a graphic view of the identification strength, and
is generally applicable to both IV and GMM applications.
The idea of this paper is illustrated by Figure 1 below.

Figure 1: Three related distributions (the exact distribution, approximated on one side by the normal distribution and on the other by the bootstrap distribution)

The exact finite sample distribution of IV/GMM estimators is generally unknown, but could be approximated
either by the limiting normal distribution or by the bootstrap distribution. When the
IV/GMM models are not weakly identified, both of these two approximation methods
provide reasonable approximations to the exact distribution1, where the difference be-
tween these two methods is expected to be negligible. On the other hand, in case of
weak identification, the normal approximation and the bootstrap approximation are
very different from each other. Thus, substantial departure of the bootstrap distribu-
tion from the normal distribution provides evidence for weak identification.
The idea briefly described by Figure 1 is further illustrated in Figure 2, which results
from an empirical application that will be discussed in detail later in this paper. In the
left panel of Figure 2, we present the p.d.f., c.d.f. and Q-Q plot of the bootstrapped
2SLS estimator under a weak instrument, where all three graphs indicate that when
a weak instrument is used, the bootstrap distribution of the 2SLS estimator is far
from the normal distribution. In contrast, a stronger instrument is used to draw the
right panel of Figure 2, where the p.d.f., c.d.f. and Q-Q plot of the corresponding
bootstrapped 2SLS estimator suggest that compared to the left panel, the bootstrap
distribution is closer to the normal distribution. As indicated by Figure 2, this paper
proposes that the strength of instruments can be inferred by comparing the bootstrap
distribution of the 2SLS estimator with the normal distribution.
1The bootstrap, however, may perform better because of the asymptotic refinement property. See, e.g., Horowitz (2001).
Figure 2: p.d.f., c.d.f. and Q-Q plot of the bootstrapped 2SLS estimator
(dotted: standardized bootstrap distribution; solid: standard normal. Panels: (a) p.d.f., weak instrument; (b) p.d.f., strong instrument; (c) c.d.f., weak instrument; (d) c.d.f., strong instrument; (e) Q-Q plot, weak instrument; (f) Q-Q plot, strong instrument. Q-Q axes: X, normal quantiles; Y, bootstrap quantiles.)
In the rest of the paper, we employ a linear IV model for better illustration of the proposed bootstrap method, although the idea of the method as described by Figure 1 is generally applicable to other IV/GMM settings. Section 2 evaluates
the popular F > 10 rule of thumb, and we show analytically that this rule of thumb is
a relatively low benchmark for excluding weak instruments in the linear IV regression
model with one endogenous variable. In Section 3, we propose the bootstrap-based
method for detecting weak instruments in the linear IV setup. Monte Carlo results
and an illustrative application are included in Section 4. Section 5 concludes. The
proofs are attached in the appendix.
2 Preliminary
2.1 Linear IV Model
Following Stock and Yogo (2005), we focus on the linear IV model under homoscedasticity, with n i.i.d. observations:

$$Y = X\theta + U, \qquad X = Z\Pi + V$$

$U = (U_1, \ldots, U_n)'$ and $V = (V_1, \ldots, V_n)'$ are $n \times 1$ vectors of structural and reduced form errors respectively, where $(U_i, V_i)'$, $i = 1, \ldots, n$, is assumed to have mean zero with covariance matrix

$$\Sigma = \begin{pmatrix} \sigma_u^2 & \rho\sigma_u\sigma_v \\ \rho\sigma_u\sigma_v & \sigma_v^2 \end{pmatrix}$$

$Z = (Z_1, \ldots, Z_n)'$ is the $n \times k$ matrix of instruments with $k \geq 1$, $E(Z_iU_i) = 0$ and $E(Z_iV_i) = 0$. $Y = (Y_1, \ldots, Y_n)'$ and $X = (X_1, \ldots, X_n)'$ are $n \times 1$ vectors of endogenous observations. The single structural parameter of interest is θ, while Π is the $k \times 1$ vector of nuisance parameters.
Note that k is the number of instruments, which is assumed to be fixed in the model
setup. Hence we do not discuss the so-called many instruments asymptotics in, e.g.,
Chao and Swanson (2005), where the number of instruments can grow as the number
of observations increases.
In addition, we have $Z'Z/n \xrightarrow{p} Q_{zz}$ by the law of large numbers, and $\left(\frac{Z'U}{\sqrt{n}}, \frac{Z'V}{\sqrt{n}}\right) \xrightarrow{d} (\Psi_{zu}, \Psi_{zv})$ by the central limit theorem, provided the moments exist, where $Q_{zz} = E(Z_iZ_i')$ is invertible and

$$\begin{pmatrix} \Psi_{zu} \\ \Psi_{zv} \end{pmatrix} \sim N(0, \Sigma \otimes Q_{zz})$$
The commonly used 2SLS estimator for θ, denoted by θn, is written as:

$$\theta_n = \left[X'Z(Z'Z)^{-1}Z'X\right]^{-1} X'Z(Z'Z)^{-1}Z'Y$$
Assumption 1 (Strong Instrument Asymptotics). $\Pi = \Pi_0 \neq 0$, and $\Pi_0$ is fixed.
Under Assumption 1 and the setup of the model, the textbook result is that θn is asymptotically normally distributed, according to first-order asymptotics (see, e.g., Wooldridge (2002)):

$$\sqrt{n}(\theta_n - \theta) \xrightarrow{d} (\Pi_0'Q_{zz}\Pi_0)^{-1}\Pi_0'\Psi_{zu}$$

We slightly rewrite the asymptotic result above to standardize θn:

$$\sqrt{n}\,\frac{\theta_n - \theta}{\sigma} \xrightarrow{d} N(0, 1)$$

where $\sigma^2 = (\Pi_0'Q_{zz}\Pi_0)^{-1}\sigma_u^2$, and N(0, 1) stands for the standard normal variate.
2.2 Weak Identification
The sizeable literature on weak instruments highlights that in finite samples, the dif-
ference in distribution between the 2SLS estimator and the normal variate could be
substantial. See Nelson and Startz (1990), Bound et al. (1995), etc. The substantial
difference further implies the bias of the 2SLS estimator as well as the size distortion of
the conventional t/Wald test. This is often referred to as the weak instrument problem,
or more generally, the weak identification problem.
Some quantitative definitions of weak instruments are provided by Stock and Yogo
(2005), one of which states that, if the size distortion of the t/Wald test exceeds 5% (e.g.
the rejection rate under the null exceeds 10% at the 5% level), then the instruments
are considered as weak.
Motivated by Stock and Yogo (2005), we adopt the 5% rule and provide a similar
quantitative definition of weak identification.
Definition 1. The identification strength of IV/GMM applications is deemed weak if
the maximum difference between the distribution function of the standardized IV/GMM
estimator and that of the standard normal variate exceeds 5%; otherwise the identifi-
cation is deemed strong.
In other words, identification is deemed weak in this paper iff the Kolmogorov-Smirnov distance below exceeds 5%:

$$KS = \sup_{-\infty < c < \infty} \left| P\left(\sqrt{n}\,\frac{\theta_n - \theta}{\sigma} \leq c\right) - \Phi(c) \right| > 5\%$$
Note that √n(θn − θ)/σ is also the non-studentized t statistic, thus Definition 1 can be viewed as follows: if the size distortion of a one-sided non-studentized t test at any level exceeds 5%, then identification is defined as weak. In this sense, Definition 1 can be considered as an extension of Stock and Yogo (2005), since it aims to control the size distortion at any nominal level, while Stock and Yogo (2005) set the significance level to 5%, which does not provide sufficient flexibility.
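Numerically, the KS distance above is the sup-distance between an empirical c.d.f. and Φ, which is attained at a jump of the empirical c.d.f. The sketch below is illustrative (the helper name `ks_to_normal` and the input draws are ours, not the paper's); `scipy.stats.kstest` computes the same statistic.

```python
import numpy as np
from scipy.stats import norm

def ks_to_normal(draws):
    """sup_c |F_hat(c) - Phi(c)| for the empirical c.d.f. F_hat of `draws`."""
    x = np.sort(np.asarray(draws))
    n = len(x)
    cdf = norm.cdf(x)
    d_plus = np.arange(1, n + 1) / n - cdf   # distance just after each jump
    d_minus = cdf - np.arange(0, n) / n      # distance just before each jump
    return float(max(d_plus.max(), d_minus.max()))

rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)
print(ks_to_normal(z) < 0.05)     # True: normal draws pass the 5% threshold
print(ks_to_normal(z**2) > 0.05)  # True: chi-squared draws are far from normal
```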
2.3 Edgeworth
In order to illustrate the departure of √n(θn − θ)/σ from normality, we employ the Edgeworth expansion. The following result is a direct application of Theorem 2.2 in Hall (1992).

Theorem 2.1. Under Assumption 1, if the following two conditions for the $(2k + k^2) \times 1$ random vector $R_i = (X_iZ_i', Y_iZ_i', \mathrm{vec}(Z_iZ_i')')'$ hold:

i. $E(\|R_i\|^3) < \infty$,

ii. $\limsup_{\|t\| \to \infty} |E\exp(it'R_i)| < 1$,

then √n(θn − θ)/σ admits the two-term Edgeworth expansion uniformly in c, −∞ < c < ∞:

$$P\left(\sqrt{n}\,\frac{\theta_n - \theta}{\sigma} \leq c\right) = \Phi(c) + n^{-1/2}p(c)\phi(c) + O(n^{-1})$$

where p(c) is a polynomial of degree 2 with coefficients depending on the moments of $R_i$ up to order 3; Φ(·) and φ(·) are respectively the c.d.f. and p.d.f. of the standard normal variate.
A similar result is stated in Moreira et al. (2009), where the necessity of the two technical conditions i and ii is explained: the first condition is the imposed minimum moment assumption, while the second is Cramér's condition discussed in Bhattacharya and Ghosh (1978).
Based on the Edgeworth expansion above, the leading term that affects the departure of √n(θn − θ)/σ from normality is captured by $n^{-1/2}p(c)\phi(c)$, which shrinks to zero as n → ∞. The derivation of p(c) is feasible but messy, and its closed form is rarely presented in the literature. With the help of the bootstrap, however, we do not require the closed form for the purpose of this paper, which is illustrated in detail later. Nevertheless, we show the closed form of p(c) in a simplified setup: the just-identified case.
Corollary 2.2. Under the assumptions of Theorem 2.1, if $Z_i$ is independent of $U_i$, $V_i$ and the model is just-identified, i.e. k = 1, then:

$$p(c) = \frac{\rho}{\sqrt{\Pi_0'Q_{zz}\Pi_0/\sigma_v^2}}\, c^2$$

and √n(θn − θ)/σ admits the two-term Edgeworth expansion uniformly in c, −∞ < c < ∞:

$$P\left(\sqrt{n}\,\frac{\theta_n - \theta}{\sigma} \leq c\right) = \Phi(c) + n^{-1/2}\,\frac{\rho}{\sqrt{\Pi_0'Q_{zz}\Pi_0/\sigma_v^2}}\, c^2\phi(c) + O(n^{-1})$$
Corollary 2.2 indicates that the departure of the 2SLS estimator from normality in the just-identified model is crucially affected by two parameters: ρ, which measures the degree of endogeneity; and $n\Pi_0'Q_{zz}\Pi_0/\sigma_v^2$, which is the population concentration parameter:

$$n\,\frac{\Pi_0'Q_{zz}\Pi_0}{\sigma_v^2} = E\left(\frac{\Pi_0'Z'Z\Pi_0}{\sigma_v^2}\right) = E(\mu^2)$$

where

$$\mu^2 \equiv \frac{\Pi_0'Z'Z\Pi_0}{\sigma_v^2}$$
is the so-called concentration parameter widely used in the weak instrument literature
(see, e.g., Rothenberg (1984), Stock and Yogo (2005)). For the normal distribution to
well approximate the exact distribution of the 2SLS estimator, µ2 needs to be large; in
addition, µ2 needs to accumulate to infinity as n increases, for the standardized 2SLS
estimator to be asymptotically normally distributed.
2.4 F > 10
The first-stage F test advocated by Bound et al. (1995) and Staiger and Stock (1997)
is commonly used for detecting weak instruments in the 2SLS procedure. For the linear
IV model described above, the F statistic is computed by:

$$F = \frac{\Pi_n'Z'Z\Pi_n/k}{\sigma_v^2}$$

where $\Pi_n = (Z'Z)^{-1}Z'X$ and $\sigma_v^2 = (X - Z\Pi_n)'(X - Z\Pi_n)/(n - k)$. Thus the F statistic
can be viewed as an estimator of the concentration parameter µ2 weighted by the
number of instruments k. If the F statistic is greater than the tabled critical values2
in Stock and Yogo (2005), which are typically around 10 for small k, then the null of
weak instruments quantified by Stock and Yogo (2005) is rejected: see also the F > 10
rule of thumb suggested in Staiger and Stock (1997).
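As a sketch, the F statistic above can be computed directly from the data matrices. The simulated just-identified design and the helper name `first_stage_F` below are illustrative assumptions, not the paper's code:

```python
import numpy as np

def first_stage_F(X, Z):
    """F = (Pi_n' Z'Z Pi_n / k) / sigma_v^2 with Pi_n = (Z'Z)^{-1} Z'X."""
    n, k = Z.shape
    Pi_n = np.linalg.solve(Z.T @ Z, Z.T @ X)
    resid = X - Z @ Pi_n
    sigma_v2 = resid @ resid / (n - k)   # (X - Z Pi_n)'(X - Z Pi_n)/(n - k)
    return float(Pi_n @ (Z.T @ Z) @ Pi_n / k / sigma_v2)

rng = np.random.default_rng(0)
n = 1000
Z = rng.standard_normal((n, 1))
X = Z[:, 0] * 0.3 + rng.standard_normal(n)  # Pi = 0.3, so E(mu^2) = n Pi^2 = 90
print(first_stage_F(X, Z))  # fluctuates roughly around E(mu^2)/k + 1
```

With an irrelevant regressor in place of X, the same function returns a central F(1, n − 1) draw, typically far below 10.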
We now relate Definition 1 to the concentration parameter in the setup of Corollary
2.2 to show that the F > 10 rule is not strict enough to exclude the type of weak
identification in Definition 1. We focus on the just-identified setup of Corollary 2.2,
which is relevant in empirical studies3.
For simplicity, we omit the $O(n^{-1})$ term of the Edgeworth expansion as if the leading term $n^{-1/2}p(c)\phi(c)$ solely determines the size distortion, and then we approximately have strong identification if for all −∞ < c < ∞:

$$\left|n^{-1/2}p(c)\phi(c)\right| \leq 5\% \iff [p(c)]^2 \leq \frac{n}{400\phi^2(c)}$$
Plugging in the expression for p(c) in Corollary 2.2, we get:

$$\left(\frac{\rho}{\sqrt{\Pi_0'Q_{zz}\Pi_0/\sigma_v^2}}\, c^2\right)^2 \leq \frac{n}{400\phi^2(c)} \implies n\,\frac{\Pi_0'Q_{zz}\Pi_0}{\sigma_v^2} \geq 400\rho^2 c^4 \phi^2(c)$$
At the point $c = \pm\sqrt{2}$, $400\rho^2c^4\phi^2(c)$ achieves its maximum. When the degree of endogeneity is severe, i.e. $|\rho| \approx 1$, the maximum is slightly above 34. Consequently, if endogeneity is severe, the population concentration parameter needs to exceed 34 to avoid the 5% distortion at any level, as defined in Definition 1.
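This maximum is easy to verify numerically; the quick check below (a verification sketch, not part of the paper's derivation) evaluates $400\rho^2c^4\phi^2(c)$ with |ρ| = 1 on a grid:

```python
import numpy as np

phi = lambda c: np.exp(-c**2 / 2) / np.sqrt(2 * np.pi)  # standard normal p.d.f.
c = np.linspace(-5, 5, 200_001)
g = 400 * c**4 * phi(c)**2  # 400 rho^2 c^4 phi(c)^2 with |rho| = 1

print(abs(c[np.argmax(g)]))  # 1.414..., i.e. c = +/- sqrt(2)
print(round(g.max(), 2))     # 34.46: slightly above 34
```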
The argument above thus suggests that the F > 10 rule of thumb is not sufficient to
2The tabled critical values in Stock and Yogo (2002)(2005) rely on the quantitative definitions of weak instruments, the number of endogenous variables and the number of instruments. The F > 10 rule of thumb corresponds to the case of a single endogenous variable, where the benchmark of weak instruments is set by the relative bias of 2SLS over OLS.
3It is common in empirical studies that only a single instrument is used for the single endogenous variable.
rule out weak identification in Definition 1, especially when the degree of endogeneity
is severe. In the just-identified case, for example, the first-stage F statistic is an
estimator of the population concentration parameter. If F exceeds 10 but is lower than, say, 34, then it is still very likely that the population concentration parameter is not large enough to exclude the 5% distortion when it comes to the one-sided t
test. In other words, even when F > 10 holds in empirical applications, the chance
that the distribution of the 2SLS estimator substantially differs from normality is still
large, which further suggests the size distortion of the t/Wald test is still likely to be
substantial4.
Note that the argument we adopt above is an approximation, i.e. we consider only
the leading term n−1/2p(c)φ(c) of the Edgeworth expansion. In a later part of this
paper, we present the Monte Carlo evidence which confirms the insufficiency of the
F > 10 rule for ruling out weak identification in Definition 1.
From the GMM perspective, the F test aims to examine the rank condition: in the linear IV model with a single endogenous regressor, the rank condition corresponds to the requirement that the correlation between the endogenous regressor and the instruments is non-zero. The approach of testing the rank condition for detecting weak identification, however, has some limitations when it is extended to the more general GMM framework. First of all, the rank test statistic in the non-linear GMM may depend on the weakly identified parameters, which cannot be consistently estimated (see, e.g., Wright (2003)). Secondly, to the best of our knowledge, it is not clear how large the rank test statistic of GMM needs to be in order to rule out weak identification.
To put it another way, the popular F > 10 rule of thumb for excluding weak instruments does not imply the size distortion of the t/Wald test can be controlled under
5%, as illustrated by the derivation above; furthermore, the natural extension of the F
test for examining the rank condition in the GMM framework is not readily available.
Given the importance of detecting weak identification in IV/GMM applications, and
the limitations of the rank test like F , a question naturally arises: can we have an
easy-to-use detection method, which is applicable to both IV and GMM applications?
This paper suggests that the bootstrap could be the tool.
4Stock and Yogo (2002) also suggest that F needs to exceed 16.38 in the just-identified case, in order to avoid 5% distortion of the Wald test at the 5% significance level, thus F > 10 does not imply the size distortion of t/Wald is less than 5%. Nevertheless, the F > 10 rule of thumb is still widely used in empirical studies.
3 Bootstrap
Since its introduction by Efron (1979), the bootstrap has become a practical tool for
conducting statistical inference, and its properties can be explained using the theory of
the Edgeworth expansion in Hall (1992). As an alternative to the limiting distribution,
the bootstrap approximates the distribution of a targeted statistic by resampling the
data. In addition, there is considerable evidence that the bootstrap performs better
than the first-order limiting distribution in finite samples. See, e.g., Horowitz (2001).
However, the bootstrap does not always work well. When IV/GMM models are
weakly identified, for instance, the bootstrap is known to provide a poor approximation
to the exact finite sample distribution of the conventional IV/GMM estimators, as
explained in Hall and Horowitz (1996). Nevertheless, the fact that the bootstrap fails
still conveys useful information for the purpose of this paper: when instruments are
weak, the bootstrap distribution of the 2SLS estimator is far away from normality; in
contrast, when instruments are not weak, the bootstrap distribution is a good proxy
for the exact distribution of the 2SLS estimator, both of which are close to normality.
Thus we can study the departure of the bootstrap distribution from normality to detect
the strength of identification, as illustrated by Figure 1 and Figure 2 above.
3.1 Residual Bootstrap
To detect whether weak identification in the sense of Definition 1 is present, we are interested in estimating the Kolmogorov-Smirnov distance. Since the exact distribution of the 2SLS estimator is unknown, we replace it with its bootstrap counterpart.
The residual bootstrap we use for the linear IV model is described as follows.
Step 1: U , V are the residuals induced by θn, Πn in the linear IV model:
U = Y −Xθn, V = X − ZΠn
Step 2: Recenter U , V to get U , V , so that they have mean zero and are orthogonal
to Z.
Step 3: Sample the rows of (U, V) and Z independently n times with replacement, and let (U∗, V∗) and Z∗ denote the outcomes, following the naming convention of the bootstrap.
Step 4: The dependent variables (X∗, Y∗) are generated by:

$$X^* = Z^*\Pi_n + V^*, \qquad Y^* = X^*\theta_n + U^*$$
As the counterpart of θn, the bootstrapped 2SLS estimator θ∗n is computed from the bootstrap resample (X∗, Y∗, Z∗):

$$\theta_n^* = \left[X^{*\prime}Z^*(Z^{*\prime}Z^*)^{-1}Z^{*\prime}X^*\right]^{-1} X^{*\prime}Z^*(Z^{*\prime}Z^*)^{-1}Z^{*\prime}Y^*$$
Under Assumption 1, θn is asymptotically normally distributed. Similarly, the bootstrapped θ∗n has the same asymptotic normal distribution. Also, the bootstrapped 2SLS estimator admits an Edgeworth expansion, which is implied by Theorem 2.2 in Hall (1992).
Theorem 3.1. Under the conditions in Theorem 2.1, the bootstrap version of the non-studentized t statistic admits the two-term Edgeworth expansion uniformly in c, −∞ < c < ∞:

$$P\left(\sqrt{n}\,\frac{\theta_n^* - \theta_n}{\sigma} \leq c \,\Big|\, \mathcal{X}_n\right) = \Phi(c) + n^{-1/2}p^*(c)\phi(c) + O(n^{-1})$$

where $\sigma^2 = \left(\Pi_n'\,\frac{Z'Z}{n}\,\Pi_n\right)^{-1}\frac{U'U}{n}$, $\mathcal{X}_n = (X, Y, Z)$ denotes the sample of observations, and $p^*(c)$ is the bootstrap counterpart of p(c).
3.2 Estimation
With the help of the bootstrap, we can now estimate the unknown Kolmogorov-Smirnov
distance KS by KS∗ below:
$$KS^* = \sup_{-\infty < c < \infty} \left| P\left(\sqrt{n}\,\frac{\theta_n^* - \theta_n}{\sigma} \leq c \,\Big|\, \mathcal{X}_n\right) - \Phi(c) \right|$$
To compute KS∗, we need $P(\sqrt{n}\,\frac{\theta_n^* - \theta_n}{\sigma} \leq c\,|\,\mathcal{X}_n)$, which can be derived by bootstrapping as follows. Re-do the residual bootstrap procedure B times to compute $\{\theta_n^{*i}, i = 1, \ldots, B\}$, where $\theta_n^{*i}$ denotes the ith 2SLS estimator by bootstrapping. By the strong law of large numbers, we have

$$\frac{1}{B}\sum_{i=1}^{B} 1\left(\sqrt{n}\,\frac{\theta_n^{*i} - \theta_n}{\sigma} \leq c\right) \xrightarrow{a.s.} P\left(\sqrt{n}\,\frac{\theta_n^* - \theta_n}{\sigma} \leq c \,\Big|\, \mathcal{X}_n\right)$$

Since we can make B sufficiently large, $P(\sqrt{n}\,\frac{\theta_n^* - \theta_n}{\sigma} \leq c\,|\,\mathcal{X}_n)$ is considered as given. Consequently, KS∗ is considered as given once the data is given.
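Putting Steps 1-4 and the KS∗ formula together, an end-to-end computation might be sketched as follows. The helper names, the simulated just-identified design with ρ = 0.99, and the choice B = 999 are illustrative assumptions, not the paper's implementation:

```python
import numpy as np
from scipy.stats import kstest

def tsls(X, Z, Y):
    """2SLS estimator of theta in Y = X*theta + U with instruments Z."""
    PZX = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)  # fitted first stage Z*Pi_n
    return float(PZX @ Y / (PZX @ X))

def ks_star(X, Z, Y, B=999, rng=None):
    """Residual-bootstrap KS* distance of the standardized 2SLS estimator."""
    rng = np.random.default_rng(rng)
    n = Z.shape[0]
    theta_n = tsls(X, Z, Y)
    Pi_n = np.linalg.solve(Z.T @ Z, Z.T @ X)
    U = Y - X * theta_n              # Step 1: structural residuals
    V = X - Z @ Pi_n                 # Step 1: reduced-form residuals
    U, V = U - U.mean(), V - V.mean()  # Step 2: recenter
    # sample-based standardization: sigma^2 = (Pi_n'(Z'Z/n)Pi_n)^{-1} U'U/n
    sigma = np.sqrt((U @ U / n) / (Pi_n @ (Z.T @ Z / n) @ Pi_n))
    t = np.empty(B)
    for b in range(B):
        i = rng.integers(n, size=n)  # Step 3: resample residual rows...
        j = rng.integers(n, size=n)  # ...and instrument rows, independently
        Zs, Us, Vs = Z[j], U[i], V[i]
        Xs = Zs @ Pi_n + Vs          # Step 4: regenerate (X*, Y*)
        Ys = Xs * theta_n + Us
        t[b] = np.sqrt(n) * (tsls(Xs, Zs, Ys) - theta_n) / sigma
    return kstest(t, "norm").statistic  # sup-distance to Phi

# illustrative just-identified design with severe endogeneity (rho = 0.99)
rng = np.random.default_rng(0)
n = 1000
e = rng.multivariate_normal([0, 0], [[1, 0.99], [0.99, 1]], size=n)
Z = rng.standard_normal((n, 1))
Y = e[:, 0]                      # theta = 0, so Y = U
X_s = Z[:, 0] * 0.6 + e[:, 1]    # strong: E(mu^2) = n Pi^2 = 360
X_w = Z[:, 0] * 0.01 + e[:, 1]   # weak:  E(mu^2) = 0.1

ks_s = ks_star(X_s, Z, Y, rng=1)
ks_w = ks_star(X_w, Z, Y, rng=1)
print(ks_s)  # close to normality: small
print(ks_w)  # far from normality: large
```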
For KS∗ to be a good proxy for KS, we want the bootstrap distribution to approximate well the exact distribution of √n(θn − θ)/σ. In fact, under the strong instrument asymptotics, both KS∗ and KS shrink to zero at the rate $n^{-1/2}$, and KS∗ estimates KS super-consistently, as stated by the theorem below.
Theorem 3.2. Under the conditions in Theorem 2.1⁵:

$$KS = O_p(n^{-1/2}), \qquad KS^* = O_p(n^{-1/2}), \qquad KS^* - KS = O_p(n^{-1})$$
The bootstrap-based KS∗ is thus the ideal proxy for KS under the strong instru-
ment asymptotics: as Theorem 3.2 states, both KS∗ and KS shrink to zero as the
sample size grows, and their difference is negligible.
Now a question arises: how may we account for the sampling distribution of KS∗
if we want to conduct statistical inference? The answer is again by bootstrapping.
The argument is simple and intuitive: just as we can use the bootstrap-based KS∗
as the proxy for KS, we can apply the same principle again to approximate KS∗ by
the bootstrap-based KS∗∗. In other words, we propose a double-bootstrap procedure
described below to construct the so-called bootstrap confidence interval (C.I.).
1. First bootstrap: compute KS∗ as explained above.
2. Second bootstrap: construct the bootstrap confidence interval of KS.
(a) Take a bootstrap resample from $\mathcal{X}_n$, call it $\mathcal{X}_n^1$;

(b) Repeat the first bootstrap by replacing $\mathcal{X}_n$ with $\mathcal{X}_n^1$, and we get

$$KS^{**1} = \sup_{-\infty < c < \infty} \left| P\left(\sqrt{n}\,\frac{\theta_n^{**} - \theta_n^{*1}}{\sigma^{*1}} \leq c \,\Big|\, \mathcal{X}_n^1\right) - \Phi(c) \right|$$

where $\theta_n^{**}$ is the bootstrapped 2SLS estimator computed from the resample taken from $\mathcal{X}_n^1$; $\theta_n^{*1}$, $\sigma^{*1}$ defined on $\mathcal{X}_n^1$ are the counterparts of θn, σ defined on $\mathcal{X}_n$.

(c) Repeatedly take $\mathcal{X}_n^2, \ldots, \mathcal{X}_n^B$ from $\mathcal{X}_n$, and compute $KS^{**2}, \ldots, KS^{**B}$. As a result, we can construct a $100(1-\alpha)\%$ bootstrap confidence interval of KS by, for example, taking the lower and upper α/2 quantiles of $KS^{**1}, \ldots, KS^{**B}$.
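Step (c) amounts to taking empirical quantiles of the second-level statistics. A minimal sketch, where `ks2` is a placeholder array standing in for computed KS∗∗ values (its contents here are synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
ks2 = rng.uniform(0.01, 0.12, size=999)  # placeholder for KS**1, ..., KS**B

alpha = 0.10
lo, hi = np.quantile(ks2, [alpha / 2, 1 - alpha / 2])
print(f"{100 * (1 - alpha):.0f}% bootstrap C.I. for KS: [{lo:.3f}, {hi:.3f}]")
```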
3.3 Weak Instrument
So far, we have shown that KS∗ is the ideal proxy for KS under strong instru-
ment asymptotics, and both KS and KS∗ will shrink to zero as the sample size
5The Op(·) orders presented in this paper are sharp rates.
increases. In this section, we show that under weak instruments, neither KS nor
KS∗ will shrink to zero, as both the exact distribution of the 2SLS estimator and the
bootstrap distribution are expected to be substantially different from normality under
weak instrument asymptotics. We show this result under Assumption 2 below, using
Hahn and Kuersteiner (2002)’s specification to model weak instruments.
Assumption 2. $\Pi = \Pi_0/n^\delta$, where $0 \leq \delta \leq \infty$.

When δ = 0, Assumption 2 reduces to the classic strong instrument asymptotics in Assumption 1. When δ = 1/2, Assumption 2 corresponds to the local-to-zero weak instrument asymptotics in Staiger and Stock (1997). When δ = ∞, there is no identification. In addition, when 0 < δ < 1/2, the identification is considered by Hahn and Kuersteiner (2002) and Antoine and Renault (2009) as nearly weak; when 1/2 < δ < ∞, the identification is treated as near non-identified in Hahn and Kuersteiner (2002). Overall, Assumption 2 corresponds to the drifting D.G.P. with asymptotically vanishing identification strength, which is weaker than the strong instruments case in Assumption 1.
Since the truncated Edgeworth expansion is known to provide a poor approximation under weak instruments, we rewrite the standardized 2SLS estimator θn in the manner of Rothenberg (1984):

$$\sqrt{n}\,\frac{\theta_n - \theta}{\sigma} = \frac{\sqrt{n\,\Pi_0'Q_{zz}\Pi_0}}{\sigma_u}(\theta_n - \theta) = \frac{\sigma_v}{\sigma_u}\,E(\mu^2)^{\frac{1}{2}}(\theta_n - \theta) = \frac{\zeta_u + \dfrac{S_{vu}}{E(\mu^2)^{\frac{1}{2}}}}{\dfrac{\mu^2}{E(\mu^2)} + \dfrac{2\zeta_v}{E(\mu^2)^{\frac{1}{2}}} + \dfrac{S_{vv}}{E(\mu^2)}}$$

where $\zeta_u \equiv \frac{\Pi'Z'U}{\sigma_u\sigma_v E(\mu^2)^{1/2}} \xrightarrow{d} N(0,1)$, $\zeta_v \equiv \frac{\Pi'Z'V}{\sigma_v^2 E(\mu^2)^{1/2}} \xrightarrow{d} N(0,1)$, $\frac{\mu^2}{E(\mu^2)} \xrightarrow{p} 1$, $S_{vu} \equiv \frac{V'Z(Z'Z)^{-1}Z'U}{\sigma_u\sigma_v} = O_p(1)$, and $S_{vv} \equiv \frac{V'Z(Z'Z)^{-1}Z'V}{\sigma_v^2} = O_p(1)$.
The rewriting above highlights the role of E(μ²) in affecting the departure of the standardized 2SLS estimator from normality: √n(θn − θ)/σ converges to the standard normal if E(μ²) diverges to infinity.

Under Assumption 2, we have:

$$E(\mu^2) = n^{1-2\delta}\,\frac{\Pi_0'Q_{zz}\Pi_0}{\sigma_v^2}$$
Thus E(μ²) accumulates to infinity iff 0 ≤ δ < 1/2, which is consistent with what is shown in Hahn and Kuersteiner (2002).
By the same logic as above, the bootstrap counterpart of E(μ²), denoted by E∗(μ∗²), affects the deviation of the bootstrap distribution from normality in the same manner; under Assumption 2, we have:

$$E^*(\mu^{*2}) = \frac{\Pi_n'Z'Z\Pi_n}{\sigma_v^{*2}} = \frac{\left[\frac{\Pi_0}{n^\delta} + (Z'Z)^{-1}Z'V\right]'Z'Z\left[\frac{\Pi_0}{n^\delta} + (Z'Z)^{-1}Z'V\right]}{V'V/n}$$

Thus E∗(μ∗²) similarly accumulates to infinity iff 0 ≤ δ < 1/2.
As illustrated above, the concentration parameter in the weak instrument literature
crucially affects the deviation of the distribution of 2SLS from the normal distribution,
and is thus used as a measure of the identification strength. For the 2SLS estima-
tor to be asymptotically normally distributed, the concentration parameter needs to
accumulate to infinity as the sample size grows.
In the theorem below, we summarize the difference between E(µ2) and E∗(µ∗2), in
order to compare the identification strength in the sample and the bootstrap resample.
In addition, we emphasize that neither E(µ2) nor E∗(µ∗2) accumulates to infinity under
weak instruments, which implies that neither KS nor KS∗ will shrink to zero.
Theorem 3.3. Under the specification in Assumption 2:

1. If 0 ≤ δ < 1/2,

$$E(\mu^2) = O_p(n^{1-2\delta}) \to \infty, \qquad E^*(\mu^{*2}) = O_p(n^{1-2\delta}) \to \infty, \qquad E^*(\mu^{*2}) - E(\mu^2) = o_p(1)$$

2. If δ = 1/2,

$$E(\mu^2) = \frac{\Pi_0'Q_{zz}\Pi_0}{\sigma_v^2}, \qquad E^*(\mu^{*2}) \xrightarrow{d} \frac{\Pi_0'Q_{zz}\Pi_0 + 2\Pi_0'\Psi_{zv} + \Psi_{zv}'Q_{zz}^{-1}\Psi_{zv}}{\sigma_v^2}, \qquad E^*(\mu^{*2}) - E(\mu^2) \xrightarrow{d} \frac{2\Pi_0'\Psi_{zv} + \Psi_{zv}'Q_{zz}^{-1}\Psi_{zv}}{\sigma_v^2}$$

3. If 1/2 < δ < ∞,

$$E(\mu^2) = o_p(1), \qquad E^*(\mu^{*2}) \xrightarrow{d} \chi_k^2, \qquad E^*(\mu^{*2}) - E(\mu^2) \xrightarrow{d} \chi_k^2$$
Theorem 3.3 conveys the following message. When instruments are not very weak (0 ≤ δ < 1/2), the difference between E∗(μ∗²) and E(μ²) is negligible. Hence the identification strength in the bootstrap resample well preserves the original identification strength in the sample. In contrast, when instruments are weak (1/2 ≤ δ < ∞), the bootstrap does not well preserve the original identification strength, as E∗(μ∗²) asymptotically differs from E(μ²), which helps explain why the bootstrap may not function well under weak instruments.
Although the bootstrap does not always preserve the original identification strength, i.e. the difference between E∗(μ∗²) and E(μ²) is not always negligible, Theorem 3.3 shows that it does preserve the pattern of identification: under strong instruments, both E∗(μ∗²) and E(μ²) accumulate to infinity, hence the bootstrap distribution is expected to be close to the normal distribution; under weak instruments, neither E∗(μ∗²) nor E(μ²) accumulates to infinity, hence the bootstrap distribution is expected to be far from the normal distribution. As a result, if the bootstrap distribution is close to the normal distribution, then it can be used as evidence for strong instruments; in contrast, if the bootstrap distribution is far from normality, then weak instruments are likely to be present.
Note that the mean of the asymptotic difference between E∗(μ∗²) and E(μ²) is equal to k when 1/2 ≤ δ < ∞. Thus the bootstrap identification strength appears slightly stronger than the original one for small k's. In the just-identified case that
is most relevant in empirical studies, Theorem 3.3 suggests that the use of the boot-
strapped identification strength to detect weak instruments does not exaggerate the ac-
tual weak identification problem. Since we focus on the Stock and Yogo (2005) model,
where the number of instruments k is assumed to be fixed, we do not discuss large
values of k, which are typically dealt with by many instruments asymptotics in, e.g.,
Chao and Swanson (2005). In fact, under many instruments asymptotics, the 2SLS
estimator can be asymptotically normal under weak instruments, hence the proposed
bootstrap method in this paper does not apply to the many instruments setup.
3.4 A Simple Rule
Based on the discussion above, the use of KS∗ to evaluate the identification strength appears reasonable: under strong identification, KS∗ shrinks to zero,
as stated in Theorem 3.2; while under weak identification, KS∗ is expected to be large,
as stated in Theorem 3.3, which shows that the identification strength in the boot-
strap resample does not accumulate as the sample size increases, making the bootstrap
distribution substantially different from normality.
Associated with the point estimate KS∗, the bootstrap C.I. can be constructed
by the double-bootstrap procedure described in Section 3.2. Reporting KS∗ and its
associated C.I. would help researchers evaluate how serious the weak identification
problem might be.
For practical purposes, however, a simple rule of thumb for detecting weak identification in Definition 1 may be of more interest: if KS∗ > 0.05, then we may roughly consider identification as weak. The reason is that, under the null of strong instruments, KS∗ is super-consistent for KS. Thus for simplicity, we may ignore the sampling distribution of KS∗ and consider KS∗ ≈ KS.
Although KS∗ > 0.05 does not necessarily imply that weak identification in the sense of Definition 1 takes place, it does suggest that the identification strength is not very strong. In this
case, the robust inference methods in Anderson and Rubin (1949), Stock and Wright
(2000), Kleibergen (2002)(2005), Moreira (2003), etc., which can produce confidence
intervals/sets with the correct coverage regardless of the strength of identification, are
recommended; furthermore, the conventional point estimates from 2SLS are no longer
reliable.
To summarize, the bootstrap distribution of the 2SLS estimator, together with the
simple quantitative rule of thumb that KS∗ > 0.05, can serve as a diagnostic tool for
detecting weak instruments: if instruments are weak, the bootstrap distribution of the
2SLS estimator is expected to be far from normality, and the distance can be quantified
by the KS∗ statistic; on the other hand, the bootstrap distribution is close to normality
when instruments are not weak. A large value of KS∗ thus signals weak identification
strength.
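To make the summary concrete, the diagnostic can be sketched in Python as below. This is a simplified illustration, not the paper's own code: the helper name `ks_star` is ours, the residual bootstrap is specialized to the just-identified single-instrument case, and the KS distance is computed on the standardized bootstrap draws with `scipy.stats.kstest`.

```python
import numpy as np
from scipy import stats

def ks_star(y, x, z, n_boot=2000, seed=0):
    """Residual-bootstrap KS* diagnostic for a just-identified, single-
    instrument linear IV model (a simplified sketch, not the paper's code)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    theta_hat = (z @ y) / (z @ x)         # 2SLS = simple IV estimate
    pi_hat = (z @ x) / (z @ z)            # first-stage OLS estimate
    u_hat = y - x * theta_hat             # structural residuals
    v_hat = x - z * pi_hat                # first-stage residuals
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample residual pairs jointly
        x_b = z * pi_hat + v_hat[idx]
        y_b = x_b * theta_hat + u_hat[idx]
        draws[b] = (z @ y_b) / (z @ x_b)
    # KS distance between the standardized bootstrap draws and N(0, 1)
    s = (draws - draws.mean()) / draws.std()
    return stats.kstest(s, "norm").statistic
```

Under a strong instrument the returned value should be close to zero, while under a weak instrument it typically lies well above the 0.05 cutoff.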
4 Monte Carlo Evidence and Application
The Monte Carlo (MC) evidence and an illustrative application for the proposed boot-
strap approach are presented in this section.
4.1 Monte Carlo
In the data generating process, we employ the just-identified linear IV model below:
\[
Y_i = X_i \cdot 0 + U_i, \qquad X_i = Z_i \cdot \Pi + V_i,
\]
\[
\begin{pmatrix} U_i \\ V_i \end{pmatrix} \sim NID\!\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right), \qquad Z_i \sim NID(0, 1), \qquad i = 1, \ldots, 1000.
\]
We set ρ = 0.99 and ρ = 0.50 for severe and medium endogeneity, respectively, and
\[
\Pi = \sqrt{\frac{E(\mu^2)}{1000}}
\]
in order to control the strength of identification.
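The data generating process above can be sketched in Python as follows. The function name `simulate_iv` is ours; with Qzz = 1 in this design, setting Π = sqrt(E(µ²)/n) fixes the population concentration parameter n·Π² at the requested value.

```python
import numpy as np

def simulate_iv(n=1000, mu2=35.0, rho=0.99, seed=0):
    """Draw one sample from the just-identified Monte Carlo design
    (hypothetical helper; Qzz = 1 so E(mu^2) = n * Pi^2)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    u, v = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    pi = np.sqrt(mu2 / n)    # first-stage slope implied by E(mu^2)
    x = z * pi + v
    y = x * 0.0 + u          # true structural coefficient is 0
    return y, x, z
```

For large E(µ²) the instrument is strongly correlated with the regressor; for E(µ²) near zero the correlation is negligible.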
In Table 1, corresponding to each population concentration parameter E(µ2) in
Column 1, we report the true KS distance between the standardized 2SLS estimator
and the standard normal variate in Column 2. Columns 3-5 present the sample median,
mean and standard deviation of the bootstrap-based KS∗, respectively. Columns 6-7
present the frequency of excluding weak instruments based on two rules of thumb side
by side: the commonly used F > 10 rule and the KS∗ ≤ 0.05 rule suggested in this
paper.
Column 2 shows that as E(µ²) gets larger, KS becomes smaller. KS also decreases
when endogeneity is less severe. In particular, when ρ = 0.99 and E(µ²) = 35, KS ≈ 0.05,
which is consistent with the Edgeworth expansion argument we adopt to show
the insufficiency of F > 10 for excluding weak identification in Definition 1. Together
with Column 1, Column 2 highlights that the concentration parameter is a reasonable
measure of identification strength.
Columns 3-5 indicate that KS∗ approximates KS well, especially when E(µ²) is
large. This is consistent with the earlier argument that under strong instruments
KS∗ is a good proxy for KS, with negligible sampling variation. When E(µ²)
is small, the standard error of KS∗ is sizable, but its mean and median remain
reasonably large.
Columns 6-7 show that the KS∗ ≤ 0.05 rule works as expected: it detects weak
identification properly, and it is stricter than the F > 10 rule when ρ = 0.99. In
contrast, when ρ = 0.50, these two rules of thumb are comparable. One difference
between the F > 10 rule and the KS∗ ≤ 0.05 rule is the following: the F test is
independent of the degree of endogeneity, hence its rejection frequency is unaffected
by ρ; in contrast, when ρ is large, the KS∗ ≤ 0.05 rule becomes much stricter than the
F test, and when ρ is small, the KS∗ ≤ 0.05 rule appears milder.
Figure 3, drawn from the numerical results of the Monte Carlo study, again
illustrates the main idea of this paper: the departure of the bootstrap
distribution from normality conveys the strength of identification. In particular, as
identification strength measured by the concentration parameter increases, the KS∗
distance shrinks to zero. When identification is strong, KS∗ approximates KS very
well, as indicated by the short confidence interval. When identification is weak, similar
to KS, KS∗ is also very large, although KS∗ does not approximate KS well, as
indicated by the wide confidence interval. Overall, the value of KS∗ conveys the
strength of identification. When endogeneity is severe (top panel), the imposed 5%
bar implies that the concentration parameter needs to exceed 34; in contrast, when
endogeneity is medium (bottom panel), the 5% bar requires a much lower value for the
concentration parameter, which is still above 10. Hence the widely used F > 10 rule
does not appear sufficient to exclude the type of weak identification in Definition 1.
4.2 Application
The Monte Carlo evidence above shows that the use of KS∗ to evaluate the strength
of identification is proper, and that the simple KS∗ ≤ 0.05 rule for excluding weak
identification works reasonably well in the simulation studies. In this subsection, we
employ an empirical application to further illustrate the proposed bootstrap approach,
in particular its graphical nature.
The same data as in Card (1995) is used. By employing the IV approach, Card
(1995) answers the following question: what is the return to education? Or specifically,
how much more can an individual earn if he/she completes an extra year of schooling?
The dataset is drawn from the National Longitudinal Survey of Young
Men, 1966-1981, with 3010 observations. Two variables in the dataset measure
college proximity: nearc2 and nearc4, dummy variables equal to 1 if there is a
2-year or 4-year college, respectively, in the local area. See Card (1995) for a
detailed description of the data. To identify the return to education, Card (1995)
considers the following structural wage equation:
\[
lwage = \alpha + \theta \cdot edu + W'\beta + u
\]
where lwage is the log of wage; edu is the years of schooling; the covariate vector
W contains the control variables; u is the error term; and edu is instrumented by
nearc2 and nearc4. Among the parameters (α, θ, β′), the coefficient θ, measuring the
return to education, is of interest.
In the basic specification, Card (1995) uses five control variables: experience, the
square of experience, a dummy for race, a dummy for living in the south, and a
dummy for living in the standard metropolitan statistical area (SMSA). To bypass
the issue that experience is also endogenous, Davidson and MacKinnon (2010) replace
experience and the square of experience with age and the square of age. Following
Davidson and MacKinnon (2010), we use age, square of age, and the three dummy
variables as control variables. Hence edu is the only endogenous regressor. While
Davidson and MacKinnon (2010) simultaneously use more than one instrument, we
use the two instruments, nearc2 and nearc4, one by one as the single instrument for
edu to illustrate the proposed bootstrap approach.
To get started, we examine the strength of the instruments by the first-stage F test in
Stock and Yogo (2005): if nearc2 is used as the single IV for edu, F = 0.54; if nearc4 is
used as the single IV for edu, F = 10.52. According to the rule of thumb F > 10, these
two F statistics suggest that nearc4 is a strong instrumental variable, while nearc2
is weak. Table 2 reports the point estimate and 95% confidence interval of return to
education derived by 2SLS using nearc2 and nearc4 one by one as the instrument. In
addition, the identification-robust conditional likelihood ratio (CLR)6 test by Moreira
(2003) is also applied to construct a C.I. as a benchmark: under the weaker nearc2,
CLR produces a much wider C.I. than the one produced by inverting the t/Wald test;
under the stronger nearc4, these two C.I.’s are getting closer, but the difference still
appears substantial, which suggests that nearc4 is not very strong.
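As a point of reference, inverting the AR test in this just-identified, single-instrument setting can be sketched as follows. This is a simplified homoskedastic version over a hypothetical parameter grid (the helper name and grid range are ours), not the exact implementation behind Table 2.

```python
import numpy as np
from scipy import stats

def ar_confidence_set(y, x, z, level=0.95, grid=None):
    """Confidence set for theta by inverting the Anderson-Rubin test
    (simplified homoskedastic, single-instrument sketch)."""
    n = len(y)
    if grid is None:
        grid = np.linspace(-2.0, 2.0, 2001)  # hypothetical search range
    crit = stats.chi2.ppf(level, df=1)
    zz = z @ z
    keep = []
    for t in grid:
        e = y - x * t                        # residual at hypothesized theta
        num = (z @ e) ** 2 / zz              # variation explained by the instrument
        denom = (e @ e - num) / (n - 1)      # residual variance estimate
        if num / denom <= crit:              # AR statistic vs chi2(1) critical value
            keep.append(t)
    return np.array(keep)
```

Because the test statistic, not the estimator, is pivotal, the resulting set retains correct coverage however weak the instrument is; with a weak instrument the set can be very wide or even unbounded, as in the nearc2 column of Table 2.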
Now we use the bootstrap approach to evaluate the strength of identification. To
do so, we compute the 2SLS estimator of return to education by the residual bootstrap
2000 times. After standardization, we plot the p.d.f., c.d.f. and Q-Q plot of the
bootstrapped 2SLS estimator against the standard normal in Figure 2.
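A minimal sketch of how such a Q-Q comparison can be computed is given below (the plotting itself is omitted; `qq_points` is a hypothetical helper returning the Q-Q coordinates of the standardized draws against the standard normal).

```python
import numpy as np
from scipy import stats

def qq_points(draws, probs=None):
    """Q-Q coordinates of standardized draws against N(0,1) (our helper).

    Points far from the 45-degree line indicate departure from normality,
    which in this context hints at weak identification."""
    if probs is None:
        probs = np.linspace(0.01, 0.99, 99)
    s = (draws - draws.mean()) / draws.std()   # standardize the draws
    return stats.norm.ppf(probs), np.quantile(s, probs)
```

Feeding the bootstrap draws of the 2SLS estimator to this helper and plotting the two returned coordinate arrays against each other reproduces the kind of Q-Q panel shown in Figure 2.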
nearc2 as IV (left panel): Just by eyeballing the left panel in Figure 2, it is obvious
that the bootstrap distribution is far from normality, suggesting that nearc2 is possibly
a weak instrument. In addition, KS∗ = 0.2270, which, according to the proposed rule
of thumb, flags nearc2 as a weak instrument. Furthermore, the 90% C.I. for KS
constructed by the double-bootstrap procedure is found to be (0.0724, 0.4227), which
contains no points below 0.05. All of these results consistently point to nearc2 being
a weak instrument.
nearc4 as IV (right panel): The bootstrap distribution does not appear too far
from normality by eyeballing the right panel in Figure 2. However, KS∗ = 0.0578,
slightly above 0.05, indicating that the identification strength is not very strong.
The bootstrapped C.I. is (0.0217, 0.0877), containing points both above and below 0.05.
Consequently, nearc4 is deemed stronger than nearc2, but not very strong, which is
consistent with the F test.

Footnote 6: The IV model under consideration is just identified, hence CLR is equivalent
to the AR test in Anderson and Rubin (1949) and the K test in Kleibergen (2002).
As illustrated by this empirical example, the bootstrap provides empirical researchers
with a graphical view of the identification strength, while the F test does not. By
eyeballing the p.d.f., c.d.f. or Q-Q plot of the bootstrap distribution, researchers can
easily evaluate how strong or weak the identification is. If the graph or the computed
KS∗ shows evidence of weak identification, the robust inference methods in
Anderson and Rubin (1949), Stock and Wright (2000), Kleibergen (2002, 2005) and
Moreira (2003) should be applied. On the other hand, if the graph or the computed
KS∗ shows little evidence of weak identification, we may proceed with conventional
inference methods like 2SLS.
5 Conclusion
This paper suggests using the bootstrap as a tool for detecting weak identification
in IV/GMM applications. The main message is that the distance between the boot-
strap distribution of the conventional IV/GMM estimator (e.g., 2SLS) and the normal
distribution conveys information about the strength of identification. Empirical re-
searchers can easily detect weak identification by eyeballing the bootstrap distribution
to examine whether it is close to normality.
Besides the graphical advantage, another appealing feature of the proposed boot-
strap approach is that it can be extended to various IV/GMM settings. Bootstrap
methods are readily available to construct the bootstrap distribution of IV/GMM es-
timators, as shown in Horowitz (2001). The bootstrap distribution in the general
IV/GMM settings is similar to that in the specific IV model discussed in this paper:
under strong identification, the bootstrap distribution of the conventional IV/GMM
estimator is expected to be close to normality (see Hahn (1996)); in contrast, under
weak identification, the bootstrap distribution is expected to be far from normality
(see Hall and Horowitz (1996)). Hence it is generally feasible to rely on the graphical
bootstrap distribution to detect weak identification in IV/GMM applications.
The formal discussion of this paper is conducted under the linear IV model with one
endogenous variable and homoscedasticity, following Stock and Yogo (2005). The widely
used F test and the F > 10 rule of thumb in Staiger and Stock (1997) are employed
for comparison with our proposed quantitative rule of thumb: the Kolmogorov-Smirnov
distance between the bootstrap distribution and the normal distribution should not
exceed 0.05. Monte Carlo experiments show that the proposed bootstrap approach as
well as the quantitative rule is useful for detecting weak instruments. An illustrative
empirical application is also included.
We hope the proposed bootstrap-based diagnostic tool, being intuitive, graphical and
universally applicable, can help empirical researchers easily decide whether weak
identification is a concern in their IV/GMM applications.
References

Anderson, T. and Rubin, H. 1949. Estimation of the parameters of a single equation in a complete system of stochastic equations. The Annals of Mathematical Statistics, pages 46-63.

Antoine, B. and Renault, E. 2009. Efficient GMM with nearly-weak instruments. The Econometrics Journal, 12:S135-S171.

Bhattacharya, R. and Ghosh, J. 1978. On the validity of the formal Edgeworth expansion. Annals of Statistics, 6(2):434-451.

Bound, J., Jaeger, D., and Baker, R. 1995. Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. Journal of the American Statistical Association, pages 443-450.

Bravo, F., Escanciano, J., and Otsu, T. 2009. Testing for identification in GMM under conditional moment restrictions. Working paper.

Card, D. 1995. Using geographic variation in college proximity to estimate the return to schooling. In Aspects of Labour Market Behaviour: Essays in Honour of John Vanderkamp, ed. L.N. Christofides, E.K. Grant, and R. Swidinsky.

Chao, J. and Swanson, N. 2005. Consistent estimation with a large number of weak instruments. Econometrica, 73(5):1673-1692.

Davidson, R. and MacKinnon, J. 2010. Wild bootstrap tests for IV regression. Journal of Business and Economic Statistics, 28(1):128-144.

Efron, B. 1979. Bootstrap methods: another look at the jackknife. The Annals of Statistics, pages 1-26.

Hahn, J. 1996. A note on bootstrapping generalized method of moments estimators. Econometric Theory, 12(1):187-197.

Hahn, J. and Hausman, J. 2002. A new specification test for the validity of instrumental variables. Econometrica, 70(1):163-189.

Hahn, J. and Kuersteiner, G. 2002. Discontinuities of weak instrument limiting distributions. Economics Letters, 75(3):325-331.

Hall, P. 1992. The Bootstrap and Edgeworth Expansion. Springer Verlag.

Hall, P. and Horowitz, J. 1996. Bootstrap critical values for tests based on generalized method of moments estimators. Econometrica, 64(4):891-916.

Hansen, L. 1982. Large sample properties of generalized method of moments estimators. Econometrica, pages 1029-1054.

Horowitz, J. L. 2001. The bootstrap. Handbook of Econometrics, 5:3159-3228.

Inoue, A. and Rossi, B. 2011. Testing for weak identification in possibly nonlinear models. Journal of Econometrics, 161(2):246-261.

Kleibergen, F. 2002. Pivotal statistics for testing structural parameters in instrumental variables regression. Econometrica, pages 1781-1803.

Kleibergen, F. 2005. Testing parameters in GMM without assuming that they are identified. Econometrica, pages 1103-1123.

Kleibergen, F. and Paap, R. 2006. Generalized reduced rank tests using the singular value decomposition. Journal of Econometrics, 133(1):97-126.

Moreira, M. 2003. A conditional likelihood ratio test for structural models. Econometrica, pages 1027-1048.

Moreira, M., Porter, J., and Suarez, G. 2009. Bootstrap validity for the score test when instruments may be weak. Journal of Econometrics, 149(1):52-64.

Nelson, C. and Startz, R. 1990. Some further results on the exact small sample properties of the instrumental variable estimator. Econometrica, pages 967-976.

Rothenberg, T. 1984. Approximating the distributions of econometric estimators and test statistics. Handbook of Econometrics, 2:881-935.

Staiger, D. and Stock, J. 1997. Instrumental variables regression with weak instruments. Econometrica, pages 557-586.

Stock, J. and Wright, J. 2000. GMM with weak identification. Econometrica, pages 1055-1096.

Stock, J., Wright, J., and Yogo, M. 2002. A survey of weak instruments and weak identification in generalized method of moments. Journal of Business and Economic Statistics, 20(4):518-529.

Stock, J. and Yogo, M. 2002. Testing for weak instruments in linear IV regression. NBER Working Paper.

Stock, J. and Yogo, M. 2005. Testing for weak instruments in linear IV regression. In Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg, ed. D.W. Andrews and J.H. Stock, pages 80-108.

Wooldridge, J. 2002. Econometric Analysis of Cross Section and Panel Data. MIT Press.

Wright, J. 2002. Testing the null of identification in GMM. International Finance Discussion Papers, 732.

Wright, J. 2003. Detecting lack of identification in GMM. Econometric Theory, 19(02):322-330.
Appendix
Proof. (Corollary 2.2) Apply Theorem 2.2 in Hall (1992) to compute the following
items:
\[
A_1 = -\frac{\rho}{\sqrt{\Pi_0' Q_{zz} \Pi_0 / \sigma_v^2}}, \qquad
A_2 = \frac{-6\rho}{\sqrt{\Pi_0' Q_{zz} \Pi_0 / \sigma_v^2}},
\]
\[
p(c) = -A_1 - \frac{1}{6} A_2 (c^2 - 1) = \frac{\rho}{\sqrt{\Pi_0' Q_{zz} \Pi_0 / \sigma_v^2}}\, c^2.
\]
Proof. (Theorem 3.2) Recall the definitions
\[
KS^* = \sup_{-\infty<c<\infty} \left| P\!\left(\sqrt{n}\,\frac{\theta^*_n - \theta_n}{\sigma} \le c \,\middle|\, \mathcal{X}_n\right) - \Phi(c) \right|,
\qquad
KS = \sup_{-\infty<c<\infty} \left| P\!\left(\sqrt{n}\,\frac{\theta_n - \theta}{\sigma} \le c\right) - \Phi(c) \right|.
\]
By the triangle inequality:
\[
KS^* - KS \le \sup_{-\infty<c<\infty} \left| P\!\left(\sqrt{n}\,\frac{\theta^*_n - \theta_n}{\sigma} \le c \,\middle|\, \mathcal{X}_n\right) - P\!\left(\sqrt{n}\,\frac{\theta_n - \theta}{\sigma} \le c\right) \right|.
\]
To complete the proof, we only need to show $p^*(c) = p(c) + O_p(n^{-1/2})$, since:
\[
P\!\left(\sqrt{n}\,\frac{\theta^*_n - \theta_n}{\sigma} \le c \,\middle|\, \mathcal{X}_n\right) = \Phi(c) + n^{-1/2} p^*(c)\phi(c) + O(n^{-1}),
\]
\[
P\!\left(\sqrt{n}\,\frac{\theta_n - \theta}{\sigma} \le c\right) = \Phi(c) + n^{-1/2} p(c)\phi(c) + O(n^{-1}),
\]
\[
P\!\left(\sqrt{n}\,\frac{\theta^*_n - \theta_n}{\sigma} \le c \,\middle|\, \mathcal{X}_n\right) - P\!\left(\sqrt{n}\,\frac{\theta_n - \theta}{\sigma} \le c\right) = n^{-1/2}\left[p^*(c) - p(c)\right]\phi(c) + O(n^{-1}).
\]
For the random vector $R_i$, define its population mean $\mu$ and its sample mean $\bar{R}$:
\[
R_i = (X_i Z_i', \; Y_i Z_i', \; \mathrm{vec}(Z_i Z_i')')',
\]
\[
\mu = E(R_i) = (\Pi' Q_{zz}, \; \Pi' Q_{zz}\theta, \; \mathrm{vec}(Q_{zz})')',
\]
\[
\bar{R} = \frac{1}{n}\sum_{i=1}^{n} R_i = \frac{1}{n}\,(X'Z, \; Y'Z, \; \mathrm{vec}(Z'Z)')'.
\]
Define a function $A : \mathbb{R}^{2k+k^2} \to \mathbb{R}$ that operates on a vector of the type of $R_i$:
\[
A(R_i) = \frac{\left[X_i Z_i' (Z_i Z_i')^{-1} Z_i X_i\right]^{-1} X_i Z_i' (Z_i Z_i')^{-1} Z_i Y_i - \theta}{\sigma},
\]
\[
A(\bar{R}) = \frac{\left[X'Z (Z'Z)^{-1} Z'X\right]^{-1} X'Z (Z'Z)^{-1} Z'Y - \theta}{\sigma}.
\]
Note that
\[
A(\mu) = 0, \qquad \sqrt{n}\,\frac{\theta_n - \theta}{\sigma} = \sqrt{n}\,A(\bar{R}).
\]
For the vector $R_i$, put:
\[
\mu_{i_1,\ldots,i_j} = E\left[(R_i - \mu)^{(i_1)} \cdots (R_i - \mu)^{(i_j)}\right],
\]
where $(R_i - \mu)^{(i_j)}$ denotes the $i_j$-th element of $R_i - \mu$. Thus $\mu_{i_1,\ldots,i_j}$ is a centered
moment of the elements of $R_i$.

For the function $A$, put:
\[
a_{i_1,\ldots,i_j} = \frac{\partial^j}{\partial x^{(i_1)} \cdots \partial x^{(i_j)}} A(x)\Big|_{x=\mu},
\]
where $x^{(i_j)}$ is the $i_j$-th element of $x$. Thus $a_{i_1,\ldots,i_j}$ is a derivative of $A$ evaluated
at $\mu$.

The following results follow from Theorem 2.2 in Hall (1992):
\[
p(c) = -A_1 - \frac{1}{6} A_2 (c^2 - 1),
\]
\[
A_1 = \frac{1}{2} \sum_{i=1}^{2k+k^2} \sum_{j=1}^{2k+k^2} a_{ij}\,\mu_{ij},
\]
\[
A_2 = \sum_{i=1}^{2k+k^2} \sum_{j=1}^{2k+k^2} \sum_{l=1}^{2k+k^2} a_i a_j a_l\,\mu_{ijl}
+ 3 \sum_{i=1}^{2k+k^2} \sum_{j=1}^{2k+k^2} \sum_{l=1}^{2k+k^2} \sum_{m=1}^{2k+k^2} a_i a_j a_{lm}\,\mu_{il}\,\mu_{jm}.
\]
Similarly, for the bootstrap counterparts, we have:
\[
p^*(c) = -A^*_1 - \frac{1}{6} A^*_2 (c^2 - 1),
\]
\[
A^*_1 = \frac{1}{2} \sum_{i=1}^{2k+k^2} \sum_{j=1}^{2k+k^2} a^*_{ij}\,\mu^*_{ij},
\]
\[
A^*_2 = \sum_{i=1}^{2k+k^2} \sum_{j=1}^{2k+k^2} \sum_{l=1}^{2k+k^2} a^*_i a^*_j a^*_l\,\mu^*_{ijl}
+ 3 \sum_{i=1}^{2k+k^2} \sum_{j=1}^{2k+k^2} \sum_{l=1}^{2k+k^2} \sum_{m=1}^{2k+k^2} a^*_i a^*_j a^*_{lm}\,\mu^*_{il}\,\mu^*_{jm},
\]
where
\[
R^*_i = (X^*_i Z^{*\prime}_i, \; Y^*_i Z^{*\prime}_i, \; \mathrm{vec}(Z^*_i Z^{*\prime}_i)')',
\]
\[
\mu^* = E^*(R^*_i) = \left(\Pi_n' \frac{Z'Z}{n}, \; \Pi_n' \frac{Z'Z}{n}\theta_n, \; \mathrm{vec}\!\left(\frac{Z'Z}{n}\right)'\right)',
\]
\[
\mu^*_{i_1,\ldots,i_j} = E^*\left[(R^*_i - \mu^*)^{(i_1)} \cdots (R^*_i - \mu^*)^{(i_j)}\right],
\]
\[
a^*_{i_1,\ldots,i_j} = \frac{\partial^j}{\partial x^{(i_1)} \cdots \partial x^{(i_j)}} A(x)\Big|_{x=\mu^*}.
\]
Note that if we can show $a^*_{i_1,\ldots,i_j} \xrightarrow{p} a_{i_1,\ldots,i_j}$ and $\mu^*_{i_1,\ldots,i_j} = \mu_{i_1,\ldots,i_j} + O_p(n^{-1/2})$
respectively, then it follows that $A^*_i = A_i + O_p(n^{-1/2})$, $i = 1, 2$, and consequently
$p^*(c) = p(c) + O_p(n^{-1/2})$.

For $a^*_{i_1,\ldots,i_j}$: under Assumption 1,
\[
\mu^* = \left(\Pi_n' \frac{Z'Z}{n}, \; \Pi_n' \frac{Z'Z}{n}\theta_n, \; \mathrm{vec}\!\left(\frac{Z'Z}{n}\right)'\right)' \xrightarrow{p} \mu = (\Pi' Q_{zz}, \; \Pi' Q_{zz}\theta, \; \mathrm{vec}(Q_{zz})')';
\]
hence, by the continuous mapping theorem, $a^*_{i_1,\ldots,i_j} \xrightarrow{p} a_{i_1,\ldots,i_j}$.

For $\mu^*_{i_1,\ldots,i_j}$, apply the central limit theorem for the sample mean, provided the
moments exist. Since the bootstrap expectation is a sample average over the observed
data,
\[
\mu^*_{i_1,\ldots,i_j} = E^*\left[(R^*_i - \mu^*)^{(i_1)} \cdots (R^*_i - \mu^*)^{(i_j)}\right]
= \frac{1}{n}\sum_{i=1}^{n} (R_i - \mu^*)^{(i_1)} \cdots (R_i - \mu^*)^{(i_j)}
= \mu_{i_1,\ldots,i_j} + O_p(n^{-1/2}).
\]
Table 1: Monte Carlo

ρ = 0.99
                         KS∗                    Rule of Thumb
E(µ²)     KS      median   mean    s.d.    KS∗ ≤ 0.05    F > 10
1        0.159    0.151   0.212   0.126       0.0%        1.7%
5        0.110    0.108   0.122   0.065       0.0%       17.9%
10       0.089    0.091   0.092   0.027       0.4%       49.3%
20       0.064    0.067   0.069   0.016       6.6%       90.5%
30       0.053    0.055   0.057   0.011      27.8%       99.1%
35       0.049    0.052   0.053   0.009      43.2%       99.8%
40       0.046    0.048   0.049   0.008      57.6%      100.0%
50       0.041    0.044   0.044   0.007      81.6%      100.0%
60       0.037    0.040   0.041   0.006      93.8%      100.0%
100      0.029    0.032   0.032   0.004     100.0%      100.0%
10000    0.003    0.009   0.009   0.003     100.0%      100.0%

ρ = 0.50
                         KS∗                    Rule of Thumb
E(µ²)     KS      median   mean    s.d.    KS∗ ≤ 0.05    F > 10
1        0.098    0.100   0.169   0.128       4.7%        1.5%
5        0.078    0.076   0.090   0.072      21.3%       17.8%
10       0.059    0.058   0.062   0.034      38.4%       49.7%
20       0.040    0.041   0.044   0.018      69.1%       90.2%
30       0.032    0.033   0.035   0.013      88.8%       98.9%
35       0.029    0.030   0.032   0.011      94.3%       99.7%
40       0.027    0.028   0.030   0.009      96.9%       99.9%
50       0.024    0.025   0.026   0.008      99.4%      100.0%
60       0.022    0.023   0.024   0.006      99.9%      100.0%
100      0.016    0.018   0.019   0.004     100.0%      100.0%
10000    0.002    0.008   0.009   0.003     100.0%      100.0%

Note: E(µ²) is the population concentration parameter; KS is the Kolmogorov-Smirnov
distance between the standardized 2SLS estimator and the standard normal variate; KS∗ is
the bootstrap approximation of KS, whose median, mean and s.d. are reported; the
frequencies of KS∗ ≤ 0.05 and F > 10 are reported in the last two columns; ρ measures the
degree of endogeneity.
Figure 3: KS, KS∗, C.I. and the 5% bar
[Two panels plot KS (solid line), the median of KS∗ (dotted line) and its C.I. (dashed
lines) against the concentration parameter, ranging from 0 to 100 on the horizontal axis
and from 0.0 to 0.5 on the vertical axis. Top panel: ρ = 0.99; bottom panel: ρ = 0.50.]
Table 2: Return to education in Card (1995)
IV nearc2 nearc4
first-stage F statistic 0.54 10.52
2SLS estimate θn 0.5079 0.0936
95% C.I. by t/Wald (-0.813, 1.828) (-0.004, 0.191)
95% C.I. by AR/CLR/K (−∞,−0.1750] ∪ [0.0867,+∞) [0.0009, 0.2550]
Note: This table presents the estimate θn and confidence interval for return to education
using the data in Card (1995). The first-stage F statistic is reported for the two
instrumental variables, nearc2 and nearc4, which are used one by one for the endogenous
years of schooling. The included control variables are age, square of age, and dummy
variables for race, living in the south and living in SMSA.