
The Mundlak approach in the spatial Durbin panel data model

Nicolas Debarsy

CERPE, University of Namur

Rempart de la vierge 8, 5000 Namur

E-mail: [email protected]

Abstract

This paper extends the Mundlak approach to the spatial Durbin panel data model (SDM) to help the applied researcher to determine the adequacy of the random effects specification in this setup. We propose a likelihood ratio (LR) test that assesses the significance of the correlation between regressors and individual effects. Once the correlation with individual effects has been modeled through an auxiliary regression, the random effects specification provides consistent estimators and the effect of time-constant variables can be estimated. Some Monte Carlo simulations study the properties of this proposed LR test in small samples and show that in some cases, it has a better behavior than the Hausman test. We finally illustrate the usefulness of the extended Mundlak approach by estimating a house price model where some of the price determinants are time-constant. We show that ignoring the endogeneity of regressors with respect to individual effects leads to unreliable estimated parameters while results obtained using the Mundlak approach and the fixed effects specification are similar (concerning time-varying variables), implying that correlation between regressors and individual effects is well captured.

Keywords: Spatial autocorrelation, Panel data model, Random effects, Mundlak approach, House price model

JEL: C12; C21; C23; C52

Nicolas Debarsy is Fellow researcher of the F.R.S.-FNRS and gratefully acknowledges their financial support.

1 Introduction

Spatial autoregressive panel data models are of high interest since they allow capturing individual, temporal and interactive heterogeneity. The first two types of heterogeneity come from individual or time characteristics and are easily dealt with using either a fixed or random effects panel data specification. Interactive heterogeneity is due to differentiated feedback effects, originating from cross-section interactions between individuals.¹ It cannot be dealt with by traditional panel data methods and furthermore requires explicit modeling of spatial autocorrelation. More precisely, interactive heterogeneity is captured by impact coefficients or elasticities computed from the reduced form of the spatial panel data model presented in (1), taking into account the interaction structure between individuals.

$$Y_t = \rho W Y_t + X_t\beta + U_t, \qquad t = 1, \dots, T \tag{1}$$

where $Y_t$ is the $N$-dimensional vector of the dependent variable for all individuals in period $t$, $W$ is the spatial weight matrix that models the interaction scheme between individuals, $X_t$ is the $N \times K$ matrix of explanatory variables and $U_t$ is the vector of errors possibly containing individual and/or time effects.² However, in applied work LeSage & Pace (2009) advocate the use of the spatial Durbin model (hereafter SDM), which generalizes equation (1) by allowing the endogenous variable to depend on the neighboring values of explanatory variables ($WX_t$), as shown in (2).³

$$Y_t = \rho W Y_t + X_t\beta + W X_t\theta + U_t, \qquad t = 1, \dots, T \tag{2}$$

It should be clear from (2) that the SDM simplifies to the spatial autoregressive model when $\theta$ is not significant. Also, by imposing other constraints on the parameters of the SDM, we can retrieve the model with spatially autocorrelated errors.⁴ This spatial Durbin specification constitutes the benchmark model of this paper.
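The impact coefficients mentioned in the introduction can be illustrated from the reduced form of (2): for regressor $k$, the $N \times N$ matrix of partial derivatives of $Y_t$ with respect to that regressor is $(I_N - \rho W)^{-1}(\beta_k I_N + \theta_k W)$. The sketch below is only an illustration under stated assumptions: the four-unit weight matrix and the parameter values are hypothetical, and the function name `sdm_impacts` is ours. It summarizes the matrix by average direct, indirect and total effects in the spirit of LeSage & Pace (2009).

```python
import numpy as np

def sdm_impacts(W, rho, beta_k, theta_k):
    """Average direct, indirect and total effects of regressor k in an SDM.

    The matrix of partial derivatives is S_k = (I - rho W)^{-1} (beta_k I + theta_k W).
    """
    n = W.shape[0]
    S = np.linalg.solve(np.eye(n) - rho * W, beta_k * np.eye(n) + theta_k * W)
    direct = np.trace(S) / n      # mean own-unit effect (diagonal of S_k)
    total = S.sum() / n           # mean row sum of S_k
    return direct, total - direct, total

# Hypothetical example: four units on a line, row-normalized weights.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W /= W.sum(axis=1, keepdims=True)

direct, indirect, total = sdm_impacts(W, rho=0.4, beta_k=1.0, theta_k=0.5)
```

With a row-normalized $W$, every row of the partial derivatives matrix sums to $(\beta_k + \theta_k)/(1 - \rho)$, which gives a quick sanity check on the total effect.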

As mentioned above, individual and temporal heterogeneity are dealt with by estimating a fixed or random effects specification. Anselin (1988, chap. 10) is the first to suggest a one-way random effects specification with spatially autocorrelated errors (Anselin's model hereafter). Several likelihood ratio (LR) and Lagrange multiplier (LM) statistics were derived by Baltagi et al. (2003) to assess the relevance of random individual effects and spatial autocorrelation in this model. These statistics were further generalized to the presence of serial correlation by Baltagi et al. (2007). Also, Baltagi et al. (2009b) suggest LM statistics to test for the presence of heteroskedasticity and spatial autocorrelation in Anselin's model. Alternatively, Kapoor et al. (2007) (KKP hereafter) develop a method of moments estimator for a one-way random effects model where spatial autocorrelation is present in both components of the error term. These two different approaches led Baltagi et al. (2009) to propose a generalized random effects model that encompasses KKP and Anselin's model and to derive LM tests to choose between the two specifications. According to Lee & Yu (2010b), this distinction is not important in a fixed effects specification since individual-specific effects enter the model as vectors of unknown fixed effects parameters. Recently, Lee & Yu (2010a,c) have established asymptotic properties of maximum likelihood estimation of a spatial autoregressive panel data model with spatially autocorrelated errors with possibly both time and individual fixed effects.

¹ Interactive heterogeneity should not be confused with what the literature labels spatial heterogeneity, which refers to standard individual heterogeneity coming from spatial structural instability in coefficients or residual variance.

² The weight matrix W and all variables should be indexed by N since they form triangular arrays. However, to keep the notation clear, we omit this index.

³ In the context of cross-sectional models, several papers have also shown that the SDM is theoretically grounded, among others Ertur & Koch (2007, 2011), who revisited neoclassical and Schumpeterian growth theory between countries, and Pfaffermayr (2009), who studied the convergence process among European regions.

⁴ For further details, see LeSage & Pace (2009, p. 164).

In applied work, the researcher often faces datasets where the time span considered is small compared to the number of individuals. As such, in this paper, we focus on the treatment of individual effects only; time dummies can be added to the set of explanatory variables to account for time effects.

Even though the literature provides the estimation procedures for both fixed and random effects, the critical issue for the applied researcher is to determine the best specification for individual heterogeneity. To this aim, Hausman (1978) derived a statistic to evaluate the relevance of the random effects specification with respect to the fixed effects model. In the spatial panel data literature, Mutl & Pfaffermayr (2011) and Lee & Yu (2010b) study the use of such a Hausman-type statistic. The former paper concentrates on the KKP specification while the interest of the latter lies in Anselin's model. Finally, Lee & Yu (2010d) also propose a Lagrange multiplier statistic based on the between equation to assess the relevance of the random effects specification.

The originality of this paper is to suggest an alternative testing procedure to assess the relevance of the random effects specification. Our procedure generalizes the Mundlak (1978) approach to an SDM panel data model. The Mundlak approach consists in augmenting the random effects specification with variables that should capture the correlation between regressors and individual effects. In this paper, we suggest going one step further and testing the significance of these additional variables in order first to assess the trade-off between bias and efficiency and second to identify the endogenous covariates. This last point is of high interest since it can be used in the first step of the methodology suggested by Hausman & Taylor (1981). These two authors propose to separate exogenous from endogenous regressors and to use functions of the former to find consistent estimators of the latter. Originally, this distinction is based either on economic insights or on some a priori information the researcher might have. The Mundlak approach provides statistical grounds to distinguish between exogenous and endogenous explanatory variables, which constitutes a real benefit. In that sense, Mundlak's approach surpasses the Hausman test since the latter only detects the presence of correlation. Let us however mention that in the Mundlak approach, the detection of dependence between individual effects and covariates is conditioned on the functional form assumed between the two while the Hausman test does not rely on such an assumption. Last but not least, as Mundlak's approach is based on a random effects specification, it allows consistent estimation of the effect of time-constant variables even though regressors and individual effects are not independent. It thus constitutes a solution for applied researchers dealing with time-constant variables when the Hausman statistic or the LR statistic proposed here rejects the null of independence between regressors and individual effects.

The empirical application proposed in this paper illustrates this last point. We estimate a house price model on 588 Belgian municipalities to assess their relevant determinants. Among these determinants, we find time-constant variables that capture the attractiveness of municipalities. These explanatory variables are the results of a factor analysis which summarizes twelve measures of attractiveness into three factors, each one identifying a specific aspect of municipalities' own characteristics, namely the charm of the environment, the quality and availability of services provided and the quality of roads. As these factors are of interest, a random effects specification is required since estimating this model by fixed effects would result in the deletion of the time-constant covariates. We show that ignoring the correlation between regressors and individual effects leads to unreliable estimated coefficients. However, augmenting this random effects specification with variables that capture this endogeneity (i.e. applying a Mundlak approach) provides estimators similar to those obtained with a fixed effects specification (at least for time-varying covariates). Let us further notice that as the estimated model includes an endogenous spatial lag, the estimated coefficients are not directly interpretable and we need to rely on the matrix of partial derivatives associated with the reduced form of the model to interpret the impact of a change in an explanatory variable on the price of houses sold. This illustration shows that the Mundlak approach in spatial panel data is extremely useful to estimate the effect of time-constant variables and furthermore allows for some correlation between regressors and individual effects, which is a clear advantage over the traditional random effects specification.

The remainder of the paper is organized as follows: Section 2 describes the model used and the testing methodology. Monte Carlo simulations to assess the properties of the method in finite samples are performed in Section 3. Section 4 presents an empirical illustration of the proposed methodology while Section 5 concludes.

2 Methodology

Our benchmark model is the one-way error component SDM presented in (3).

$$\begin{aligned} Y_t &= \rho W Y_t + X_t\beta + W X_t\theta + U_t \\ U_t &= \mu + V_t, \qquad t = 1, \dots, T \end{aligned} \tag{3}$$

$Y_t$ is the $N$-dimensional vector of the dependent variable in period $t$, $W$ is the spatial weight matrix that models the interaction scheme between the $N$ individuals, and $\rho$ the spatial autoregressive parameter to be estimated. Also, $X_t$ is the $N \times K$ matrix of explanatory variables while $WX_t$ represents the matrix of spatially lagged explanatory variables. When $W$ is row-stochastic and $X_t$ contains an intercept, this intercept should not be included in the spatial lag of the explanatory variables since $W\iota_N = \iota_N$. $U_t$ is the $N$-dimensional vector of errors containing both individual effects ($\mu$) and a vector of innovation terms $V_t = (V_{1,t}, V_{2,t}, \dots, V_{N,t})'$, where $V_{i,t}$ is iid across $i$ and $t$ with zero mean and variance $\sigma_v^2$. Finally, as already mentioned, the typical dataset available to the applied researcher contains many individuals compared to the number of periods of time. In case we wish to control for time effects, we can add time dummies to equation (3). These dummies will not change the results since they should be considered as control variables distinct from the covariates and their spatial lags. For the sake of clarity, we will not write them explicitly in the model.

According to Mundlak (1978), the random effects specification is a misspecified version of the fixed effects (within) model since it ignores the possible correlation between individual effects and regressors. By controlling for this correlation, Mundlak shows that coefficients of the random effects specification are identical to those of the within estimation, thereby unifying both approaches. He thus proposes an auxiliary regression that captures this possible relation.

Before writing down this auxiliary regression for model (3), we rewrite in (4) the model for individual $i$ in period $t$. We observe that the dependent variable for $i$ depends on the explanatory variables in $i$ but also on explanatory variables in its neighborhood. So, the auxiliary regression should also capture the correlation between the individual effect ($\mu_i$) and the neighboring covariates.

$$\begin{aligned} Y_{i,t} &= \rho W_{i.}Y_t + X_{i,t}\beta + W_{i.}X_t\theta + U_{i,t} \\ U_{i,t} &= \mu_i + V_{i,t} \end{aligned} \tag{4}$$

where $W_{i.}$ is the $i$th row of $W$. The baseline auxiliary regression used in this paper is presented in equation (5). Let us note that as $\mu_i$ is by definition time-invariant, it should only be correlated with the time-invariant part of explanatory variables.

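$$\mu_i = \bar{X}_{i.}\pi + W_{i.}\bar{X}_{.}\gamma + \alpha_{i.} \tag{5}$$

where $\bar{X}_{i.} = \frac{1}{T}\sum_{t=1}^{T} X_{i,t}$ is the average over time of $X$ for individual $i$ and $\alpha_{i.} \sim IN(0, \sigma_\alpha^2)$.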
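To make the auxiliary regression concrete, the following sketch computes the two sets of additional regressors it relies on: the time averages of the covariates and their spatial lag. It is a minimal illustration, assuming the covariates are stored as a `(T, N, K)` array; the array sizes, the weight matrix and the function name `mundlak_terms` are hypothetical choices made for this example.

```python
import numpy as np

def mundlak_terms(X, W):
    """Time averages of the covariates and their spatial lag.

    X : array of shape (T, N, K), covariates for T periods and N individuals.
    W : array of shape (N, N), spatial weight matrix.
    Returns (X_bar, WX_bar), each of shape (N, K).
    """
    X_bar = X.mean(axis=0)        # average over time for each individual
    return X_bar, W @ X_bar       # own averages and neighbors' averages

# Hypothetical data: 3 periods, 4 individuals, 2 covariates.
rng = np.random.default_rng(0)
T, N, K = 3, 4, 2
X = rng.normal(size=(T, N, K))
W = np.ones((N, N)) - np.eye(N)   # everyone neighbors everyone (illustration only)
W /= W.sum(axis=1, keepdims=True)

X_bar, WX_bar = mundlak_terms(X, W)

# Repeat the averages for every period so they can be appended to the
# stacked (period-by-period) regressor matrix.
PX = np.tile(X_bar, (T, 1))       # shape (N*T, K)
PWX = np.tile(WX_bar, (T, 1))     # shape (N*T, K)
```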

The independence assumption between individual effects and regressors is violated when, in equation (5), coefficient vectors $\pi$ and $\gamma$ are significant. In such a case, model (3) should include individual fixed effects. If the null of non-significance cannot be rejected, individual heterogeneity in specification (3) is best modeled by random effects. In this paper, the significance of these two vectors is assessed with an LR statistic. This LR test over $\pi$ and $\gamma$ is a way to assess the trade-off existing between bias and efficiency. For instance, if the results indicate a barely significant LR statistic, the applied researcher could wonder whether a random effects specification, even though slightly biased, should not be preferred to a fixed effects model due to its gain in the precision of the estimated parameters. However, if the LR statistic is highly significant, the gain in efficiency of the random effects does not compensate the bias generated and the fixed effects should be preferred.
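The mechanics of such an LR test can be sketched as follows. The statistic is twice the difference between the maximized log-likelihoods of the augmented (unrestricted) and the plain random effects (restricted) specifications, compared with a chi-square distribution whose degrees of freedom equal the number of restrictions. The sketch assumes that both coefficient vectors of the auxiliary regression have $K$ elements, so that there are $2K$ restrictions; for an even number of degrees of freedom the chi-square tail probability has a closed form, which keeps the example dependency-free. The function names and the two log-likelihood values are hypothetical.

```python
import math

def chi2_sf_even(x, df):
    """Upper tail probability of a chi-square variable with an even df.

    For df = 2m: P(X > x) = exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 0.0
    for j in range(df // 2):
        total += term
        term *= half / (j + 1)
    return math.exp(-half) * total

def mundlak_lr_test(loglik_restricted, loglik_unrestricted, n_regressors):
    """LR statistic, degrees of freedom and p-value for H0: both vectors are zero."""
    lr = 2.0 * (loglik_unrestricted - loglik_restricted)
    df = 2 * n_regressors         # assumption: K averages plus K spatially lagged averages
    return lr, df, chi2_sf_even(max(lr, 0.0), df)

# Hypothetical maximized log-likelihoods with K = 2 regressors.
lr, df, p_value = mundlak_lr_test(-512.3, -505.1, n_regressors=2)
```

A small p-value leads to rejecting the random effects specification without the additional regressors; a p-value close to the chosen level is the "barely significant" case discussed above.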

When the LR statistic is significant, it can be of interest to identify the endogenous regressors. To do so, we look at the individual significance of the regressors used in the auxiliary regression. This identification might be important when variables of interest are time-invariant. In that case, the fixed effects specification is useless while the Hausman-Taylor methodology could be helpful. In their seminal paper, these two authors propose to split the regressors (both time-variant and time-invariant) into two sets, depending on their relation with individual effects. The first set contains all regressors considered (a priori) as exogenous while the second includes regressors that (a priori) could be correlated with individual heterogeneity. They then suggest instrumenting endogenous regressors (both varying and non-varying over time) by functions of time-varying exogenous covariates. The analysis of the significance of these control variables can be a first solution to help the applied researcher to determine time-varying exogenous explanatory variables on statistical grounds.

Denote $W_a = I_T \otimes W$, the weight matrix for the whole sample and $Z = (\iota_T \otimes I_N)$, a matrix of dummy variables with $\iota_T$ a $T$-dimensional vector of ones. We can write model (3) stacked for all time periods as follows:

$$\begin{aligned} Y &= \rho W_a Y + X\beta + W_a X\theta + U \\ U &= Z\mu + V \end{aligned} \tag{6}$$

Writing the auxiliary regression set forth in (5) for all individuals and all time periods yields:

$$\begin{aligned} Z\mu &= (J_T \otimes I_N)(X\pi + W_a X\gamma + \alpha) \\ &= P(X\pi + W_a X\gamma + \alpha) \end{aligned} \tag{7}$$

with $J_T = \frac{\iota_T\iota_T'}{T}$ the operator computing averages over time. Expression (7) comes from use of the projection matrix on the column space of $Z$. Plugging this expression into (6) yields equation (8), which is a one-way error component model that has to be estimated by random effects.

$$\begin{aligned} Y &= \rho W_a Y + X\beta + W_a X\theta + PX\pi + PW_a X\gamma + U \\ U &= P\alpha + V \end{aligned} \tag{8}$$
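The stacked operators can be checked numerically. Since $Z'Z = T I_N$, the projection on the column space of $Z$ is $Z(Z'Z)^{-1}Z' = J_T \otimes I_N$. The sketch below builds $W_a$, $Z$, $J_T$ and $P$ with Kronecker products for a small hypothetical example (sizes and weights are arbitrary, observations are stacked period by period) and verifies this identity, together with the fact that $P$ and $W_a$ commute, so that $PW_aX$ and $W_aPX$ coincide.

```python
import numpy as np

T, N = 3, 4                                   # illustrative sizes only
rng = np.random.default_rng(1)

# Hypothetical row-normalized weight matrix with a zero diagonal.
W = rng.random((N, N))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

iota_T = np.ones((T, 1))
W_a = np.kron(np.eye(T), W)                   # I_T (x) W, weights for the whole sample
Z = np.kron(iota_T, np.eye(N))                # iota_T (x) I_N, individual dummies
J_T = iota_T @ iota_T.T / T                   # averaging operator over time
P = np.kron(J_T, np.eye(N))                   # J_T (x) I_N

# Projection on the column space of Z, computed from its definition.
P_from_Z = Z @ np.linalg.inv(Z.T @ Z) @ Z.T

same_projection = np.allclose(P, P_from_Z)
idempotent = np.allclose(P @ P, P)
commutes = np.allclose(P @ W_a, W_a @ P)
```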

To estimate model (8), the standard GLS approach has to be avoided since it provides biased and inconsistent estimators due to the correlation between the spatially lagged dependent variable ($W_aY$) and the error term $U$. Maximum likelihood estimation procedures for the random effects spatial panel data model have been proposed by Elhorst (2003, 2010) and Lee & Yu (2010b,d).

The log-likelihood function associated with equation (8) is presented below.

$$\ln L = -\frac{NT}{2}\ln 2\pi - \frac{1}{2}\ln|\Omega| + T\ln|I_N - \rho W| - \frac{1}{2}U'\Omega^{-1}U \tag{9}$$

where $U = Y - \rho W_aY - X\beta - W_aX\theta - PX\pi - PW_aX\gamma$ and $\Omega$ is the usual variance matrix of a one-way error component model that takes the following form:

$$\Omega = (T\sigma_\alpha^2 + \sigma_v^2)\left(\frac{1}{T}\iota_T\iota_T' \otimes I_N\right) + \sigma_v^2\left(\left(I_T - \frac{1}{T}\iota_T\iota_T'\right) \otimes I_N\right) \tag{10}$$
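As an illustration of how (9) and (10) can be evaluated, the sketch below computes the log-likelihood for given parameter values without forming the $NT \times NT$ matrix $\Omega$. It uses the fact that $\Omega$ in (10) is a weighted sum of two complementary projections, so that $\ln|\Omega| = N\ln(T\sigma_\alpha^2 + \sigma_v^2) + N(T-1)\ln\sigma_v^2$ and $\Omega^{-1}$ applies the weight $1/(T\sigma_\alpha^2 + \sigma_v^2)$ to time averages of the residuals and $1/\sigma_v^2$ to deviations from them. This is a sketch under stated assumptions, not the estimation routine of the cited papers: data are assumed to be stored as `(T, N)` and `(T, N, K)` arrays, and the function name, argument names and numerical values are hypothetical.

```python
import numpy as np

def sdm_mundlak_loglik(rho, beta, theta, pi_coef, gamma,
                       sigma2_v, sigma2_alpha, Y, X, W):
    """Log-likelihood (9) of the augmented random effects SDM (8).

    Y : (T, N) dependent variable, X : (T, N, K) covariates, W : (N, N) weights.
    """
    T, N, K = X.shape
    X_bar = X.mean(axis=0)                              # (N, K) time averages
    mundlak = X_bar @ pi_coef + W @ X_bar @ gamma       # (N,), constant over time
    # Residuals period by period; row t of Y @ W.T equals (W Y_t)'.
    U = Y - rho * (Y @ W.T) - X @ beta - (W @ X) @ theta - mundlak

    s1 = T * sigma2_alpha + sigma2_v                    # variance weight on time averages
    U_bar = U.mean(axis=0)
    quad = T * np.sum(U_bar ** 2) / s1 + np.sum((U - U_bar) ** 2) / sigma2_v
    logdet_omega = N * np.log(s1) + N * (T - 1) * np.log(sigma2_v)
    logdet_S = np.linalg.slogdet(np.eye(N) - rho * W)[1]  # ln|I_N - rho W|

    return (-0.5 * N * T * np.log(2.0 * np.pi) - 0.5 * logdet_omega
            + T * logdet_S - 0.5 * quad)

# Hypothetical data and parameter values, for illustration only.
rng = np.random.default_rng(0)
T, N, K = 4, 5, 2
W = np.ones((N, N)) - np.eye(N)
W /= W.sum(axis=1, keepdims=True)
X = rng.normal(size=(T, N, K))
Y = rng.normal(size=(T, N))
params = dict(rho=0.3, beta=np.array([1.0, 1.0]), theta=np.array([0.5, 0.5]),
              pi_coef=np.array([0.2, 0.0]), gamma=np.array([0.1, 0.0]),
              sigma2_v=1.0, sigma2_alpha=4.0)
ll = sdm_mundlak_loglik(Y=Y, X=X, W=W, **params)
```

In practice this function would be handed to a numerical optimizer; evaluating it at the restricted estimates (both auxiliary coefficient vectors set to zero) and at the unrestricted ones gives the two log-likelihoods needed for the LR statistic.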

Lee & Yu (2010d) develop and discuss the assumptions underlying the asymptotic properties of the (quasi-) maximum likelihood estimator. Defining $S(\rho) = I_N - \rho W$, we write the adapted version of this set of assumptions for the model considered as follows:

Assumption 1. The interaction matrix $W$ is non-stochastic, constant through time with diagonal elements equal to 0. This matrix is also assumed to be uniformly bounded in row and column sums in absolute value.

Assumption 2. Individual effects $\alpha_{i.}$ and the disturbances $V_{i,t}$ are independent from the regressors $Z = [X, W_aX, PX, W_aPX]$. Also, $\alpha_{i.} \sim IN(0, \sigma_\alpha^2)$ and $V_{i,t} \sim IN(0, \sigma_v^2)$.

Assumption 3. $S(\rho)$ is invertible for all $\rho \in \mathcal{P}$, where $\mathcal{P}$ is a compact parameter space. We also have that $S(\rho)^{-1}$ is uniformly bounded in row and column sums in absolute value for all $\rho \in \mathcal{P}$.

Assumption 4. $N$ is large and $T$ is finite.

Assumption 5. The regressor matrix $Z$ has full column rank and its elements are non-stochastic and uniformly bounded in $N$ and $T$. Also, under the asymptotic setting of Assumption 4, the limit of $\frac{1}{NT}Z'\Omega^{-1}Z$ exists and is nonsingular. We furthermore assume that all parameters are identified.

When the interaction matrix is row-normalized, the parameter space $\mathcal{P}$ is a compact subset in $(-1, 1)$.⁵ The asymptotic setting defined in Assumption 4 corresponds to what is usually encountered in applied work, meaning a small number of periods compared to a large number of individuals.

The next section assesses the properties of the Mundlak approach extended to spatial Durbin panel data models through Monte Carlo experiments.

⁵ See Kelejian & Prucha (2010, p. 55) for a discussion on the parameter space for the spatial autoregressive parameter.

3 Monte Carlo simulations

The benchmark model for the Monte Carlo simulations is presented in equation (11):

$$\begin{aligned} y_t &= \rho W y_t + x_{1,t} + x_{2,t} + 0.5\,W x_{1,t} + 0.5\,W x_{2,t} + u_t \\ u_t &= \mu + v_t, \qquad t = 1, \dots, T. \end{aligned} \tag{11}$$

For these Monte Carlo experiments, two different spatial schemes are generated and both are row-normalized. In the first one, $W$ is defined as first order contiguity in the queen sense. For the second spatial scheme, we use a 15 nearest neighbors definition. Observations are drawn from a square grid containing respectively 49, 144 and 256 observations. Furthermore, two different time spans are considered: $T = 5, 10$. The spatial autoregressive parameter $\rho$ varies over the set $[-0.6, 0.6]$ by increments of 0.2. The explanatory variables $x_1$ and $x_2$ are formed from a time-invariant component and an idiosyncratic term, both drawn from standard normal distributions. In other words, $x_{1i,t} = \xi_{1i} + \varepsilon_{1i,t}$ and $x_{2i,t} = \xi_{2i} + \varepsilon_{2i,t}$. Also, $v_{i,t}$ is drawn from an independent standard normal distribution. Finally, three configurations are explored for the individual effects ($\mu$) and summarized in Table 1. In DGP 1, we consider $\mu_i$ as pure random effects drawn from a normal distribution with a standard deviation of two. In DGP 2, individual effects are assumed to be correlated only with the time averaged value of $x_1$ and its spatial lag $Wx_1$. In the last DGP, individual effects are assumed to be correlated with the time average value of all regressors ($\bar{x}_{i.} = [\bar{x}_{1,i.}, \bar{x}_{2,i.}]$). The parameter $\phi$ captures the intensity of correlation and takes on three different values: 0.1, 0.5 and 0.8 that reflect low, medium and high dependence. Each case is replicated 1000 times. In Tables 2 to 4 that are discussed below, in addition to presenting the LR test proposed in this paper, we provide results of the Hausman test developed by Lee & Yu (2010b) and compare outcomes of both approaches.

Table 1: Different DGPs for individual effects

| DGP 1 | DGP 2 | DGP 3 |
|---|---|---|
| $\mu_i = \alpha_i$ | $\mu_i = \phi(\bar{x}_{1,i.} + W\bar{x}_{1,i.}) + \alpha_i$ | $\mu_i = \phi(\bar{x}_{i.} + W\bar{x}_{i.}) + \alpha_i$ |
| $\alpha_i \sim N(0, 2)$ | $\alpha_i \sim N(0, 2)$ | $\alpha_i \sim N(0, 2)$ |
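The data generating processes can be sketched in code as follows. This is a hedged illustration rather than the simulation code behind the tables: the function names are ours, the random component of the individual effects is drawn with a standard deviation of two in all three DGPs (the text states this explicitly only for DGP 1), and in DGP 3 the correlation term is taken as the sum over both regressors, which is one possible reading of Table 1. Only the queen contiguity scheme is shown.

```python
import numpy as np

def queen_weights(side):
    """Row-normalized first order queen contiguity matrix on a side x side grid."""
    n = side * side
    W = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < side and 0 <= cc < side:
                        W[r * side + c, rr * side + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def simulate_panel(W, T, rho, phi, dgp, rng):
    """Draw one panel from equation (11). Returns y (T, N), x (T, N, 2), mu (N,)."""
    N = W.shape[0]
    # Each regressor: time-invariant component plus idiosyncratic term, both N(0, 1).
    x = rng.normal(size=(1, N, 2)) + rng.normal(size=(T, N, 2))
    x_bar = x.mean(axis=0)                          # (N, 2) time averages
    alpha = rng.normal(scale=2.0, size=N)           # assumption: sd of two in every DGP
    if dgp == 1:
        mu = alpha
    elif dgp == 2:
        mu = phi * (x_bar[:, 0] + W @ x_bar[:, 0]) + alpha
    else:                                           # assumption: sum over both regressors
        mu = phi * (x_bar + W @ x_bar).sum(axis=1) + alpha
    v = rng.normal(size=(T, N))
    xs = x.sum(axis=2)                              # x1 + x2, unit coefficients
    rhs = xs + 0.5 * (xs @ W.T) + mu + v            # 0.5 on both spatial lags
    y = np.linalg.solve(np.eye(N) - rho * W, rhs.T).T   # reduced form, period by period
    return y, x, mu

rng = np.random.default_rng(42)
W = queen_weights(7)                                # N = 49
y, x, mu = simulate_panel(W, T=5, rho=0.4, phi=0.5, dgp=2, rng=rng)
```

Repeating the draw 1000 times, estimating the restricted and augmented models on each sample and recording how often the test rejects at the 5% level would give rejection rates of the kind reported in Tables 2 to 4.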

Tables 2 to 4 present Monte Carlo results for both the LR statistic and the Hausman test to assess the relevance of the random effects specification. Figures correspond to rejection rates of the null of absence of correlation between regressors and individual effects. In DGP 1, we assess the size of both statistics, set to a theoretical value of 5% and reported in the DGP 1 columns of the Tables. DGP 2 and 3 study some measure of the power of the two aforementioned tests. All tables are split into two. The upper panel displays outcomes when the contiguity in the queen sense is used while the lower panel makes use of the 15 nearest neighbors interaction scheme.

Table 2 summarizes results for $N = 49$. We first observe that in this small sample case, the size of the LR statistic, even though a bit high, is not far away from the theoretical 5%. The Hausman test is much more liberal but this could be explained by the fact that it requires many more observations to follow a $\chi^2$ distribution. Indeed, the size of the Hausman test is closer to its nominal value of 5% when $N = 144$. In DGP 2, where only one covariate (and its spatial lag) is correlated with individual effects, the low correlation case ($\phi = 0.1$) is characterized by low power of both tests. However, when the correlation grows ($\phi = 0.5, 0.8$) the rejection rate increases. Let us nevertheless note that due to the small sample size ($N = 49$) the rejection rate does not go to 1. This weakness is nonetheless solved when the number of individuals increases. The results of DGP 3 confirm those of DGP 2 but both tests have a higher power when $\phi = 0.5$ or 0.8. When the correlation is weak, however, their power is small. Concerning the time span, rejection rates are higher when more periods are available. This result is not surprising since for the LR test, more observations can be used to compute the time-average of regressors that capture the possible correlation with individual effects. Concerning the Hausman test, more observations imply higher precision of the estimated parameters and thus a higher power of the statistic.

Table 3 presents the results for $N = 144$. We first note that the sizes of both statistics are closer to their nominal value. Unsurprisingly, we observe that for DGP 2, the rejection rate in case of medium or high correlation greatly increases. For weak correlation, the power of the LR and Hausman statistics is around 10%. Lastly, results of DGP 3 indicate a slight increase in power for the weak correlation case and a full rejection rate when a medium or high correlation between all regressors and individual effects is generated.

Results of the Monte Carlo experiments when $N = 256$ are summarized in Table 4. Empirical sizes (DGP 1 columns) correspond to their nominal value. We further observe that for both DGP 2 and 3, the power of the LR test (Mundlak's approach) and of the Hausman test increases compared to the $N = 144$ case.

Results for $\phi = 0.1$ illustrate the trade-off between bias and efficiency. Even though theoretically the random effects model should be rejected, Monte Carlo experiments yield a low rejection rate, indicating that in presence of weak correlation, the random effects specification does not perform so badly. Referring to the Hausman statistic, it seems that coefficient estimates obtained under the random effects and fixed effects specifications do not greatly differ from one another.
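As a rough guide for reading the empirical sizes, recall that each case is replicated 1000 times. Assuming independent replications and a true rejection probability equal to the nominal 5%, the simulation noise in a rejection rate can be approximated as below (a back-of-the-envelope calculation added for illustration, not a result of the study):

```latex
\mathrm{se}(\hat{p}) = \sqrt{\frac{0.05 \times 0.95}{1000}} \approx 0.0069,
\qquad
0.05 \pm 1.96 \times 0.0069 \approx [0.036,\; 0.064].
```

Empirical sizes inside this band are compatible with the nominal level, while values well above it point to over-rejection.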

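Table 2: Monte Carlo results for Mundlak and Hausman approaches, N = 49

LR: likelihood ratio (Mundlak) test; H: Hausman test; D1 to D3: DGP 1 to DGP 3; the number after D2 or D3 is the correlation intensity $\phi$.

| $W$ | $T$ | $\rho$ | D1 LR | D1 H | D2 0.1 LR | D2 0.1 H | D2 0.5 LR | D2 0.5 H | D2 0.8 LR | D2 0.8 H | D3 0.1 LR | D3 0.1 H | D3 0.5 LR | D3 0.5 H | D3 0.8 LR | D3 0.8 H |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Contiguity | 5 | -0.6 | 0.062 | 0.079 | 0.09 | 0.118 | 0.487 | 0.441 | 0.887 | 0.814 | 0.095 | 0.105 | 0.731 | 0.672 | 1 | 0.887 |
| Contiguity | 5 | -0.4 | 0.063 | 0.083 | 0.081 | 0.095 | 0.477 | 0.431 | 0.902 | 0.835 | 0.106 | 0.097 | 0.765 | 0.707 | 1 | 0.9 |
| Contiguity | 5 | -0.2 | 0.063 | 0.075 | 0.096 | 0.119 | 0.503 | 0.473 | 0.893 | 0.865 | 0.089 | 0.099 | 0.745 | 0.692 | 1 | 0.96 |
| Contiguity | 5 | 0.2 | 0.079 | 0.132 | 0.082 | 0.134 | 0.502 | 0.48 | 0.892 | 0.858 | 0.096 | 0.131 | 0.787 | 0.749 | 0.999 | 0.993 |
| Contiguity | 5 | 0.4 | 0.079 | 0.142 | 0.074 | 0.166 | 0.486 | 0.493 | 0.904 | 0.845 | 0.087 | 0.151 | 0.78 | 0.741 | 1 | 0.997 |
| Contiguity | 5 | 0.6 | 0.072 | 0.181 | 0.087 | 0.195 | 0.483 | 0.484 | 0.909 | 0.844 | 0.089 | 0.173 | 0.78 | 0.747 | 1 | 0.998 |
| Contiguity | 10 | -0.6 | 0.067 | 0.052 | 0.076 | 0.064 | 0.489 | 0.379 | 0.917 | 0.863 | 0.093 | 0.082 | 0.996 | 0.969 | 1 | 0.941 |
| Contiguity | 10 | -0.4 | 0.079 | 0.051 | 0.082 | 0.07 | 0.519 | 0.419 | 0.916 | 0.862 | 0.091 | 0.087 | 0.992 | 0.979 | 1 | 0.963 |
| Contiguity | 10 | -0.2 | 0.075 | 0.064 | 0.1 | 0.078 | 0.511 | 0.434 | 0.919 | 0.885 | 0.09 | 0.085 | 0.992 | 0.985 | 1 | 0.97 |
| Contiguity | 10 | 0.2 | 0.076 | 0.097 | 0.102 | 0.109 | 0.521 | 0.505 | 0.925 | 0.89 | 0.107 | 0.138 | 0.989 | 0.981 | 1 | 0.992 |
| Contiguity | 10 | 0.4 | 0.063 | 0.139 | 0.103 | 0.131 | 0.491 | 0.483 | 0.913 | 0.884 | 0.089 | 0.172 | 0.994 | 0.989 | 1 | 0.988 |
| Contiguity | 10 | 0.6 | 0.069 | 0.164 | 0.093 | 0.158 | 0.473 | 0.476 | 0.924 | 0.885 | 0.106 | 0.185 | 0.991 | 0.981 | 1 | 0.978 |
| 15 nearest neighbors | 5 | -0.6 | 0.077 | 0.184 | 0.092 | 0.163 | 0.553 | 0.555 | 0.924 | 0.884 | 0.093 | 0.168 | 0.761 | 0.727 | 0.94 | 0.874 |
| 15 nearest neighbors | 5 | -0.4 | 0.085 | 0.199 | 0.085 | 0.164 | 0.543 | 0.541 | 0.926 | 0.887 | 0.078 | 0.174 | 0.789 | 0.766 | 0.944 | 0.877 |
| 15 nearest neighbors | 5 | -0.2 | 0.07 | 0.201 | 0.099 | 0.196 | 0.531 | 0.527 | 0.934 | 0.919 | 0.074 | 0.185 | 0.779 | 0.752 | 0.925 | 0.863 |
| 15 nearest neighbors | 5 | 0.2 | 0.074 | 0.246 | 0.101 | 0.234 | 0.55 | 0.571 | 0.936 | 0.9 | 0.08 | 0.214 | 0.765 | 0.755 | 0.936 | 0.875 |
| 15 nearest neighbors | 5 | 0.4 | 0.072 | 0.227 | 0.092 | 0.191 | 0.546 | 0.563 | 0.933 | 0.89 | 0.074 | 0.204 | 0.766 | 0.758 | 0.94 | 0.881 |
| 15 nearest neighbors | 5 | 0.6 | 0.079 | 0.206 | 0.091 | 0.203 | 0.522 | 0.516 | 0.931 | 0.889 | 0.074 | 0.205 | 0.776 | 0.785 | 0.926 | 0.878 |
| 15 nearest neighbors | 10 | -0.6 | 0.076 | 0.122 | 0.086 | 0.158 | 0.555 | 0.511 | 0.926 | 0.89 | 0.088 | 0.152 | 0.887 | 0.822 | 0.982 | 0.908 |
| 15 nearest neighbors | 10 | -0.4 | 0.059 | 0.142 | 0.092 | 0.181 | 0.566 | 0.545 | 0.936 | 0.919 | 0.092 | 0.159 | 0.86 | 0.813 | 0.987 | 0.863 |
| 15 nearest neighbors | 10 | -0.2 | 0.06 | 0.162 | 0.092 | 0.21 | 0.579 | 0.568 | 0.947 | 0.911 | 0.091 | 0.191 | 0.875 | 0.821 | 0.988 | 0.847 |
| 15 nearest neighbors | 10 | 0.2 | 0.088 | 0.231 | 0.081 | 0.215 | 0.58 | 0.574 | 0.937 | 0.899 | 0.097 | 0.246 | 0.868 | 0.856 | 0.99 | 0.809 |
| 15 nearest neighbors | 10 | 0.4 | 0.067 | 0.22 | 0.072 | 0.205 | 0.56 | 0.519 | 0.935 | 0.921 | 0.078 | 0.222 | 0.861 | 0.838 | 0.988 | 0.79 |
| 15 nearest neighbors | 10 | 0.6 | 0.077 | 0.225 | 0.089 | 0.221 | 0.542 | 0.475 | 0.947 | 0.921 | 0.083 | 0.246 | 0.878 | 0.857 | 0.989 | 0.839 |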

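Table 3: Monte Carlo results for Mundlak and Hausman approaches, N = 144

LR: likelihood ratio (Mundlak) test; H: Hausman test; D1 to D3: DGP 1 to DGP 3; the number after D2 or D3 is the correlation intensity $\phi$.

| $W$ | $T$ | $\rho$ | D1 LR | D1 H | D2 0.1 LR | D2 0.1 H | D2 0.5 LR | D2 0.5 H | D2 0.8 LR | D2 0.8 H | D3 0.1 LR | D3 0.1 H | D3 0.5 LR | D3 0.5 H | D3 0.8 LR | D3 0.8 H |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Contiguity | 5 | -0.6 | 0.062 | 0.065 | 0.097 | 0.087 | 0.951 | 0.932 | 1 | 1 | 0.134 | 0.134 | 1 | 1 | 1 | 1 |
| Contiguity | 5 | -0.4 | 0.049 | 0.052 | 0.104 | 0.096 | 0.952 | 0.943 | 1 | 1 | 0.15 | 0.137 | 1 | 1 | 1 | 1 |
| Contiguity | 5 | -0.2 | 0.062 | 0.061 | 0.114 | 0.1 | 0.96 | 0.944 | 1 | 1 | 0.153 | 0.143 | 1 | 1 | 1 | 1 |
| Contiguity | 5 | 0.2 | 0.057 | 0.072 | 0.101 | 0.105 | 0.962 | 0.939 | 1 | 1 | 0.159 | 0.159 | 1 | 1 | 1 | 1 |
| Contiguity | 5 | 0.4 | 0.057 | 0.078 | 0.102 | 0.106 | 0.955 | 0.936 | 1 | 1 | 0.144 | 0.145 | 1 | 1 | 1 | 1 |
| Contiguity | 5 | 0.6 | 0.048 | 0.065 | 0.092 | 0.107 | 0.958 | 0.936 | 1 | 1 | 0.154 | 0.166 | 1 | 1 | 1 | 1 |
| Contiguity | 10 | -0.6 | 0.059 | 0.048 | 0.112 | 0.088 | 0.964 | 0.948 | 1 | 1 | 0.141 | 0.114 | 1 | 1 | 1 | 1 |
| Contiguity | 10 | -0.4 | 0.06 | 0.057 | 0.1 | 0.08 | 0.95 | 0.939 | 1 | 1 | 0.157 | 0.131 | 1 | 1 | 1 | 1 |
| Contiguity | 10 | -0.2 | 0.061 | 0.052 | 0.105 | 0.1 | 0.948 | 0.929 | 1 | 1 | 0.14 | 0.124 | 1 | 1 | 1 | 1 |
| Contiguity | 10 | 0.2 | 0.078 | 0.077 | 0.104 | 0.101 | 0.972 | 0.958 | 1 | 1 | 0.12 | 0.122 | 1 | 1 | 1 | 1 |
| Contiguity | 10 | 0.4 | 0.056 | 0.065 | 0.096 | 0.104 | 0.963 | 0.949 | 1 | 1 | 0.143 | 0.139 | 1 | 1 | 1 | 1 |
| Contiguity | 10 | 0.6 | 0.061 | 0.097 | 0.105 | 0.118 | 0.97 | 0.959 | 1 | 1 | 0.141 | 0.167 | 1 | 1 | 1 | 1 |
| 15 nearest neighbors | 5 | -0.6 | 0.061 | 0.069 | 0.089 | 0.098 | 0.908 | 0.878 | 1 | 1 | 0.154 | 0.15 | 1 | 1 | 1 | 1 |
| 15 nearest neighbors | 5 | -0.4 | 0.045 | 0.065 | 0.085 | 0.098 | 0.911 | 0.892 | 1 | 1 | 0.154 | 0.157 | 1 | 1 | 1 | 1 |
| 15 nearest neighbors | 5 | -0.2 | 0.064 | 0.083 | 0.09 | 0.118 | 0.907 | 0.881 | 1 | 1 | 0.144 | 0.149 | 1 | 1 | 1 | 1 |
| 15 nearest neighbors | 5 | 0.2 | 0.069 | 0.102 | 0.093 | 0.139 | 0.911 | 0.892 | 1 | 1 | 0.159 | 0.185 | 1 | 1 | 1 | 1 |
| 15 nearest neighbors | 5 | 0.4 | 0.052 | 0.103 | 0.082 | 0.133 | 0.913 | 0.896 | 1 | 1 | 0.133 | 0.189 | 1 | 1 | 1 | 1 |
| 15 nearest neighbors | 5 | 0.6 | 0.068 | 0.144 | 0.096 | 0.172 | 0.913 | 0.88 | 1 | 1 | 0.158 | 0.216 | 1 | 1 | 1 | 1 |
| 15 nearest neighbors | 10 | -0.6 | 0.055 | 0.059 | 0.099 | 0.087 | 0.954 | 0.931 | 1 | 1 | 0.16 | 0.138 | 1 | 1 | 1 | 1 |
| 15 nearest neighbors | 10 | -0.4 | 0.054 | 0.059 | 0.093 | 0.112 | 0.953 | 0.936 | 1 | 1 | 0.153 | 0.153 | 1 | 1 | 1 | 1 |
| 15 nearest neighbors | 10 | -0.2 | 0.054 | 0.074 | 0.077 | 0.108 | 0.957 | 0.948 | 1 | 1 | 0.139 | 0.144 | 1 | 1 | 1 | 1 |
| 15 nearest neighbors | 10 | 0.2 | 0.054 | 0.083 | 0.092 | 0.125 | 0.941 | 0.924 | 1 | 1 | 0.162 | 0.179 | 1 | 1 | 1 | 1 |
| 15 nearest neighbors | 10 | 0.4 | 0.052 | 0.105 | 0.092 | 0.127 | 0.945 | 0.928 | 1 | 1 | 0.163 | 0.18 | 1 | 1 | 1 | 1 |
| 15 nearest neighbors | 10 | 0.6 | 0.058 | 0.127 | 0.098 | 0.154 | 0.958 | 0.943 | 1 | 1 | 0.156 | 0.21 | 1 | 1 | 1 | 1 |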

  • Table 4: Monte Carlo results for Mundlak and Hausman approaches, N = 256Contiguity. N = 256. T=5

    DGP 1 DGP 2 DGP 3 = 0.1 = 0.5 = 0.8 = 0.1 = 0.5 = 0.8

    LR Hausman LR Hausman LR Hausman LR Hausman LR Hausman LR Hausman LR Hausman-0.6 0.047 0.047 0.177 0.171 0.998 0.998 1 1 0.201 0.186 1 1 1 1-0.4 0.038 0.048 0.148 0.138 0.995 0.992 1 1 0.183 0.165 1 1 1 1-0.2 0.048 0.058 0.140 0.122 0.998 0.998 1 1 0.193 0.184 1 1 1 10.2 0.069 0.062 0.134 0.137 0.997 0.996 1 1 0.198 0.186 1 1 1 10.4 0.068 0.071 0.137 0.131 0.998 0.995 1 1 0.2 0.194 1 1 1 10.6 0.057 0.075 0.158 0.164 1 0.995 1 1 0.212 0.194 1 1 1 1

    Contiguity. N = 256. T=10-0.6 0.054 0.055 0.131 0.127 1 0.996 1 1 0.252 0.216 1 1 1 1-0.4 0.051 0.055 0.134 0.129 1 1 1 1 0.295 0.245 1 1 1 1-0.2 0.051 0.051 0.14 0.122 0.998 0.997 1 1 0.251 0.23 1 1 1 10.2 0.054 0.053 0.136 0.133 1 1 1 1 0.254 0.229 1 1 1 10.4 0.05 0.05 0.132 0.133 1 0.998 1 1 0.248 0.233 1 1 1 10.6 0.05 0.074 0.151 0.144 1 1 1 1 0.224 0.225 1 1 1 1

    15 nearest neighbors, N = 256, T = 5

                              = 0.1           = 0.5           = 0.8           = 0.1           = 0.5           = 0.8
          LR      Hausman LR      Hausman LR      Hausman LR      Hausman LR      Hausman LR      Hausman LR      Hausman
    -0.6  0.058   0.063   0.137   0.138   1       1       1       1       0.197   0.183   1       1       1       1
    -0.4  0.052   0.068   0.135   0.127   1       1       1       1       0.192   0.175   1       1       1       1
    -0.2  0.037   0.058   0.128   0.119   1       1       1       1       0.206   0.193   1       1       1       1
     0.2  0.063   0.078   0.143   0.161   1       1       1       1       0.209   0.2     1       1       1       1
     0.4  0.042   0.061   0.122   0.143   1       1       1       1       0.207   0.212   1       1       1       1
     0.6  0.058   0.083   0.127   0.144   1       1       1       1       0.216   0.218   1       1       1       1

    15 nearest neighbors, N = 256, T = 10

    -0.6  0.054   0.064   0.112   0.113   1       1       1       1       0.262   0.233   1       1       1       1
    -0.4  0.049   0.047   0.119   0.117   1       1       1       1       0.258   0.226   1       1       1       1
    -0.2  0.044   0.051   0.117   0.117   1       1       1       1       0.244   0.203   1       1       1       1
     0.2  0.055   0.058   0.123   0.123   1       1       1       1       0.23    0.232   1       1       1       1
     0.4  0.044   0.071   0.118   0.136   1       1       1       1       0.234   0.227   1       1       1       1
     0.6  0.053   0.084   0.115   0.145   1       1       1       1       0.239   0.241   1       1       1       1


    4 Empirical application

    To illustrate the Mundlak approach extended to spatial Durbin models, we estimate a spatial econometric house price model to explain price variations across 588 municipalities in Belgium over the period 2004 to 2007.

    The literature on house price determination has long recognized the importance of spatial spillovers when modeling the determinants of house prices. The first papers treating these spatial spillovers date back to Can (1990, 1992), and since then many studies of house price models have explicitly accounted for interactions (spillovers) (see, among others, Dubin et al. 1999, Tse 2002, Beron et al. 2004, Osland 2010). In this literature, spatial spillovers are labeled adjacency effects and refer to price differentials that cannot be justified on the basis of housing services alone, for instance the premium that households are willing to pay just for the snob value of a particular location (Can & Megbolugbe 1997).

    The house price specification estimated in this paper comes from the market clearing function produced by the interaction of the bid functions of households and the offer functions of suppliers. The original theoretical model was developed by Rosen (1974) and extended to the presence of adjacency effects by Fingleton (2008). Fingleton (2010) generalizes the specification set out in Fingleton (2008) to panel data models; this generalization is the basis of the specification we estimate in this paper, shown in equation (12).

    $$P_t = \rho W P_t + X_t \beta + \mu + \varepsilon_t, \qquad t = 1, \ldots, T \qquad (12)$$

    In this model, $P_t$ is the vector of house prices (expressed in logs) for all municipalities in period $t$, $W P_t$ is the endogenous spatial lag that captures spillovers (or adjacency effects), with $W$ the weight matrix modeling the interaction scheme between municipalities, $X_t$ is an $N \times K$ matrix of covariates that will be explained below, and $\beta$ is the $K$-dimensional vector of associated coefficients. Finally, $\mu$ is the $N$-dimensional vector of individual effects, while $\varepsilon_t$ is the traditional idiosyncratic error term, assumed normally distributed with zero mean and constant variance $\sigma_\varepsilon^2$.
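Because the price vector appears on both sides of equation (12), data consistent with the model have to be generated by solving a linear system in each period. The sketch below illustrates this on a toy example. The weight matrix, covariates, parameter values and function names are all hypothetical choices for illustration (they are not the paper's data or code), and the spatial autoregressive parameter is written `rho`.

```python
import numpy as np

def row_normalize(A):
    """Scale each row of a nonnegative matrix so that it sums to one."""
    return A / A.sum(axis=1, keepdims=True)

def simulate_spatial_lag_panel(W, X, beta, mu, rho, sigma_eps, rng):
    """Draw P_t, t = 1..T, from P_t = rho W P_t + X_t beta + mu + eps_t.

    X has shape (T, N, K); the returned P and eps have shape (T, N).
    """
    T, N, _ = X.shape
    A = np.eye(N) - rho * W                       # I_N - rho W
    eps = rng.normal(0.0, sigma_eps, size=(T, N))
    rhs = X @ beta + mu + eps                     # mu is broadcast over periods
    P = np.linalg.solve(A, rhs.T).T               # solve A P_t = rhs_t for all t
    return P, eps

# Hypothetical toy setup: 5 units on a line, each linked to its neighbours.
rng = np.random.default_rng(0)
N, T, K = 5, 4, 2
adjacency = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
W = row_normalize(adjacency)
X = rng.normal(size=(T, N, K))
beta = np.array([0.5, -0.2])
mu = rng.normal(size=N)
rho = 0.4

P, eps = simulate_spatial_lag_panel(W, X, beta, mu, rho, 0.1, rng)
```

Solving the system with `np.linalg.solve` avoids forming the inverse explicitly, which is usually preferable when only the simulated outcome is needed.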

    The dataset covers 588 Belgian municipalities from 2004 to 2007.⁶ House prices are measured as the average, over each municipality, of the prices of ordinary houses sold during one year. Our house price variable thus excludes flats, villas and development sites.⁷ The matrix of covariates $X_t$ contains both time-varying and time-constant variables. The former include net income per capita expressed in logs (lincome), which proxies for the wealth of a municipality and whose effect is assumed positive; the average surface sold, also expressed in logs (lsurf), which controls for the size of the houses sold (including the land); and its spatial lag (Wlsurf), which controls for the effect of the size of houses sold in neighboring municipalities on the house price in the municipality concerned. The impact of this surface variable is also assumed positive. The last time-varying variable is the density (dens), which serves as a measure of the urbanization of the municipality and whose effect is thought to be positive due to competition effects between potential buyers. Time-constant explanatory variables include nine province dummies ($PR_p$ for $p = 1, \ldots, 9$), which capture the belonging of a municipality to a broader geographic and administrative entity, as well as three factors measuring the attractiveness of a municipality. These factors result from a factor analysis with varimax rotation whose outcomes are presented in Table 5. The initial 12 indicators used in the analysis are reported in the first column of Table 5. These indicators were computed from the 2001 socio-economic study performed by the National Institute of Statistics and correspond to satisfaction indexes.⁸

    ⁶ Belgium counts 589 municipalities, but we had to disregard Herstappe, since this municipality is so small that we could not obtain house price data due to confidentiality issues.

    ⁷ The dataset comes from the Belgian Directorate-general Statistics and Economic information, and all relevant data are expressed in constant 2004 euros.

    Table 5 shows that each of the three factors captures a specific aspect of a municipality's attractiveness. The first factor (env_cha) refers to the charm of the environment and gathers all characteristics of pleasant surroundings, namely attractiveness of buildings, air quality, calm and open space. The second factor (pri_fac) captures the presence of private facilities, meaning the quality and availability of shopping facilities, professional services (doctors, hairdressers, ...), health services and transport facilities. The third factor (road_qua) refers to roadway quality and is related to the quality of roads, cycle tracks and pavements. The last column of Table 5, headed "Uniqueness", gives the share of the variance of each variable that is not explained by the factors. In this application, we observe quite low uniqueness values (close to zero), which indicates that the variables are well explained by the factors.
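To make the uniqueness column concrete: for standardized indicators and orthogonal factors (varimax is an orthogonal rotation), the uniqueness of an indicator equals one minus the sum of its squared loadings. The short check below, a sketch under those assumptions, recomputes it from the rounded loadings reported in Table 5, so agreement with the reported column can only be approximate. Variable names are illustrative.

```python
import numpy as np

# Rotated loadings (Factor 1, Factor 2, Factor 3) copied from Table 5.
loadings = np.array([
    [0.96, -0.03, 0.04],    # Build. attrac.
    [0.94, 0.00, 0.21],     # Cleanness
    [0.85, -0.28, -0.13],   # Air
    [0.85, -0.32, 0.01],    # Calm
    [0.78, 0.05, 0.13],     # Open space
    [0.03, 0.89, 0.33],     # Professions services
    [-0.25, 0.87, 0.20],    # Shopping facilities
    [-0.07, 0.79, 0.30],    # Health services
    [-0.44, 0.63, 0.24],    # Transport facilities
    [0.21, 0.33, 0.84],     # Roads
    [0.16, 0.30, 0.81],     # Cycle tracks
    [-0.30, 0.43, 0.64],    # Pavements
])
reported_uniqueness = np.array(
    [0.07, 0.08, 0.19, 0.17, 0.36, 0.11, 0.14, 0.28, 0.35, 0.14, 0.24, 0.31]
)

communality = (loadings ** 2).sum(axis=1)   # variance share explained by the factors
uniqueness = 1.0 - communality              # variance share left unexplained
max_gap = np.abs(uniqueness - reported_uniqueness).max()
print(np.round(uniqueness, 3), max_gap)
```

The recomputed values stay close to the reported ones, with small gaps that are compatible with the two-decimal rounding of the loadings.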

    The measures of attractiveness considered (charming environment, private facilities and roadway quality) are of interest per se, since they allow us to determine which characteristics of a municipality are relevant for explaining house prices. To estimate their impact, we thus have to rely on a random effects estimation and assume that the individual effects are independent of the regressors and normally distributed with zero mean and constant variance $\sigma_\mu^2$.⁹ The (pseudo-)within transformation proposed by Lee & Yu (2010a) to estimate a fixed effects model would wipe out the time-invariant components of the regression.

    In this illustration, we considered several interaction matrices but report only the results for the contiguity-based matrix, which has been row-normalized.¹⁰

    ⁸ For further details, the interested reader may consult Denil et al. (2004).

    ⁹ Relaxing the normality assumption is possible as long as we estimate the model by quasi-maximum likelihood.

    Table 5: Results of the factor analysis

                            Factor 1   Factor 2   Factor 3   Uniqueness
    Build. attrac.          0.96       -0.03      0.04       0.07
    Cleanness               0.94       0          0.21       0.08
    Air                     0.85       -0.28      -0.13      0.19
    Calm                    0.85       -0.32      0.01       0.17
    Open space              0.78       0.05       0.13       0.36
    Professions services    0.03       0.89       0.33       0.11
    Shopping facilities     -0.25      0.87       0.2        0.14
    Health services         -0.07      0.79       0.3        0.28
    Transport facilities    -0.44      0.63       0.24       0.35
    Roads                   0.21       0.33       0.84       0.14
    Cycle tracks            0.16       0.3        0.81       0.24
    Pavements               -0.3       0.43       0.64       0.31

    When the assumption of independence between regressors and individual effects is violated, random effects estimators are biased and inconsistent. We apply the Mundlak approach proposed in this paper to deal with this potential endogeneity problem, and we compare the results obtained with this approach to those of a fixed effects specification (for time-varying variables), in which correlation between regressors and individual effects is not an issue.

    Before presenting the estimation results, it is important to note that equation (12) is an implicit form. To assess the impact of an explanatory variable on the house price variable, we first need to compute its reduced form, shown in (13), and then the matrix of partial derivatives.

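    $$
    \begin{aligned}
    P_t ={}& (I_N - \rho W)^{-1}(\beta_0 + \beta_1 \text{lincome}_t + \beta_2 \text{surf}_t + \beta_3 W \text{surf}_t + \beta_4 \text{dens}_t) \\
    &+ (I_N - \rho W)^{-1}(\beta_5 \text{env\_cha} + \beta_6 \text{pri\_fac} + \beta_7 \text{road\_qua}) \\
    &+ (I_N - \rho W)^{-1}\Big(\sum_{p=1}^{9} \delta_p PR_p + \mu + \varepsilon_t\Big) \qquad (13)
    \end{aligned}
    $$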

    This model implies that the house price level in a municipality spills over to other municipalities. It is thus possible to assess the impact of a change in an explanatory variable, income for instance, in municipality $i$ on the house price level in municipality $i$ itself but also on house price levels in all other municipalities $j \neq i$ of the sample.

    ¹⁰ The other interaction matrices used are a 10-nearest-neighbors matrix and an inverse-distance-based matrix with a threshold, where the threshold took several values. All these matrices gave similar results.


    The matrix of partial derivatives of $P_t$ with respect to the covariate of interest, for instance income per capita, labeled $Y_t$ for notational clarity, is presented below:

    $$\Xi^{Y}_{P} \equiv \frac{\partial P_t}{\partial Y_t} = \beta_1 (I_N - \rho W)^{-1} = \beta_1 (I_N + \rho W + \rho^2 W^2 + \ldots) \qquad (14)$$
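The series form of expression (14) can be checked numerically. The sketch below, which assumes a hypothetical row-normalized ring of six units and illustrative values for the spatial parameter (`rho`) and the income coefficient (`beta_1`), compares the exact inverse with truncated sums of powers of the weight matrix.

```python
import numpy as np

def truncated_series(W, rho, order):
    """Return I + rho W + rho^2 W^2 + ... + rho^order W^order."""
    term = np.eye(W.shape[0])
    total = term.copy()
    for _ in range(order):
        term = rho * (term @ W)      # next term: rho^k W^k
        total += term
    return total

# Hypothetical row-normalized ring: every unit has two neighbours.
N = 6
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = 0.5
    W[i, (i + 1) % N] = 0.5

rho, beta_1 = 0.6, 0.2               # illustrative values, not estimates
exact = beta_1 * np.linalg.inv(np.eye(N) - rho * W)
first_order = beta_1 * truncated_series(W, rho, order=1)
high_order = beta_1 * truncated_series(W, rho, order=60)

print(np.abs(exact - first_order).max())   # visible truncation error
print(np.abs(exact - high_order).max())    # negligible truncation error
```

Each additional power of the weight matrix adds the effect transmitted through one more step of neighbors, which is why a low-order truncation understates the spillovers when the spatial parameter is large.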

    When the spatial lag of the covariate is also included in the specification, as for the surface sold, the matrix of partial derivatives takes the following form:

    $$\Xi^{S}_{P} \equiv \frac{\partial P_t}{\partial \text{surf}_t} = (I_N \beta_2 + \beta_3 W)(I_N - \rho W)^{-1} \qquad (15)$$

    where the main difference with expression (14) is the presence of the additional term $\beta_3 W$.
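As a small illustration of expression (15), the sketch below builds the derivative matrix for the surface variable on a hypothetical five-unit weight matrix, using the Mundlak point estimates of Table 6 for the spatial parameter and for the coefficients of lsurf and Wlsurf. It also checks that the order of the two factors does not matter (both are functions of the same matrix) and that, with a row-normalized weight matrix, every row sums to the sum of the two coefficients divided by one minus the spatial parameter. The function name and the weight matrix are assumptions made for the example.

```python
import numpy as np

def derivative_with_lag(W, rho, beta_own, beta_lag):
    """(I_N beta_own + beta_lag W)(I_N - rho W)^{-1}, as in expression (15)."""
    I = np.eye(W.shape[0])
    return (beta_own * I + beta_lag * W) @ np.linalg.inv(I - rho * W)

# Hypothetical row-normalized W: 5 units on a line.
N = 5
A = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
W = A / A.sum(axis=1, keepdims=True)

# Point estimates taken from the Mundlak column of Table 6.
rho, beta_2, beta_3 = 0.829, 0.077, -0.022
theta = derivative_with_lag(W, rho, beta_2, beta_3)

# Same matrix with the two factors in the other order.
I = np.eye(N)
theta_alt = np.linalg.inv(I - rho * W) @ (beta_2 * I + beta_3 * W)
row_sums = theta.sum(axis=1)
print(row_sums)
```

The negative coefficient on the spatial lag lowers the off-diagonal entries relative to expression (14), so it mainly affects the indirect impacts.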

    The diagonal elements of this partial derivative matrix contain the direct impacts, including own spillover effects, which are inherently heterogeneous in the presence of spatial autocorrelation due to differentiated weights in the $W$ matrix, whereas off-diagonal elements represent indirect impacts. Using obvious notation, we have, for the impact of income per capita on house price levels:

    $$\frac{\partial P_{t,i}}{\partial Y_{t,i}} \equiv (\Xi^{Y}_{P})_{t,ii} \quad \text{and} \quad \frac{\partial P_{t,i}}{\partial Y_{t,j}} \equiv (\Xi^{Y}_{P})_{t,ij} \qquad (16)$$

    The own spillover effects correspond to the feedbacks from municipality $j$ to $i$ when municipality $i$ affects $j$, as well as longer paths which might go from municipality $i$ to $j$ to $k$ and back to $i$. The magnitude of those direct effects depends on: (1) the degree of interaction between municipalities (governed by the $W$ matrix), (2) the parameter $\rho$ measuring the strength of spatial dependence between municipalities and (3) the estimated parameter of the covariate of interest, $\beta_1$. Note also that the magnitude of the pure feedback effects is then given by $(\Xi^{Y}_{P})_{t,ii} - \beta_1$, where $\beta_1$ could be interpreted as the direct impact of per capita income if there were no spatial autocorrelation, i.e. if $\rho$ were equal to zero.
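The heterogeneity of direct impacts and the size of the pure feedback effects can be seen on a toy example. The sketch below assumes a hypothetical line of five units with a row-normalized weight matrix and illustrative parameter values; the feedback effect is computed as the diagonal element of the derivative matrix minus the coefficient itself.

```python
import numpy as np

# Hypothetical row-normalized W: 5 units on a line (end units have one neighbour).
N = 5
A = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
W = A / A.sum(axis=1, keepdims=True)

rho, beta_1 = 0.5, 1.0                        # illustrative values, not estimates
theta = beta_1 * np.linalg.inv(np.eye(N) - rho * W)

direct = np.diag(theta)                       # direct impacts, one per unit
feedback = direct - beta_1                    # part of the direct impact due to feedback loops
print(np.round(direct, 4))
print(np.round(feedback, 4))
```

In this example the units differ in their feedback effects only because their positions in the weight matrix differ, which is the heterogeneity mentioned in the text.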

    Cumulative indirect effects can be computed in two different ways, with two complementary economic interpretations. If we want to examine how a change of a covariate in municipality $i$ will affect the house price levels in all other municipalities $j \neq i$, we sum all elements but the diagonal one of the $i$th column of the partial derivative matrix. Economically, this interpretation can be used to simulate economic policy scenarios, since it allows one to study the total diffusion over space of a shock occurring in a specified municipality. Alternatively, we can sum all elements except the diagonal one of the $i$th row of the matrix of partial derivatives. By doing so, we analyze how a change in a covariate in all municipalities $j \neq i$ will affect the house price level in municipality $i$.
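The two ways of cumulating indirect effects can be sketched as column sums and row sums of the off-diagonal part of the derivative matrix. The example below uses a hypothetical weight matrix and illustrative parameter values; the function name is an assumption made for the example.

```python
import numpy as np

def cumulative_indirect_effects(theta):
    """Return (effects from unit i on all others, effects on unit i from all others)."""
    off_diag = theta - np.diag(np.diag(theta))
    from_i = off_diag.sum(axis=0)   # column sums: shock in i, responses elsewhere
    to_i = off_diag.sum(axis=1)     # row sums: shocks elsewhere, response in i
    return from_i, to_i

# Hypothetical row-normalized W: 5 units on a line.
N = 5
A = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
W = A / A.sum(axis=1, keepdims=True)

rho, beta_1 = 0.5, 1.0              # illustrative values, not estimates
theta = beta_1 * np.linalg.inv(np.eye(N) - rho * W)
from_i, to_i = cumulative_indirect_effects(theta)
print(np.round(from_i, 4))
print(np.round(to_i, 4))
```

For a given unit the two numbers generally differ, because a row-normalized weight matrix need not be symmetric, but their totals over all units coincide since both add up the same off-diagonal elements.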


    Table 6: Estimation results for the three methods

    Dependent variable:       Random effects    Mundlak approach  Fixed effects
    log of house price (P)

    lincome                   0.293             0.183             0.149
                              (0.000)           (0.000)           (0.003)
    lsurf                     0.075             0.077             0.077
                              (0.000)           (0.000)           (0.000)
    Wlsurf                    -0.074            -0.022            -0.028
                              (0.000)           (0.136)           (0.048)
    density                   0.002             0.018             0.016
                              (0.000)           (0.001)           (0.005)
    env_cha                   0.053             0.056             -
                              (0.000)           (0.000)
    pri_fac                   0.027             0.022             -
                              (0.000)           (0.000)
    road_qua                  0.018             0.015             -
                              (0.010)           (0.036)
    ρ                         0.837             0.829             0.852
                              (0.000)           (0.000)           (0.000)
    Specification test        62.752¹           47.736²
                              (0.000)           (0.000)

    Figures in brackets are p-values. Province dummies were included in the three specifications but are not shown, since they are not of interest per se.

    ¹ This test is the traditional Hausman statistic.
    ² This test is the LR statistic mentioned above in the paper.

    We finally define the average direct effect as the average of the diagonal elements of the partial derivative matrix, namely $N^{-1}\mathrm{tr}(\Xi^{Y}_{P})$ when looking at the impact of income per capita, and the average cumulative indirect effect, which corresponds to the average of the column or row sums of this partial derivative matrix, cleaned of the diagonal elements, i.e. $(N-1)^{-1}\iota_N'(\Xi^{Y}_{P} - \mathrm{diag}(\Xi^{Y}_{P}))\iota_N$, where $\iota_N$ denotes the $N$-dimensional vector of ones.¹¹
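The two summary measures can be computed in a few lines. The sketch below follows the normalizations as written in the text (the trace divided by the number of units for the direct effect, and the sum of off-diagonal elements divided by the number of units minus one for the indirect effect); the weight matrix and parameter values are hypothetical, and the function name is an assumption.

```python
import numpy as np

def average_effects(theta):
    """Average direct and average cumulative indirect effect of a derivative matrix."""
    N = theta.shape[0]
    iota = np.ones(N)
    direct = np.trace(theta) / N
    off_diag = theta - np.diag(np.diag(theta))
    indirect = iota @ off_diag @ iota / (N - 1)
    return direct, indirect

# Hypothetical row-normalized ring of 6 units.
N = 6
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = 0.5
    W[i, (i + 1) % N] = 0.5

rho, beta_1 = 0.6, 0.2              # illustrative values, not estimates
theta = beta_1 * np.linalg.inv(np.eye(N) - rho * W)
direct, indirect = average_effects(theta)
print(direct, indirect)
```

With a large number of units, dividing the off-diagonal sum by the number of units or by that number minus one makes almost no difference to the reported indirect effect.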

    ¹¹ LeSage & Pace (2009, chap. 2) present a comprehensive analysis of those effects along with some useful summary measures in the cross-section setting. The extension to static panel data models is easily done.

    Results of the estimation of equation (12) by random effects, Mundlak approach and fixed effects specifications are summarized in Table 6. We observe that the Hausman test is significant, indicating that the fixed effects estimation should be preferred to the traditional random effects model, since the assumption of independence between regressors and individual effects seems to be violated. We thus apply the Mundlak approach and add variables to capture this correlation. These additional variables are the averages over time of the time-varying covariates as well as their spatial lags. The LR test at the bottom of column 3 of Table 6 indicates that these additional variables capture at least part of this correlation, since the null of non-significance is strongly rejected.
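The auxiliary variables described above can be built directly from the panel of time-varying covariates. The sketch below is only an illustration of that construction: the array layout (periods, units, covariates), the function names and the toy inputs are assumptions, and the log-likelihood values passed to the LR helper are arbitrary placeholders rather than results from the paper.

```python
import numpy as np

def mundlak_terms(X, W):
    """Time averages of the covariates and their spatial lags.

    X has shape (T, N, K). The result has shape (T, N, 2K): the same
    (N, 2K) block [X_bar, W X_bar] repeated for every period.
    """
    T = X.shape[0]
    x_bar = X.mean(axis=0)                 # (N, K) averages over time
    wx_bar = W @ x_bar                     # (N, K) spatial lags of the averages
    aux = np.hstack([x_bar, wx_bar])       # (N, 2K)
    return np.broadcast_to(aux, (T,) + aux.shape).copy()

def lr_statistic(loglik_unrestricted, loglik_restricted):
    """Generic likelihood ratio statistic, 2 (l_u - l_r)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)

# Hypothetical toy inputs.
rng = np.random.default_rng(1)
T, N, K = 4, 5, 3
A = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
W = A / A.sum(axis=1, keepdims=True)
X = rng.normal(size=(T, N, K))

aux = mundlak_terms(X, W)
lr = lr_statistic(-95.0, -100.0)           # placeholder log-likelihoods
print(aux.shape, lr)
```

In the LR comparison, the unrestricted model is the random effects specification augmented with these terms and the restricted model is the one without them, so the number of restrictions equals the number of added columns.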

    To discuss estimation results, we rely on impacts computed from the matrix of partial derivatives of the dependent variable with respect to each of the explanatory variables. The average direct and total indirect impacts for each covariate and for the three estimation techniques are reported in Table 7. We also report 99% confidence intervals for these impacts constructed from 10000 Monte Carlo draws.¹²

    ¹² The interested reader may consult LeSage & Pace (2009, chap. 5) for further details.
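One common way to obtain such simulated intervals is to draw parameter values from a normal distribution centred on the estimates and to recompute the impacts for every draw. The sketch below illustrates that idea only; it is not necessarily the exact procedure used for Table 7. The weight matrix and the covariance matrix are hypothetical, the point estimates are the Mundlak values for the spatial parameter and lincome from Table 6, and fewer draws than in the paper are used to keep the example fast.

```python
import numpy as np

def average_direct_indirect(W, rho, beta):
    """Average direct and indirect impacts of a covariate without spatial lag."""
    N = W.shape[0]
    theta = beta * np.linalg.inv(np.eye(N) - rho * W)
    direct = np.trace(theta) / N
    indirect = (theta.sum() - np.trace(theta)) / (N - 1)
    return direct, indirect

def simulate_impact_intervals(W, estimate, cov, draws, level, rng):
    """Percentile intervals for (direct, indirect) from simulated parameters.

    estimate = (rho_hat, beta_hat); cov is the 2x2 covariance of the estimates.
    """
    samples = rng.multivariate_normal(estimate, cov, size=draws)
    samples = samples[np.abs(samples[:, 0]) < 1.0]    # keep stable draws of rho
    impacts = np.array([average_direct_indirect(W, r, b) for r, b in samples])
    tail = 100.0 * (1.0 - level) / 2.0
    lower = np.percentile(impacts, tail, axis=0)
    upper = np.percentile(impacts, 100.0 - tail, axis=0)
    return lower, upper

# Hypothetical row-normalized ring of 10 units.
N = 10
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = 0.5
    W[i, (i + 1) % N] = 0.5

estimate = np.array([0.829, 0.183])                   # rho and lincome, Table 6 (Mundlak)
cov = np.diag([0.02 ** 2, 0.03 ** 2])                 # hypothetical covariance matrix
rng = np.random.default_rng(42)
lower, upper = simulate_impact_intervals(W, estimate, cov, 2000, 0.99, rng)
print(lower, upper)
```

Because the impacts are nonlinear functions of the spatial parameter, the simulated intervals need not be symmetric around the point estimates.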


    Table 7: Impacts computation for the three estimation methods

    Dependent variable:       Random effects                      Mundlak approach                    Fixed effects
    log of house price (P)    Average direct    Average indirect  Average direct    Average indirect  Average direct    Average indirect

    lincome                   0.371             1.426             0.23              0.837             0.191             0.81
                              (0.263) (0.471)   (0.989) (1.995)   (0.093) (0.364)   (0.355) (1.379)   (0.056) (0.327)   (0.241) (1.481)
    lsurf                     0.071             -0.065            0.091             0.236             0.089             0.238
                              (0.053) (0.09)    (-0.201) (0.064)  (0.076) (0.119)   (0.031) (0.442)   (0.066) (0.113)   (0.01) (0.485)
    density                   0.003             0.26              0.023             0.084             0.02              0.086
                              (0.002) (0.004)   (0.188) (0.350)   (0.005) (0.041)   (0.017) (0.152)   (0.002) (0.038)   (0.01) (0.169)
    env_cha                   0.067             0.258             0.069             0.253             -                 -
                              (0.051) (0.083)   (0.187) (0.347)   (0.051) (0.087)   (0.18) (0.352)
    pri_fac                   0.034             0.13              0.028             0.102             -                 -
                              (0.018) (0.051)   (0.071) (0.208)   (0.01) (0.046)    (0.041) (0.178)
    road_qua                  0.023             0.087             0.018             0.067             -                 -
                              (-0.001) (0.045)  (-0.002) (0.178)  (-0.005) (0.04)   (-0.017) (0.155)

    Figures in brackets are the lower and upper bounds of a 99% confidence interval constructed using 10000 MC draws.

    Average direct and total indirect impacts under the random effects specification (collected in the first two columns of Table 7) differ from those obtained with the two other estimation procedures, reinforcing the result of the Hausman statistic. For instance, the direct elasticity of house price with respect to income per capita is 0.371, compared to 0.23 and 0.191 in the Mundlak and fixed effects specifications. For the density variable, the direct impact under the random effects specification is roughly seven to eight times smaller than under the two other specifications (0.003 against 0.023 and 0.02). These results confirm that random effects estimators are not reliable, and we therefore focus on the interpretation of impacts in the Mundlak specification.

    Impacts computed in the Mundlak and fixed effects approaches are very similar, which implies that the correlation between regressors and individual effects is well captured by the auxiliary controls. The average direct elasticity of house price with respect to income per capita is positive and significant, supporting the results obtained by Fingleton (2008, 2010). Hence, an increase of 10% in the income per capita of a municipality will increase the house price level in this municipality by 2.3%. The direct elasticity with respect to the surface sold is also positive and significant, a result consistent with the economic intuition that the larger the house sold, the higher the price. Results also indicate that density has a positive impact on house price levels. This effect can be explained by competition effects, reflecting a disequilibrium between demand and supply of housing goods. Finally, direct impacts for two of the three factors measuring the attractiveness of a municipality are significant. The environment's charm and the presence of private facilities have a positive impact on the price of houses sold, whereas the quality of roads (including cycle tracks and pavements) does not significantly affect the price level. House buyers are thus more affected by the quality of the environment, which contributes to their feeling of well-being, and by the quality and availability of services present in the municipality (avoiding frequent trips to a city) than by the quality of roads, which can be viewed as a purely technical detail of the municipality.

    Average total indirect impacts are also all positive and significant, except for the last factor (road_qua). For instance, increasing income per capita by 10% in a municipality will cause house price levels in all other municipalities to increase, on average, by about 8.4% (0.837 × 10%). Using the row interpretation, we would say that increasing the density by one unit in all municipalities except $i$ will cause the price of houses sold in $i$ to increase by around 8.4%.
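As a rough consistency check between Tables 6 and 7 (a sketch, writing $\rho$ for the spatial autoregressive parameter and $\iota_N$ for the vector of ones), note that a row-normalized $W$ satisfies $W\iota_N = \iota_N$, so that $(I_N - \rho W)^{-1}\iota_N = \iota_N/(1-\rho)$. Every row of the derivative matrix of a covariate without spatial lag therefore sums to its coefficient divided by $1-\rho$, and the average direct and indirect impacts should add up to approximately that value, up to rounding of the reported estimates and the exact normalization of the indirect effect:

$$
\begin{aligned}
\text{lincome, random effects:} \quad & \frac{0.293}{1-0.837} \approx 1.798, & 0.371 + 1.426 &= 1.797 \\
\text{lincome, Mundlak:} \quad & \frac{0.183}{1-0.829} \approx 1.070, & 0.23 + 0.837 &= 1.067 \\
\text{lincome, fixed effects:} \quad & \frac{0.149}{1-0.852} \approx 1.007, & 0.191 + 0.81 &= 1.001 \\
\text{lsurf, Mundlak:} \quad & \frac{0.077 - 0.022}{1-0.829} \approx 0.322, & 0.091 + 0.236 &= 0.327
\end{aligned}
$$

For lsurf the numerator adds the coefficient of the spatial lag Wlsurf, in line with expression (15).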

    5 Conclusion

    This paper extends the Mundlak approach to the spatial Durbin model and proposes an LR test to assess the relevance of the random effects specification in this framework. The Monte Carlo experiments indicate that in very small samples, the size of the LR test behaves much better than that of the Hausman test. Even when the Hausman or the LR test indicates a violation of the independence between regressors and individual effects, the Mundlak specification still permits the estimation of the effects of time-constant variables while accounting for the endogeneity problem. Naturally, the extent to which this correlation is captured depends on the functional form chosen for the auxiliary regression. To illustrate the usefulness of the Mundlak approach in spatial models, we estimate a house price regression for 588 Belgian municipalities, where some of the explanatory variables measure the attractiveness of these municipalities and are time-invariant. We first regress the house price level on the set of determinants by random effects and find unreliable estimators, since potential endogeneity is ignored. However, applying the Mundlak approach captures this correlation between regressors and individual effects and provides estimators (for time-varying variables) similar to those obtained by fixed effects. Since fixed effects estimators are not affected at all by this correlation, obtaining similar estimators with the Mundlak approach indicates that this methodology works quite well. In this house price model, we conclude that the impacts (both direct and indirect) of the charm of the environment and of the quality and availability of private services on house price levels are positive and significant, while the impact of the quality of roads is not significant. We also observe positive and significant direct and indirect elasticities of house price with respect to income per capita, surface sold and density.

    References

    Anselin, L. (1988), Spatial Econometrics: Methods and Models, Kluwer Academic Publishers, London, England.

    Baltagi, B. H., Egger, P. & Pfaffermayr, M. (2009), A generalized spatial panel data model with random effects, Working paper.

    Baltagi, B. H., Song, S. H., Jung, B. C. & Koh, W. (2007), Testing for serial correlation, spatial autocorrelation and random effects using panel data, Journal of Econometrics 140, 5-51.

    Baltagi, B. H., Song, S. H. & Koh, W. (2003), Testing panel data regression models with spatial error correlation, Journal of Econometrics 117, 123-150.

    Baltagi, B. H., Song, S. H. & Kwon, J. H. (2009b), Testing for heteroskedasticity and spatial correlation in a random effects panel data model, Computational Statistics and Data Analysis 53, 2897-2922.


    Beron, K., Hanson, Y., Murdoch, J. & Thayer, M. (2004), Hedonic price functions and spatial dependence: implications for the demand for urban air quality, in L. Anselin, R. Florax & S. Rey, eds, Advances in spatial econometrics: methodology, tools and applications, Springer, chapter 12, pp. 267-316.

    Can, A. (1990), The measurement of neighborhood dynamics in urban house prices, Economic Geography 66, 254-272.

    Can, A. (1992), Specification and estimation of hedonic house price models, Regional Science and Urban Economics 22, 453-474.

    Can, A. & Megbolugbe, I. (1997), Spatial dependence and house price index construction, The Journal of Real Estate Finance and Economics 14, 203-222.

    Denil, F., Mignolet, M. & Mulquin, M.-E. (2004), Interregional differences in taxes and population mobility, Working paper, University of Namur.

    Dubin, R., Pace, K. R. & Thibodeau, T. (1999), Spatial autoregression techniques for real estate data, Journal of Real Estate Literature 7, 79-95.

    Elhorst, P. J. (2003), Specification and estimation of spatial panel data models, International Regional Science Review 26, 244-268.

    Elhorst, P. J. (2010), Spatial panel data models, in M. Fischer & A. Getis, eds, Handbook of Applied Spatial Analysis, Springer-Verlag, Berlin.

    Ertur, C. & Koch, W. (2007), Growth, technological interdependence and spatial externalities: Theory and evidence, Journal of Applied Econometrics 22, 1033-1062.

    Ertur, C. & Koch, W. (2011), A contribution to the Schumpeterian growth theory and empirics, Journal of Economic Growth, forthcoming.

    Fingleton, B. (2008), Housing supply, housing demand, and affordability, Urban Studies 45, 1545-1563.

    Fingleton, B. (2010), Predicting the geography of house price, Technical report, London School of Economics.

    Hausman, J. (1978), Specification tests in econometrics, Econometrica 46, 1251-1271.

    Hausman, J. A. & Taylor, W. E. (1981), Panel data and unobservable individual effects, Econometrica 49, 1377-1398.

    Kapoor, M., Kelejian, H. H. & Prucha, I. R. (2007), Panel data models with spatially correlated error components, Journal of Econometrics 140, 97-130.


    Kelejian, H. H. & Prucha, I. R. (2010), Specification and estimation of spatial autoregressive models with autoregressive and heteroskedastic disturbances, Journal of Econometrics 157, 53-67.

    Lee, L.-F. & Yu, J. (2010a), Estimation of spatial autoregressive panel data models with fixed effects, Journal of Econometrics 154, 165-185.

    Lee, L.-F. & Yu, J. (2010b), Some recent developments in spatial panel data models, Regional Science and Urban Economics 40, 255-271.

    Lee, L.-F. & Yu, J. (2010c), A spatial dynamic panel data model with both time and individual fixed effects, Econometric Theory 26, 564-597.

    Lee, L.-F. & Yu, J. (2010d), Spatial panels: random components vs. fixed effects, Working paper.

    LeSage, J. & Pace, K. R. (2009), Introduction to Spatial Econometrics, CRC Press Taylor and Francis Group, New York.

    Mundlak, Y. (1978), On the pooling of time series and cross section data, Econometrica 46, 69-85.

    Mutl, J. & Pfaffermayr, M. (2011), The Hausman test in a Cliff and Ord panel model, Econometrics Journal 14, 48-76.

    Osland, L. (2010), An application of spatial econometrics in relation to hedonic house price modeling, The Journal of Real Estate Research 32, 289-320.

    Pfaffermayr, M. (2009), Conditional beta and sigma convergence in space: A maximum likelihood approach, Regional Science and Urban Economics 39, 63-78.

    Rosen, S. (1974), Hedonic prices and implicit markets: product differentiation in pure competition, Journal of Political Economy 82, 34-55.

    Tse, R. Y. C. (2002), Estimating neighborhood effects in house prices: towards a new hedonic model approach, Urban Studies 39, 1165-1180.
