
STATISTICAL METHOD FOR ESTIMATING LAND VALUE USING A GEOGRAPHICALLY WEIGHTED REGRESSION MODEL

Case Study in Bogota, Colombia

Luis Francisco Pirabán Cruz May 2013

Abstract
In the last 10 years significant changes have occurred in land uses in Bogota. These changes have been driven primarily by population growth of 1.1 million over the same period. Parallel to this population growth, Bogota has developed Transmilenio, a high-capacity BRT, and more recently conducted a complete reorganization of all local bus routes under 13 operational zones. All these changes have had a significant impact not only on land uses but also on land values. Until now, few studies have been conducted to understand which factors, and in what proportion, have been most influential on land values across the city. This paper analyzes factors that affect land values in Bogotá using a multiple linear regression by ordinary least squares (OLS) and then a geospatial analysis of the explanatory variables using geographically weighted regression (GWR). The OLS model explained 13% of the variance in land value in Bogota, while the further spatial regression using GWR explained 60% of the variance. The results of these models showed that four explanatory variables have the greatest significance: distance to Transmilenio stations, proximity to main roads, proximity to secondary roads and population density. The geospatial results of the model, in which the geographic distribution of the weight of each variable can be seen, show an imaginary line that divides the city between north and south. It was also found that the model predicts the conditions that generate land value changes in Bogota. In conclusion, the OLS analysis followed by a GWR, with a final prediction level of over 60%, appears to be a valid alternative for identifying key factors influencing land values in Bogota. Additionally, the geospatial analysis was consistent with the historical conditions of development in Bogota. Future research on data processing and variable identification is needed in order to understand at a smaller scale how land values behave.


Introduction
Calculating the price of land in Bogotá has been a complex issue (Muñoz Raskin, 2010), primarily because of the difficulty of interpreting and assigning the variables that determine the value of a property in the city (Hongbo & Corinne, 2012). In the last ten years significant changes have occurred in land uses in Bogota, mainly due to a population increase of over 1.1 million in that period (Departamento Administrativo Nacional de Estadísticas, 2010). Parallel to this population growth, in the last fifteen years Bogota developed a high-capacity BRT called Transmilenio and more recently conducted a complete reorganization of all bus routes under thirteen operational zones (Mendieta & Perdomo, 2007). All of these changes have had a significant impact not only on land uses but also on land values. To date, few studies have sought to understand what factors explain land values in Bogota and in what proportion. This paper aims to analyze and explain the factors that affect land values in Bogota, first using a multiple linear regression by ordinary least squares (OLS) to identify the most relevant variables and then a geospatial analysis of those explanatory variables using geographically weighted regression (GWR).

Background
Today there is a lack of knowledge about how entities such as the local cadaster system (Catastro Bogota¹) determine land value in Bogota. Additionally, there are no accurate databases or systems on land values that have identified the physical, social or economic characteristics explaining these values. To our knowledge, even fewer studies have used geographic information system (GIS) tools to calculate land values in Bogota. Existing studies in Colombia have focused mainly on the relationship between air quality in Bogota and land prices (Carriazo, et al., 2013). Other studies, outside urban areas, have examined the relationship between illicit coca crops and land values (Rincón Ruiz, et al., 2013). These used geospatial analysis methods to identify the major factors and the variances of the explanatory variables. A more recent study on the impact of Transmilenio on land use, land value and density (Bocarejo, Portilla, & Perez, 2013) explained why Transmilenio has had a strong impact on urban densification processes around Transmilenio stations. Moreover, the horizontal expansion of the city has meant that the coverage of the Transmilenio system is not optimal.

¹ Catastro is the entity that maintains the property register and provides access to geographical and spatial information to support decision-making in Bogotá.

Today geographic information systems (GIS) are being used to determine land values in the real estate sector. In this sector, attention has focused on the hedonic pricing model, which reflects the functional relationships between land prices and external factors such as proximity to main roads and population density (Lochl & Axhausen, 2010). Additionally, using GWR to validate the data and make the appropriate comparisons of land values has been shown to provide insights into hedonic pricing models (Duque, et al., 2011).

Location is essential for determining housing prices. Controlling for location and the spatial structure of markets is essential to explaining price differentials and deriving accurate coefficient estimates in hedonic residential price models (Hongyan, et al., 2011). One common way to incorporate location information in hedonic models is to introduce distance to the central business district (CBD) (Páez Barajas & Currie, 2012) or sub-market indicators, including regional, local, or neighborhood-specific binary dummy variables or interaction terms, into the regression equation (Nichols, et al., 2013). However, previous studies have revealed that including them does not necessarily account for all of the spatial effects. In the analysis of land values, two types of spatial effects have been identified: spatial dependence and spatial heterogeneity (Lochl & Axhausen, 2010; Noresah & Ruslan, 2009).
Spatial heterogeneity (or spatial non-stationarity) may be present when the effects of space are not uniform or the spatial units of observation are not homogeneous; in such cases methods other than OLS estimate the variables more accurately (Leung, Mei, & Zhang, 2000). Spatial heteroscedasticity or spatially varying parameters may therefore be present in this research, and the proposed methodology is expected to address them. Housing markets frequently involve both spatial dependence and spatial heterogeneity due to localized supply and demand inequities. Lochl & Axhausen (2010) list three reasons why spatial dependence and heterogeneity should be considered jointly. Ordinary least squares regression has been used in Bogota to explain the relationship between urban form (geometric distribution) and the arrangement of the Transmilenio bus stations, which depends on how the routes are laid out in the city (Estupiñan & Rodriguez, 2008). Results from that study showed that residential properties located on a Transmilenio trunk line tend to decrease in value out to about 3 blocks, whereas commercial properties increase in value the closer they are to the bus stations.


Ordinary Least Squares Regression
The first step is a linear regression by ordinary least squares. This regression model generates a prediction for the dependent variable, which in this case is the price of land in Bogota, in terms of its relationships with a set of explanatory variables.

Equation 1 Ordinary Least Squares

$$y = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_n x_n + e$$

Where:
y = dependent variable to predict
a_n = coefficients (a_0 is the intercept)
x_n = explanatory variables
e = random error term, or difference between the model and the observations

The coefficients are calculated using the regression tool. They are values, one for each explanatory variable, representing the strength and type of relationship between the explanatory variable and the dependent variable (Chapra & Canale, 2007). This type of regression is used to evaluate the relationships between two or more attributes. Identifying and measuring these relationships helps to better understand what drives the price of land (Estupiñan & Rodriguez, 2008). It also allows the causes of the observed heterogeneity to be examined using explanatory variables. OLS is a necessary starting point for all spatial regression analysis.
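As an illustration of Equation 1, the following minimal sketch estimates the coefficients with NumPy. It is not the procedure used in this study (which relied on a spreadsheet regression tool); the function name `fit_ols` and the demonstration data are hypothetical and were chosen only so that the known coefficients can be recovered.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y = a_0 + a_1*x_1 + ... + a_n*x_n + e by ordinary least squares.

    X: (m, n) array of explanatory variables; y: (m,) dependent variable.
    Returns the coefficient vector [a_0, a_1, ..., a_n].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that a_0 acts as the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

# Hypothetical, noise-free data: two explanatory variables, five zones.
X_demo = np.array([[100.0, 5.0], [250.0, 3.0], [400.0, 8.0],
                   [550.0, 2.0], [700.0, 6.0]])
y_demo = 600000.0 - 40.0 * X_demo[:, 0] - 5.0 * X_demo[:, 1]
coeffs_demo = fit_ols(X_demo, y_demo)  # intercept first, then a_1 and a_2
```

Because the demonstration data contain no error term, the fitted coefficients coincide with the values used to generate them; with real data the residual e would be non-zero.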

Figure 1 OLS Relationship between observed values and predicted values (ArcGIS, 2013)

R squared: Multiple R-squared and adjusted R-squared are statistics derived from the regression equation to quantify the performance of the model. The R-squared value ranges from 0 to 100 percent (ArcGIS, 2013). This method is effective and reliable only if the data and the regression model satisfy all the assumptions inherently required by the method (ArcGIS, 2013). Spatial data often violate these assumptions and requirements. The resulting problems can be:

• Omitted explanatory variables.
• Nonlinear relationships.
• Data outliers.
• Non-stationarity: a variable has explanatory significance in certain regions but is insignificant in others (Leung, et al., 2000).
• Correlation between variables: the combination of two or more explanatory variables in the model is redundant.
• Inconsistent residual variance: the model predicts small values of the variable well but becomes unreliable for large values (Carriazo, et al., 2013).

The OLS regression results are reliable only if the data and the regression model satisfy all the assumptions mentioned above, which the model inherently requires.

Geographically Weighted Regression
Geographically weighted regression (GWR) is a local linear regression used to model relationships that vary spatially (Fotheringham, et al., 2002). This type of regression builds a separate equation for each feature in the dataset (in this research, the transport zones of Bogotá), incorporating the dependent and explanatory variables of the features that fall within the bandwidth of each target feature. Dummy variables representing different spatial regimes, which may be used in the preceding OLS regression, are excluded from the GWR model, because including them creates local multicollinearity problems (the variables become redundant). The next figure shows an example of the outcome of a GWR.

Figure 2 Geographically Weighted Regression (ArcGIS, 2013)

The spatial context (the Gaussian kernel) is a function of a specified number of neighbors. Where the distribution of features is dense, the spatial context is smaller; where the distribution of features is sparse, the spatial context is larger. In addition, the data must always be projected (ArcGIS, 2013).
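The idea of one local equation per feature with a neighbor-based Gaussian kernel can be sketched as follows. This is a generic, simplified formulation and not the ArcGIS implementation used in this study; the function name `gwr_local_coefficients`, the parameter `n_neighbors` and the demonstration grid are hypothetical, and the sketch assumes projected coordinates with no duplicate locations.

```python
import numpy as np

def gwr_local_coefficients(coords, X, y, n_neighbors):
    """Fit one weighted least-squares regression per feature.

    coords: (m, 2) projected x/y coordinates; X: (m, n) explanatory
    variables; y: (m,) dependent variable. Returns an (m, n + 1) array
    whose row i holds the local [intercept, a_1, ..., a_n] at feature i.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    local = np.empty_like(design)
    for i in range(len(y)):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        # Adaptive bandwidth: distance to the n_neighbors-th nearest feature,
        # so the kernel shrinks where features are dense and grows where sparse.
        bandwidth = np.sort(dist)[n_neighbors]
        weights = np.exp(-0.5 * (dist / bandwidth) ** 2)  # Gaussian kernel
        # Weighted least squares via row scaling by sqrt(weight).
        root_w = np.sqrt(weights)
        local[i], *_ = np.linalg.lstsq(design * root_w[:, None],
                                       y * root_w, rcond=None)
    return local

# Hypothetical 5 x 5 grid of features with an exactly linear relationship.
gx, gy = np.meshgrid(np.arange(5.0), np.arange(5.0))
coords_demo = np.column_stack([gx.ravel(), gy.ravel()])
x_demo = (coords_demo[:, 0] * 3.0 + coords_demo[:, 1] ** 2).reshape(-1, 1)
y_demo = 10.0 + 2.0 * x_demo[:, 0]
local_demo = gwr_local_coefficients(coords_demo, x_demo, y_demo, n_neighbors=8)
```

When the underlying relationship is the same everywhere, as in this demonstration, every local equation returns the same coefficients; spatially varying coefficients appear only when the relationship itself changes across space.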

Methodology
The methodology used to develop the model is divided into two parts. The first is a linear regression using ordinary least squares (OLS), performed in Microsoft Excel. The second is a geographically weighted regression (GWR), which ultimately improved the level of prediction (R2 coefficient). The methodology is used to examine and explore spatial relationships and spatial heteroscedasticity, as they may help explain the factors behind the observed spatial variables. When modeling spatial relationships, regression analysis can also be used for prediction (not applied in this thesis).

Figure 3 Spatial Analysis Methodology

• Variable calculation: calculate all the values required for the model.
• OLS regression model: combinations of the independent variables were processed using a trial-and-error methodology.
• GWR model: the most significant variables are included in the GWR model using ArcGIS 10.1.
• Results and spatial analysis: analyze the relationship between the explanatory variables and the dependent variable.

Regarding the model for predicting land prices in Bogota, the OLS regression required several trials, because in many attempts the R-squared coefficient and the statistical validation procedures did not meet the needs of the model. Variables other than those in the final iteration of the model were also considered. The variables discarded from the final results were:

• Distance to police department
• Distance to parks
• Variability in the stratum (Entropy Index) (Kanaroglou, 2008)
• Other land uses besides residential land use, e.g. industrial land use and unclassified land use
• Other explanatory variables

Because the OLS regression yielded no significant R2 value (see Table 2) with the transport areas base map, a new base map was adopted after more than 12 trial-and-error attempts. In addition, the transport areas base map did not allow a GWR model to be run in ArcGIS. The neighborhoods base map made it possible to predict the dependent variable with greater certainty and to observe the variables that were most relevant to the model (see Table 4) after about 5 trial-and-error attempts.

Results: Case study in Bogota

Explanatory Variables

The model was developed with variables provided by different sources. Table 1 shows the variables used and their source.

Table 1 Explanatory Variables

| Variable | Description | GIS Calculation |
|---|---|---|
| Transmilenio | Distance between the center of each polygon and the nearest Transmilenio station | Euclidean distance and zonal statistics (statistics on the values of a raster within the zones of another dataset) |
| Principal Roads | Distance between the center of each polygon and the nearest principal road | Euclidean distance and zonal statistics (statistics on the values of a raster within the zones of another dataset) |
| Secondary Roads | Distance between the center of each polygon and the nearest secondary road | Euclidean distance and zonal statistics (statistics on the values of a raster within the zones of another dataset) |
| Distance to Bus Stop | Linear distance to the nearest bus stop | Distance calculated from the center of each polygon |
| Density | Population density in Bogotá zones | Zonal statistics (statistics on the values of a raster within the zones of another dataset) |
| Security Index | How dangerous the area is in relation to violent crime | Not required |
| Police Station | Distance between the center of each polygon and the nearest police station | Shortest distance from polygon to line using kernel distance |
| Parks in Bogota | Percentage of green zone in each polygon | Kernel distance |
| Stratum | Stratum associated with each zone | Sum of the weighted average of stratum per zone |
| Variability of the Stratum | How much the stratum varies within each area | Sum of natural logarithm |
| Other land use | Percentage of industrial and unclassified land use | Not required |

* All of these datasets were provided by the Study Group in Urban and Rural Sustainability (SUR) of the Universidad de Los Andes (Grupo de Sostenibilidad & Universidad de los Andes, 2013).

With this method it was possible to run a series of tests to see which variables were more significant than others in predicting the dependent variable and, with this information, to proceed with the GWR.
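The distance variables in Table 1 were derived with GIS raster tools. As a rough point-based approximation of the same idea, the sketch below computes the Euclidean distance from each polygon center to its nearest station. The function name `nearest_distance` and all coordinates are hypothetical, and the sketch assumes projected coordinates in metres; it does not reproduce the zonal-statistics step.

```python
import numpy as np

def nearest_distance(centroids, stations):
    """Euclidean distance from each centroid to its nearest station.

    Both inputs are (k, 2) arrays of projected x/y coordinates in metres.
    """
    centroids = np.asarray(centroids, dtype=float)
    stations = np.asarray(stations, dtype=float)
    # Pairwise differences: shape (n_centroids, n_stations, 2).
    diff = centroids[:, None, :] - stations[None, :, :]
    return np.linalg.norm(diff, axis=2).min(axis=1)

# Hypothetical coordinates, for illustration only.
centroids_demo = np.array([[0.0, 0.0], [300.0, 400.0], [1000.0, 0.0]])
stations_demo = np.array([[0.0, 500.0], [600.0, 0.0]])
dist_demo = nearest_distance(centroids_demo, stations_demo)
```

The same function could be applied to points sampled along principal or secondary roads, under the assumption that the road lines are first converted to sufficiently dense points.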


First, several trial-and-error runs of the OLS regression were performed to identify the significant variables in the model. The results for the best combination of variables are shown below.

Table 2 Results of Multiple Regression Analysis with Transport Areas Base Map

| Regression Statistics | Value |
|---|---|
| Multiple R | 0.30469284 |
| R Square | 0.09283773 |
| Adjusted R Square | 0.08918473 |
| Standard Error | 430893.246 |
| Observations | 749 |

| Variable | Coefficients | Standard Error | t Statistic | P-Value |
|---|---|---|---|---|
| Intercept | 648178.2184 | 34324.44918 | 18.8838637 | 2.7145E-65 |
| Mean Distance Principal roads | -205.2647229 | 49.740906 | -4.12667841 | 4.0958E-05 |
| Mean Distance Transmilenio | -57.9086563 | 14.41880065 | -4.01619092 | 6.5142E-05 |
| Density | -5.346077437 | 1.129098472 | -4.73481948 | 2.6247E-06 |

This regression showed that these three explanatory variables are highly significant for the model. Next, a regression that included all variables was run (Table 3), taking a t statistic of 1.8 as the reference for identifying significant explanatory variables (Mendieta & Perdomo, 2007). This made it possible to see which variables were the most significant and whether there was collinearity among the previously calculated variables (Table 2).

Table 3 Results of Multiple Regression Analysis with All Explanatory Variables

| Regression Statistics | Value |
|---|---|
| Multiple R | 0.35082218 |
| R Square | 0.1230762 |
| Adjusted R Square | 0.11432883 |
| Standard Error | 411715.201 |
| Observations | 811 |

| Variable | Coefficients | Standard Error | t Statistic | P-Value |
|---|---|---|---|---|
| Intercept | 634316.193 | 51117.33676 | 12.4090227 | 1.8059E-32 |
| Mean Distance Transmilenio | -30.3464503 | 14.76369502 | -2.055478 | 0.04015615 |
| Mean Distance Principal roads | -115.172799 | 59.05164411 | -1.95037413 | 0.05147956 |
| Density | -7.18605092 | 1.081688518 | -6.64336434 | 5.6558E-11 |
| Mean Distance SITP | -199.334336 | 92.94263984 | -2.14470276 | 0.03227646 |
| Other Land Use | -2158.35291 | 693.9153926 | -3.1103978 | 0.00193447 |
| Variability Stratum | 39922.3178 | 27798.03648 | 1.43615603 | 0.15134788 |
| Violent Deaths (Security Index) | -1097.70193 | 674.4393276 | -1.62757698 | 0.10400738 |
| Mean Distance to Police Station | 2185.6559 | 1117.983625 | 1.95499814 | 0.0509303 |
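The t statistics in Table 3 are the ratio of each coefficient to its standard error, which the short check below reproduces. The coefficient and standard error values are copied from Table 3; applying the 1.8 reference as a cutoff on the absolute value of t is an assumption about how the threshold was used, and the variable names are illustrative.

```python
# Coefficient and standard error pairs copied from Table 3.
table3 = {
    "Mean Distance Transmilenio": (-30.3464503, 14.76369502),
    "Mean Distance Principal roads": (-115.172799, 59.05164411),
    "Density": (-7.18605092, 1.081688518),
    "Mean Distance SITP": (-199.334336, 92.94263984),
    "Other Land Use": (-2158.35291, 693.9153926),
    "Variability Stratum": (39922.3178, 27798.03648),
    "Violent Deaths (Security Index)": (-1097.70193, 674.4393276),
    "Mean Distance to Police Station": (2185.6559, 1117.983625),
}
T_THRESHOLD = 1.8  # reference value quoted in the text (assumed |t| cutoff)

# t statistic = coefficient / standard error.
t_stats = {name: coef / se for name, (coef, se) in table3.items()}
significant = [name for name, t in t_stats.items() if abs(t) >= T_THRESHOLD]
```

Under this reading, Variability Stratum and Violent Deaths (Security Index) fall below the reference value, while the remaining six variables meet it.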

It was concluded that there is redundancy between the variables measuring the average distance to roads and to bus stations. Although the model fit the actual data better, the fit was still not good enough for the program to run a GWR. For this reason it was decided to replace the transport areas base map with the neighborhoods base map, which gives a more precise picture of the relationship between the dependent variable and the explanatory variables.

Table 4 Results of Multiple Regression Analysis with Neighborhoods Base Map

| Regression Statistics | Value |
|---|---|
| Multiple R | 0.3765776 |
| R Square | 0.14181069 |
| Adjusted R Square | 0.13743217 |
| Standard Error | 306875.034 |
| Observations | 789 |

| Variable | Coefficients | Standard Error | t Statistic | P-Value |
|---|---|---|---|---|
| Intercept | 585778.836 | 23586.87372 | 24.83495029 | 6.723E-101 |
| Mean Distance Transmilenio | -39.3693116 | 9.671489766 | -4.070656392 | 5.1635E-05 |
| Mean Distance Principal roads | -139.938635 | 29.28110076 | -4.779145291 | 2.1019E-06 |
| Mean Distance Secondary roads | -121.845235 | 127.9967559 | -0.951940021 | 0.34142078 |
| Density | -4.54650615 | 0.650496842 | -6.98928243 | 5.9121E-12 |
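The adjusted R-squared values reported in Tables 2 to 4 can be checked against the standard formula for a regression with an intercept, 1 - (1 - R2)(n - 1)/(n - k - 1), where n is the number of observations and k the number of explanatory variables. The sketch below applies it to the reported figures; the function name `adjusted_r2` is illustrative.

```python
def adjusted_r2(r2, n_obs, n_vars):
    """Adjusted R-squared for an OLS model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_vars - 1)

# (R square, observations, explanatory variables) from Tables 2, 3 and 4.
reported = {
    "Table 2": (0.09283773, 749, 3),
    "Table 3": (0.1230762, 811, 8),
    "Table 4": (0.14181069, 789, 4),
}
recomputed = {name: adjusted_r2(*args) for name, args in reported.items()}
```

The recomputed values agree with the Adjusted R Square rows of the three tables to within rounding.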

Once this regression was achieved, the GWR model was processed. This means that the explanatory variables predict the dependent variable. Tests on the data set reduced concern about multicollinearity among the variables. Table 5 shows the correlation between the variables analyzed.

Table 5 Correlation Matrix

| | Land Value m2 | Transmilenio | Principal Roads | Secondary Roads | Density |
|---|---|---|---|---|---|
| Land Value m2 | 1 | | | | |
| Transmilenio | -0.240 | 1 | | | |
| Principal Roads | -0.269 | 0.500 | 1 | | |
| Secondary Roads | -0.122 | 0.386 | 0.477 | 1 | |
| Density | -0.192 | -0.127 | -0.099 | -0.291 | 1 |

Table 5 shows an inverse relationship between each explanatory variable and land value. The closer the absolute value of a correlation is to 1, the stronger the dependence and the better that variable predicts land value.
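One common way to quantify the multicollinearity concern from a correlation matrix is the variance inflation factor (VIF). This diagnostic is not reported in the study and is introduced here only as an illustration; the correlations are copied from Table 5, and the variable names are illustrative.

```python
import numpy as np

# Correlations among the four explanatory variables, from Table 5
# (order: Transmilenio, Principal Roads, Secondary Roads, Density).
corr = np.array([
    [1.000, 0.500, 0.386, -0.127],
    [0.500, 1.000, 0.477, -0.099],
    [0.386, 0.477, 1.000, -0.291],
    [-0.127, -0.099, -0.291, 1.000],
])
# The VIF of each variable is the matching diagonal entry of the inverse
# correlation matrix.
vif = np.diag(np.linalg.inv(corr))
```

With these correlations every VIF stays below the commonly used rule-of-thumb cutoff of 5, which is in line with the reduced concern about multicollinearity noted above.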

Results: Geographically Weighted Regression
In Table 6 the adjusted R2 is 0.60, an improvement on the non-spatial multiple regression model, whose results are given in Table 4.

Table 6 Results Geographically Weighted Regression Model

| Name | Value |
|---|---|
| Neighbors | 95 |
| ResidualSquares | 28762284952500 |
| EffectiveNumber | 119.136247 |
| Sigma | 207991.39321 |
| AICc | 21509.259517 |
| R2 | 0.66434 |
| R2Adjusted | 0.604698 |

The GWR method improved the adjusted R2 coefficient from 13% to 60%. This can be explained by the fact that GWR uses "neighbors" to construct a separate equation for every feature in the dataset. Individual equations incorporate the dependent and explanatory variables of the features falling within the bandwidth of each target feature (see Table 6 for the specific results of the GWR model in this research). At the same time, it was necessary to eliminate variables that added no significance to the model and instead generated multicollinearity, which can seriously affect the R2 and yield invalid results. To improve the R2 of this model, areas with a standard deviation greater than 2.5 were excluded, since these are considered outliers and reduce how reliably the model reflects reality. The spatial aspects of the top three explanatory variables were explored using geographically weighted regression. The outputs map the spatial strength of each variable in explaining land value in Bogota. The analysis of results takes income into account (represented by the variable estrato). This is an implicit variable in the price of land, since it is largely determined by explanatory variables such as density and accessibility to the sector. In the analysis it serves to validate the data obtained from the GWR and to confirm that they do have a significant relationship with reality.
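The exclusion of areas beyond 2.5 standard deviations can be sketched as a filter on standardized residuals. This is an assumption about how the rule was applied: the standardized residual produced by a GIS package may be defined differently from the simple version used here, and the function name `within_threshold` and the demonstration data are hypothetical.

```python
import numpy as np

def within_threshold(observed, predicted, threshold=2.5):
    """Boolean mask that keeps features whose standardized residual
    lies within +/- threshold standard deviations."""
    residuals = (np.asarray(observed, dtype=float)
                 - np.asarray(predicted, dtype=float))
    std_resid = (residuals - residuals.mean()) / residuals.std(ddof=1)
    return np.abs(std_resid) <= threshold

# Hypothetical residual pattern: 19 small errors and one extreme area.
observed_demo = np.array([1.0] * 10 + [-1.0] * 9 + [50.0])
predicted_demo = np.zeros(20)
keep_demo = within_threshold(observed_demo, predicted_demo)
```

In this demonstration only the last area exceeds the threshold and would be removed before refitting the model.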


It is important to note that the geographic datasets used to build the GWR model had several limitations:

• The geographic units containing the explanatory variables had to be projected to a planar coordinate system. This could introduce inaccuracies in the prediction of the dependent variable, which in this case is the value of land in Bogota (Fotheringham & Charlton, 2009).
• The explanatory variables come from different years (Yiorgos, 2004). This is mainly because the city does not have a consolidated and updated database from which all the data could be obtained for the same year.
• For the purposes of this analysis, areas with standard deviations over 2.5, located in the outer areas, were removed because they affected the performance of the GWR owing to their lack of appropriate neighbors and of local knowledge by the author.

Analysis of results

Two types of neighborhoods can be compared using the results of the GWR model. The first are high-income neighborhoods, which have the highest recorded land values in Bogota. The second are low-income neighborhoods, where the people with the lowest incomes in Bogota live.


Figure 4 Spatial Relationships between Transmilenio and Land Value in Bogota

The supply of public transport, in this case Transmilenio bus stops, has the strongest link with land value in Bogota. Figure 4 shows that land value is higher in the Chapinero zone, mainly because of its close proximity to Transmilenio bus stops. This is reinforced by the fact that the central business district of Bogota is located in this area, which therefore has a high land value. In the southeast and southwest of the city, by contrast, land values appear to be much lower. In low-income residential areas, land values have a direct relationship with proximity to Transmilenio stations.


Figure 5 Hot Spot Analysis of Proximity to Transmilenio Bus Stop Station

The influence of Transmilenio is very important: proximity to the bus stops of this system has a direct impact on the price of land. As one moves away from the stations, standard deviations tend to be lower. Figure 5 shows that the area where the data are closest to the mean is the central-east (CBD), which contains the most valuable land in the city and is used purely for commercial purposes.


Figure 6 Spatial Relationships between Principal Roads and Land Value in Bogota

This map shows that the neighborhoods with greater accessibility to major roadways (downtown) have positive coefficients, meaning that they have a direct relationship with land value. However, there is an area next to downtown that shows an inverse relationship with land value. This is mainly because proximity to main roads increases residential land value up to a certain point, after which the value decreases for properties located directly on these roads, whereas for commercial land use land values increase for properties located on the main roads. In turn, in the eastern part of the city, where there are no main roads, land values are severely penalized relative to other areas where at least one connection through main roads is available.

On the other hand, the neighborhoods deemed low income do not show strong variation in the results of the model, since the principal roads have great coverage there.

Figure 7 Hot Spot Analysis of Proximity to Principal Roads

In predicting land value in Bogota, the standard deviations closest to the mean occur throughout the downtown area and the extended city center. This is directly related to the main roads: these neighborhoods are surrounded by major roads, which give them high accessibility and carry the heaviest traffic in the city. Conversely, in neighborhoods with less accessibility to main roads, the data lie further from the mean of the normal distribution. This is consistent with Figure 6, in which land value is directly proportional to main-road proximity in some neighborhoods and inversely proportional in others; hence the negative coefficients.

Figure 8 Secondary Roads Spatial Coefficients

Secondary roads have a direct relationship with land value in the vast majority of city neighborhoods; in the southwestern part, however, this relationship tends to be negative. This is explained by the abysmal condition of the road network there (Bitter, et al., 2007).


Moreover, the city center has a secondary road network in very good condition; because these roads connect to the main roads, accessibility to these neighborhoods is greater. More and better accessibility to neighborhoods increases residential land value.

Figure 9 Hot Spot Analysis of Proximity to Secondary Roads

The standard deviations of the secondary-road coefficients, shown in Figure 9, indicate that where the secondary road network is in acceptable condition and neighborhoods are readily accessible, the model better predicts the dependent variable, in this case land value in Bogota. This corresponds to the downtown area and the area west of downtown.


Figure 10 Geographic Distributions – Density Coefficient

Figure 10 shows the inverse relationship between the distribution of the population living in Bogota and land prices. It reflects how the population living in high-stratum areas (stratum 6) is associated with low population density, and the population living in low-stratum areas (stratum 1) with high population density, and how each relates to land value. This largely reflects the fact that people with the highest incomes tend not to live in places where population density is high. The most densely populated areas are located in the south of the city, where land prices are markedly lower than in the rest of the city.


Figure 11 Hot Spot Analysis of Population Density

This map supports the explanation given for Figure 10: in areas with higher population density, the price of land tends to be lower. With greater density, the standard deviation tends to move away from the mean of the normal distribution, so the calculation of land value in these areas tends to be inaccurate.


Discussion and Conclusions In this paper was possible to develop a spatial regression model using one of several methods of regression. In this case a GWR analysis was used to calculate the land value in Bogota thus allowing to observe which explanatory variables are significantly influencing land values in Bogota. The four variables that have the greatest significance in explaining land prices are close to Bogota Transmilenio stations (-39.37) access to the area in terms of the principal roads (-139.94), access to the neighborhood in terms of secondary roads (-121.85) and the population distribution in terms of the density per Km2 (-4.55) GWR significantly improved the prediction. Beginning with an OLS regression that yielded an adjusted R2 of 14% and then using the aforementioned method which yielded an R2 of 60%. That means that 60% explained variance of the data between reality and the model. This showed that the prediction is not uniform along the entire city. Through this reason is that each neighborhood located in town has a different value in the coefficients of the explanatory variables and with this a different set R2 (Fotheringham, et al., 2002). Bogota is a city too segregated and this can be demonstrated in the analysis of the four explanatory variables in which higher land prices, are concentrated in the northeast where it is located around the CBD (Du & C, 2006) and (Bocarejo, et al., 2013). The behavior of the value of land in the rest of the city is heterogeneous with no clear pattern established as seen in maps is a direct relationship between socioeconomic strata. On the contrary there is no homogeneous situation for understanding directly how they relate to the explanatory variables in a general environment such as Bogota, but to understand the behavior of the value of land in the city should evaluate areas and finally conclude that the difference with the other. 
The main roads and secondary roads represent the level of accessibility that exists in the city. In Bogota, land value is predicted better in the sectors with greater accessibility. However, main roads show a tendency very similar to that of the Transmilenio stations: there is a point at which proximity to them negatively influences residential land value, while commercial land value tends to increase (Muñoz Raskin, 2010). This can be explained by the high levels of pollution on main roads relative to the adjacent area (Carriazo, et al., 2013). On the other hand, it can be concluded that secondary roads have a negative effect on land value where walkability conditions are unfavorable. This explains the negative coefficients, some found in marginal areas on the periphery of the city and others in the west of the city center, where the model fails to predict the dependent variable. The methodology presents an approach to land values in Bogota; however, it is only a starting point, because the data vary over the years, so the results cannot be considered the last word on this analysis.

Other studies of the effect of BRT on land value in Bogota (Rodriguez & Mojica, 2008) reach conclusions that are also observed in this investigation. The neighborhoods where land value is higher are negatively influenced by increasing proximity to the Transmilenio stations. The most decisive factor is the socioeconomic stratum of the neighborhood. This can be seen in the CBD, in the northeast of the city, which overlaps with part of the stratum 6 area, where land value is the highest in Bogota. It is also seen in the north-south divide of the city: land in the north is more expensive and less densely populated, while the south has the highest population density, a lower share of main roads and much lower Transmilenio coverage.

This methodology is open to improvements that would allow the price of land in Bogota to be known with greater certainty. An updated database of all properties in Bogota would significantly improve the variance explained by the model, since the explanatory variables would allow a better fit. For this reason the model serves as a starting point for future research and could be replicated in other Colombian cities. In the same sense, the model is open to changes that would improve the fit, such as adding variables that were not taken into account here because of the limited land value database available for the city.

The explanation provided by the model may have an important application in the future, since it indicates how the price of land will respond to an increase in the features that give it value, such as main and secondary roads, proximity to public transport stations and density. The study could be even more accurate if only the roads that concentrate the most traffic in Bogota were treated as main roads.

The model does not represent land value exactly: many additional variables were not included because no current information is available for them, so it cannot be stated with certainty that these results are the best possible prediction of land value in Bogota.

References

Bitter, C., Mulligan, G. F. & Dall'erba, S., 2007. Incorporating spatial variation in housing attribute prices: a comparison of geographically weighted regression and the spatial expansion method. Journal of Geographical Systems, IX(I), pp. 7-27.

Bocarejo, J. P., Portilla, I. & Perez, M. A., 2013. Impact of Transmilenio on density, land use, and land value in Bogotá. Research in Transportation Economics, XL(I), pp. 78-86.

Carriazo, F., Ready, R. & Shortle, J., 2013. Using stochastic frontier models to mitigate omitted variable bias in hedonic pricing models: A case study for air quality in Bogotá, Colombia. Ecological Economics, Volume XCI, pp. 80-88.

Chapra, S. C. & Canale, R. P., 2007. Métodos numéricos para ingenieros. Quinta Edición ed. México D. F.: McGraw-Hill Interamericana.

Departamento Administrativo Nacional de Estadísticas, D., 2010. General Census Bulletin 2005, Bogota: DANE.

Du, H. & C, M., 2006. Relationship between transport accessibility and land value: Local model approach with geographically weighted regression. Transportation Research Record, Volume I, pp. 197-205.

Duque, J. C., Velásquez, H. & Agudelo, J., 2011. Public infrastructure and housing prices: An application of geographically weighted regression within the context of hedonic prices. Medellin: EAFIT.

Estupiñan, N. & Rodriguez, D. A., 2008. The relationship between urban form and station boardings for Bogotá’s BRT. Transportation Research Part A: Policy and Practice, XLII(II), pp. 296-306.

Fotheringham, A. S., Brunsdon, C. & Charlton, M. E., 2002. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. Chichester: Wiley.

Fotheringham, A. S. & Charlton, M. E., 2009. Geographically Weighted Regression: White Paper. Maynooth: National University of Ireland Maynooth.

Grupo de Sostenibilidad, D. U. y. R. (. & Universidad de los Andes, 2013. Explanatory Variables Database, Bogota: s.n.

Hongbo, D. & Corinne, M., 2012. Understanding Spatial Variations in the Impact of Accessibility on Land Value Using Geographically Weighted Regression. The Journal of Transport and Land Use, Volume V, pp. 46-59.

Hongyan, D., Yongkai, M. & Yunbi, A., 2011. The Impact of Land Policy on the Relation Between Housing and Land Prices. Evidence from. The Quarterly, Volume LI, pp. 19-27.

Kanaroglou, P. &., 2008. Entropy Index. s.l.:s.n.

Leung, Y., Mei, C.-L. & Zhang, W. X., 2000. Statistical tests for spatial nonstationarity based on the geographically weighted regression model. Environment and Planning, XXXII(I), pp. 9-32.

Lochl, M. & Axhausen, K. W., 2010. Modeling hedonic residential rents for land use and transport. The Journal of Transport and Land Use, pp. 39-63.

Mendieta, J. C. & Perdomo, J. A., 2007. Especificación y estimación de un modelo de precios hedónico espacial para evaluar el impacto de Transmilenio sobre el valor de la propiedad en Bogotá. Universidad de los Andes, pp. 1-44.

Muñoz Raskin, R., 2010. Walking accessibility to bus rapid transit: Does it affect property values? The case of Bogotá, Colombia. Transport Policy, XVII(II), pp. 72-84.

Nichols, J. B., Oliner, S. D. & Mulhall, M., 2013. Swings in Commercial Land and Residential Land Prices in the United States. Journal of Urban Economics, Volume I, pp. 57-76.

Noresah, M. S. & Ruslan, R., 2009. Modelling urban spatial structure using Geographically Weighted Regression. Acta Press, pp. 1950-1956.

Páez Barajas, D. E., 2013. Sicuaplus Universidad de los Andes. [Online] Available at: sicuaplus.uniandes.edu.co [Accessed February 2013].

Páez Barajas, D. E. & Currie, G., 2012. Key Factors Affecting Journey to Work in Melbourne using Geographically Weighted Regression. Transportation Research Record, pp. 1-12.

Paez, A. & Wheeler, D. C., 2009. Geographically Weighted Regression. International Encyclopedia of Geography, pp. 407-414.

Regression, G. W., 2012. ArcGIS Resource Center. [Online] Available at: http://help.arcgis.com/es/arcgisdesktop/10.0/help/index.html#//005p00000021000000 [Accessed 18 May 2013].

Rincón Ruiz, A., Pascual, U. & Flantua, S., 2013. Examining spatially varying relationships between coca crops and associated factors in Colombia, using geographically weighted regression. Applied Geography, Volume XXXVII, pp. 23-33.

Rodriguez, D. & Mojica, C., 2008. Land value impacts of bus rapid transit: The case of Bogotá’s TransMilenio. Lincoln Institute of Land Policy.

Yiorgos, M. K., 2004. Defining a Geographically Weighted Regression Model of Urban Evolution. Application to the City of Volos, Greece. Department of Urban Planning and Regional Development.