Geography and inequality
John Östh
Uppsala University
Aims
• Focus on the blessings of integrating economics with geography
• Point to some of the risks of not being aware of the role of geography
• Inspire further study
Presentation outline
• The nature of neighbourhoods and the measurement of contexts
  – Contextualization approaches
    • Predefined areas
    • Radii
    • K-nearest
  – Other methods for retrieving geographical information
    • Proximity measures and spatial interpolation
    • Accessibility and spatial interaction
  – What about the availability of data?
• EquiPop
  – Software
  – Modelling assumptions
• Segregation – integration
The nature of neighbourhoods and the measurement of contexts
• Neighbourhood and context are often used interchangeably.
– When we think about a neighbourhood we place humans in the middle (e.g. Perry, 1929)
– When we measure a neighbourhood we (usually) refer to a single concept, representing a piece of land (see Lee, 1968; Galster, 2001)
The nature of neighbourhoods and the measurement of contexts
(As a result) contextual data can be:
– Self-contained and place-bound (fixed borders)
• Taxes, parking fees, i.e. often based on local regulations
• Data become unique to a location and are difficult to compare between locations
• For the study of social processes, fixed areas are problematic (Sampson et al., 2002).
– Overlapping
• Usually mobility-based statistics
• Landscape of opportunities: local labour market, consumer and service areas, etc. (what if you assumed that workers only looked for jobs in their local area…)
The nature of neighbourhoods and the measurement of contexts
• Giving no thought to the spatial containers of measurements can lead to serious bias
– Here are a few examples
Example #1
Example #2
Map panels: municipalities and counties; counties
Two studies on early retirement indicated that 1) no spatial variation existed (county level) and 2) spatial variation existed (municipality level).
The nature of neighbourhoods and the measurement of contexts
• The latter example is related to the Modifiable Areal Unit Problem, MAUP (see for instance Openshaw, 1984; Wong, 2004; Andersson and Musterd, 2010)
– And also gerrymandering (hopefully less common)
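A minimal sketch of how the choice of areal unit can hide variation, in the spirit of the early-retirement example. All rates, municipality names and the county nesting are invented for illustration, and equal municipality populations are assumed so that a plain mean is a fair county rate.

```python
from statistics import mean, pstdev

# Hypothetical early-retirement rates for four municipalities (invented numbers).
municipality_rate = {"A1": 0.10, "A2": 0.30, "B1": 0.30, "B2": 0.10}
# Hypothetical nesting of municipalities in two counties (equal populations assumed).
county_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}

def county_rates(rates, nesting):
    """Aggregate municipality rates to county level with an unweighted mean."""
    grouped = {}
    for municipality, rate in rates.items():
        grouped.setdefault(nesting[municipality], []).append(rate)
    return {county: mean(values) for county, values in grouped.items()}

by_county = county_rates(municipality_rate, county_of)

# Spread of the rates at each level of aggregation.
spread_municipality = pstdev(list(municipality_rate.values()))
spread_county = pstdev(list(by_county.values()))
```

Here the municipalities differ clearly, while both counties end up with the same rate, so the same data show spatial variation at one level and none at the other.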
The nature of neighbourhoods and the measurement of contexts
• Contextualization approaches
– Three approaches available
• Contextualization using predetermined areas
• Contextualization using radii
• Contextualization using k-nearest neighbour
– Which to use depends on the question at hand, but ask yourself: what does the neighbourhood look like to the studied population?
The nature of neighbourhoods and the measurement of contexts
• Contextualization using predetermined areas – Non-overlapping
– Areas such as Wards, Tracts, Counties, Blocks, OA, SAMS…
– Usually hierarchical
– PLUS: easy to contextualize, very common, easy to communicate and map, hierarchical areas easy to use in multi-level modelling
– Minus: comparison over time and between areas difficult, MAUP, not placing humans in the middle. Created for other purposes. Boolean border problem
The nature of neighbourhoods and the measurement of contexts
Population in Swedish SAMS areas, 2008
Statistic        Value
Mean             1027.54
Median           716
Minimum          1
Maximum          20119
Percentile 10    100
Percentile 20    238
Percentile 30    387
Percentile 40    541
Percentile 50    716
Percentile 60    928
Percentile 70    1196
Percentile 80    1530
Percentile 90    2077
The nature of neighbourhoods and the measurement of contexts
• Contextualization using radii – Overlapping
– Area is determined by distance from chosen center
– Multiple distances can be used to generate overlapping hierarchies and annuli
– PLUS: relatively easy to use for the construction of contexts (I usually use Spatial Analysis Toolbox in ArcGIS), relatively common, easy to compare over time and between areas. Placing humans in the middle
– Minus: sensitive to variations in population distributions (test using point density measures)
– Boolean border problem (can be evaded using distance decay, but that increases computational complexity)
The nature of neighbourhoods and the measurement of contexts
Example of radii-based statistics
Radius    Meaning
100 m     Home
200 m     Block
400 m     Greater block
800 m     Neighbourhood
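A minimal sketch of radius-based contextualization, using the four radii from the table. The coordinates and the 0/1 attribute are invented, and the function name `radius_context` is hypothetical; this is not the ArcGIS workflow mentioned earlier, only the same idea in plain Python.

```python
import math

# Hypothetical individuals: (x, y) in metres and a 0/1 attribute (e.g. group membership).
people = [
    (0.0, 0.0, 1),
    (50.0, 0.0, 0),
    (150.0, 0.0, 1),
    (0.0, 300.0, 0),
    (700.0, 0.0, 1),
    (2000.0, 0.0, 0),
]

def radius_context(people, centre_index, radius):
    """Return (count, share with attribute 1) for everyone within `radius` metres
    of the chosen person, the person at the centre included."""
    cx, cy, _ = people[centre_index]
    inside = [a for (x, y, a) in people if math.hypot(x - cx, y - cy) <= radius]
    return len(inside), sum(inside) / len(inside)

# One context per radius, all centred on the first person.
contexts = {r: radius_context(people, 0, r) for r in (100, 200, 400, 800)}
```

The number of people inside each circle depends on local density, which is the sensitivity to population distribution noted among the drawbacks.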
The nature of neighbourhoods and the measurement of contexts
• Contextualization using k-nearest neighbour – Overlapping
– Area is variable and determined by the k-nearest neighbours
– Multiple k-levels can be used to depict differently sized neighbourhoods
– PLUS: Placing humans in the middle. Not sensitive to variations in population distributions, less sensitive to MAUP and border effects than the other two techniques. Suitable for comparison between areas and over time. Suitable for analysis of human processes
– Minus: usually very computationally demanding and difficult to set up. Disregards distance (may be evaded using distance decay formulations)
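A minimal sketch of k-nearest-neighbour contextualization done the brute-force way (sort all distances, keep the k nearest). The data and the function name `knn_context` are invented for illustration. The second return value is the distance to the k-th neighbour, which shows how the area of the context varies with density while the population count stays fixed.

```python
import math

# Hypothetical individuals: a dense cluster near x = 0 and a sparse one near x = 5000.
people = [
    (0.0, 0.0, 1), (10.0, 0.0, 0), (20.0, 0.0, 1), (30.0, 0.0, 0),
    (5000.0, 0.0, 1), (6000.0, 0.0, 0), (7000.0, 0.0, 1), (8000.0, 0.0, 0),
]

def knn_context(people, centre_index, k):
    """Return (share with attribute 1 among the k nearest individuals, centre included,
    distance to the k-th of them)."""
    cx, cy, _ = people[centre_index]
    ranked = sorted((math.hypot(x - cx, y - cy), a) for (x, y, a) in people)
    nearest = ranked[:k]
    share = sum(a for _, a in nearest) / len(nearest)
    return share, nearest[-1][0]

dense = knn_context(people, 0, 3)    # context of a person in the dense cluster
sparse = knn_context(people, 4, 3)   # context of a person in the sparse cluster
```

Both contexts contain three people, but the sparse one reaches 100 times further, which is also why distance is disregarded unless a decay term is added.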
The nature of neighbourhoods and the measurement of contexts
Example of k-nearest-neighbour-based statistics
The nature of neighbourhoods and the measurement of contexts
– Proximity measures and spatial interpolation
– Accessibility and Spatial interaction
– To discuss the above, a short detour into how data and maps fit together is necessary
Two kinds of data – Raster and Vector
OBJECTS
Using functions such as near, buffer, join, union or similar, points, lines and polygons can be made to interact or intersect
Maps consist of layers
Matching data using GIS
• Using X and Y coordinates of any observed set of incidences, the locations of incidence can be matched to:
– Underlying topography
– Features in proximity to locations
– The relationship to surrounding incidences
– …
• Matching can be conducted using several techniques:
Joining data spatially
• Matching spatially - selection
– Merging
And through attributes
• Matching with key variables - selection
– Merging
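The two ways of matching can be sketched as follows. This is a toy illustration in plain Python, not a GIS workflow: the points, the rectangular areas, the attribute `median_income` and both function names are invented, and real areas would be polygons rather than rectangles.

```python
# Hypothetical point observations with coordinates and an area key.
points = [
    {"id": 1, "x": 2.0, "y": 3.0, "area_code": "A"},
    {"id": 2, "x": 8.0, "y": 1.0, "area_code": "B"},
]
# Hypothetical rectangular areas: bbox = (xmin, ymin, xmax, ymax) plus one attribute.
areas = {
    "A": {"bbox": (0.0, 0.0, 5.0, 5.0), "median_income": 210},
    "B": {"bbox": (5.0, 0.0, 10.0, 5.0), "median_income": 260},
}

def join_by_key(points, areas):
    """Attribute join: merge on the shared key variable."""
    return {p["id"]: areas[p["area_code"]]["median_income"] for p in points}

def join_by_location(points, areas):
    """Spatial join: merge on which rectangle contains the point."""
    joined = {}
    for p in points:
        for area in areas.values():
            xmin, ymin, xmax, ymax = area["bbox"]
            if xmin <= p["x"] < xmax and ymin <= p["y"] < ymax:
                joined[p["id"]] = area["median_income"]
                break
    return joined

by_key = join_by_key(points, areas)
by_location = join_by_location(points, areas)
```

When the key variable is correct the two joins agree; the spatial join is the one that still works when no shared key exists.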
Measuring distance
Proximity measures are more than:
$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$$
Though they are often the basis for our analyses.
Accessibility
• Potential accessibility (Hansen 1959) can be formulated as:
$$a_i = \sum_j \mathrm{Opportunities}_j \cdot f(d_{ij})$$
• And is commonly used as (unconstrained):
$$a_i = \sum_j D_j \exp(-\beta d_{ij})$$
• Also here, the spatial composition matters
Kilometers (decay parameter = 0.06931)
Distance 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
ai = 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
ai = 11 11 11
12.4165833
12.2224009
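A minimal sketch of unconstrained potential accessibility with exponential distance decay, assuming the form a_i = sum over j of D_j * exp(-beta * d_ij) and straight-line distances. The decay parameter 0.06931 is the one on the slide; it is close to ln(2)/10, so an opportunity 10 km away counts about half as much as one at the origin. The destinations are invented and do not reproduce the slide's table.

```python
import math

BETA = 0.06931  # decay parameter from the slide, per kilometre

def euclidean(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def accessibility(origin, destinations, beta=BETA):
    """a_i = sum_j D_j * exp(-beta * d_ij) for a single origin i."""
    return sum(size * math.exp(-beta * euclidean(origin, xy))
               for xy, size in destinations)

# Hypothetical destinations: ((x_km, y_km), opportunities D_j).
spread_out = [((float(d), 0.0), 1.0) for d in range(1, 11)]  # 1 opportunity per km, 1 to 10 km
concentrated = [((10.0, 0.0), 10.0)]                          # the same 10, all at 10 km

a_spread = accessibility((0.0, 0.0), spread_out)
a_concentrated = accessibility((0.0, 0.0), concentrated)
```

The total number of opportunities is identical in both cases, yet the accessibility differs, which is the sense in which the spatial composition matters.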
I have seen spatial interaction models for South America stating that Brazil is doing better than expected…
Integration of economics, geography and accessibility
Work with A Reggiani and G Galiazzo, CEUS, 2015
Accessibility and distance
Network distance to hospital – straight line distance to hospital
Interpolation techniques
Where’s the data?
Spatially coded data on European level
• Examples of sources of geocoded data
– Corine
– OpenStreetMap
– Population grid
– GADM
– Inspire sites
– Surprise ;)
Corine
Here (?)
OpenStreetMap
Population grid: around 2 million squares are populated
Population Grid of EU & EquiPop
Population grid: 2011 population as a share of Max(2006, 2011). k = 40 000 nearest neighbours
Obvious between-country comparison problems
GADM project
Inspire EU
Data may also be drawn from imagery
…and used in statistics
The surprise
The surprise #2
EquiPop
• EquiPop is a software program developed for the calculation of k-nearest neighbourhoods/contexts. The software is specifically designed to work with datasets that contain thousands or millions of observations, and it offers viable solutions to k-nearest-neighbour questions even where large areas and complex geographies are involved.
• The difference between conventional k-nearest models and EquiPop is the spatial arrangement of data.
Conventional model
1. Sort matrix on distance from i to j
2. Collect values from k-nearest neighbours
3. When k has been reached, move to i+1 and redo.
EquiPop
1. Rectify the data to fit a predefined grid
2. Spatial relations in gridded space are predictable
3. Collect values from k-nearest neighbours using rule.
4. When k has been reached, move to i+1 and redo.
Figure, panels a. and b.: example grids of cell values.
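The gridded idea can be sketched as follows. This is an illustration under stated assumptions, not EquiPop's actual collection rule, which the slides do not spell out: here the "rule" is simply to grow a square window one ring of cells at a time until it holds at least k people. The grids and the function name `grid_context` are invented.

```python
def grid_context(pop, group, row, col, k):
    """Grow a square window around cell (row, col), one ring of cells at a time,
    until it holds at least k people (or covers the whole grid).
    Return (window half-width in cells, people in window, group share in window)."""
    n_rows, n_cols = len(pop), len(pop[0])
    half = 0
    while True:
        r0, r1 = max(0, row - half), min(n_rows - 1, row + half)
        c0, c1 = max(0, col - half), min(n_cols - 1, col + half)
        cells = [(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]
        people = sum(pop[r][c] for r, c in cells)
        members = sum(group[r][c] for r, c in cells)
        covers_grid = r0 == 0 and c0 == 0 and r1 == n_rows - 1 and c1 == n_cols - 1
        if people >= k or covers_grid:
            return half, people, (members / people if people else 0.0)
        half += 1

# Hypothetical gridded data: total population and group members per cell.
pop = [[2, 1, 3],
       [2, 4, 5],
       [1, 2, 2]]
group = [[1, 0, 1],
         [0, 2, 1],
         [0, 1, 0]]

small = grid_context(pop, group, 1, 1, 4)    # k reached inside the centre cell
large = grid_context(pop, group, 1, 1, 10)   # one extra ring of cells is needed
```

Because neighbours of a cell are found by index arithmetic, no distance matrix has to be sorted. The price in this simplified version is that whole rings are collected, so the window may hold more than k people.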
EquiPop
Simple layout
• Get data in using file commands
• Choose what to include and the k-levels
• Start, batch, load and unzip
EquiPop can be used to create super-local patterns
TFR
Segregation
Measuring segregation
• The classic measure of segregation (probably the most widely used) is the index of dissimilarity (Massey and Denton 1988 is a widely read text using D):
$$D = \frac{1}{2}\sum_i \left|\frac{b_i}{B} - \frac{w_i}{W}\right|$$
where b_i and w_i are the counts of the two groups in area i, and B and W are their totals.
• Sensitive to MAUP, etc. Consider:
– Population B = 9, W = 36
– Scenario a: 9 regions (cells), w are spread equally, b are located in the upper-right corner. D = 0.88889
– 3 regions (colours), same distribution. D = 0.66667
– Now pause and think: what will be the effect if we compare cities or regions over time? What about scale?
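The two D values can be checked with a short sketch, assuming the standard two-group definition D = 0.5 * sum over areas of |b_i/B - w_i/W|. The slide does not say how the nine cells are grouped into three regions; grouping them by row is an assumption made here.

```python
def dissimilarity(b_counts, w_counts):
    """Index of dissimilarity: D = 0.5 * sum_i |b_i/B - w_i/W| over areas i."""
    B, W = sum(b_counts), sum(w_counts)
    return 0.5 * sum(abs(b / B - w / W) for b, w in zip(b_counts, w_counts))

# 3 x 3 grid listed row by row from the top: all 9 b in the upper-right cell, 4 w per cell.
b_cells = [0, 0, 9, 0, 0, 0, 0, 0, 0]
w_cells = [4] * 9

# Assumption: the three regions are the three rows of the grid.
b_rows = [sum(b_cells[i:i + 3]) for i in range(0, 9, 3)]
w_rows = [sum(w_cells[i:i + 3]) for i in range(0, 9, 3)]

d_cells = dissimilarity(b_cells, w_cells)    # 8/9, about 0.88889
d_regions = dissimilarity(b_rows, w_rows)    # 2/3, about 0.66667
```

Nobody has moved between the two calculations; only the borders changed, and D dropped by a quarter.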
Spatial measures of isolation
If there were no sorting, the lines would have been flat!
Spatial measures of entropy
Entropy measures can also be employed, but if there are more than 2 groups or if the populations are not equal in size, comparison becomes difficult (a population-weighted Shannon index, etc. may be employed, but the outcome becomes less intuitive in my opinion)
Max = ln(2) ~ 0.693
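A minimal sketch of why comparison gets harder, assuming the plain Shannon entropy of group shares with natural logarithms. The shares are invented for illustration.

```python
import math

def entropy(shares):
    """Shannon entropy -sum p * ln(p) of group shares that sum to 1."""
    return -sum(p * math.log(p) for p in shares if p > 0)

two_equal = entropy([0.5, 0.5])             # the two-group maximum, ln(2)
two_unequal = entropy([0.9, 0.1])           # unequal group sizes lower the value
three_equal = entropy([1 / 3, 1 / 3, 1 / 3])  # three groups: maximum is ln(3) instead
```

The attainable maximum depends on the number of groups, and unequal group sizes pull the value down even without any spatial sorting, so raw entropy values from different settings are not directly comparable.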
Over-time changes in Sweden
Figure: Visible minorities, SI (increasing segregation and increasing numbers). Series: visible minorities 2010, 2002, 1994. Horizontal axis ticks 10 to 100000; vertical axis 0.00 to 0.30.
Figure: SI, poverty (EU definition). Series: poverty 2010, 2002, 1994. Horizontal axis ticks 10 to 100000; vertical axis 0.00 to 0.25.
Figure: SI, lower education, among individuals 19-64 years of age. Series: low-educated 2010, 2002, 1994. Horizontal axis ticks 10 to 100000; vertical axis 0 to 0.35.
Figure: SI, labour market inactivity (among 19-64 years of age). Series: inactivity 2010, 2002, 1994. Horizontal axis ticks 10 to 100000; vertical axis 0 to 0.30.
Thanks