Spatial Analysis I Spatial data analysis Spatial analysis and inference

Spatial Analysis I - UBC Blogsblogs.ubc.ca/advancedgis/files/2015/11/Lecture07SpatialAnalysisI.pdfSpatial analysis Turns raw data into useful information by adding greater informative





Roadmap

Outline:
• What is spatial analysis?
• Spatial Joins
• Step 1: Analysis of attributes
• Step 2: Preparing for analyses: working with distance
• Step 3: Spatial patterns analysis
• Step 4: Kernel density analysis
• Summary


Aspatial vs spatial analysis

Difference of

Aspatial analyses assume that where you take your sample shouldn’t matter.


Spatial analysis
• Turns raw data into useful information by adding greater informative content and value
• Reveals patterns, trends, and anomalies that might otherwise be missed
• Provides a check on human intuition by helping in situations where the eye might deceive


Spatial analysis
Spatial analysis can be:
• inductive, to examine empirical evidence in the search for patterns that might support new theories or general principles, as with disease mapping (cancer maps);
• deductive, focusing on the testing of known theories or principles against data (SkyTrain stations as centres of criminal activity);
• normative, using spatial analysis to develop or prescribe new or better designs (geodesign).


Spatial analysis
A method of analysis is spatial if the results depend on the locations of the objects being analyzed:
• move the objects or the study boundaries and the results change
• results are not invariant under relocation

Spatial analysis uses the locations of objects and, most often, the attributes of those objects.

Spatial analysis is the crux of GIS.

[Figure: attribute linkages between spatial data (P, L, A, Px: points, lines, areas, pixels) and attribute data (NOIR: nominal, ordinal, interval, ratio)]


Getting organized: Joins
One of the more powerful features of a GIS is the ability to join attribute tables to spatial layers based on a common geographic location ID (such as the CTUID).

ArcMap also has many different forms of spatial joins. (See 3-D joins, and another reason not to use unprojected data [scroll down to ‘Getting the Best Result’].)
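
As a rough illustration of what an attribute-table join does, here is a minimal sketch in plain Python. The CTUID values and the field names (`area_km2`, `median_income`) are invented for the example, and `join_attributes` is a hypothetical helper; in ArcMap the same operation is done through the Join dialog rather than in code.

```python
# Spatial layer: one record per census tract, keyed by CTUID (values are made up).
tracts = [
    {"CTUID": "9330001.00", "area_km2": 1.2},
    {"CTUID": "9330002.00", "area_km2": 0.8},
    {"CTUID": "9330003.00", "area_km2": 2.5},
]

# Attribute table to be joined, also keyed by CTUID (hypothetical field).
income = {
    "9330001.00": {"median_income": 52000},
    "9330002.00": {"median_income": 61000},
}

def join_attributes(layer, table, key="CTUID"):
    """Left join: keep every feature, attach attributes where the key matches."""
    joined = []
    for feature in layer:
        row = dict(feature)                      # copy so the input is untouched
        row.update(table.get(feature[key], {}))  # unmatched features gain nothing
        joined.append(row)
    return joined

joined = join_attributes(tracts, income)

# Features with no match are worth listing before any analysis is run.
unmatched = [row["CTUID"] for row in joined if "median_income" not in row]
print(unmatched)
```

Checking for unmatched IDs after a join is one practical way to 'know your data' before moving on.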


Step 1: Analysis of attributes
• Attribute table joins
• Scatterplots
• Other types of plots?
• Regression
• Looking for outliers or other unusual patterns in the attribute data
• Problem? Spatial heterogeneity
• Know your data


Step 2a: Preparing for analysis: getting our distances correct

• Pythagorean or straight-line metric
• Shortest distance on a sphere? (Which route?)
• Distance along a route represented in a GIS (a polyline) is often calculated by summing the lengths of each segment of the polyline.
• Because polylines tend to short-cut corners, the length of a polyline tends to be shorter than the length of the object it represents.
• The length of a three-dimensional line measured off its planimetric representation will also be shorter than its true length.
• Unless you are working at very long distances (e.g., continental), work only with projected data (e.g., with units in metres).


Pythagoras’s Theorem and the straight-line distance between two points on a plane. What is the length of D?
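
The straight-line distance can be sketched in a few lines of Python. The function name and the sample coordinates are made up for illustration, and the coordinates are assumed to be in a projected system (e.g., metres).

```python
import math

def straight_line_distance(p, q):
    """Pythagorean distance between two points given as (x, y) in projected units."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    return math.hypot(dx, dy)   # equivalent to sqrt(dx**2 + dy**2)

# A 3-4-5 triangle: D is the hypotenuse.
d = straight_line_distance((0.0, 0.0), (3.0, 4.0))
print(d)
```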


Geodesics: the effects of the Earth’s curvature on the measurement of distance, and the choice of shortest paths
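
For unprojected latitude and longitude, one common way to approximate the shortest path on the Earth's surface is the haversine formula. The sketch below assumes a perfectly spherical Earth with a mean radius of 6371 km, so it is an approximation; a GIS would normally use an ellipsoidal model. The function name is illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; the Earth is treated as a sphere here

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# A quarter of the way around the equator.
quarter = great_circle_km(0.0, 0.0, 0.0, 90.0)
print(round(quarter, 1))
```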


The length of a path as traveled on the Earth’s surface (red line) may be substantially longer than the length of its horizontal projection as evaluated in a two-dimensional GIS

In the figure are shown three paths across part of Dorset in the UK. The green path is the straight route (‘as the crow flies’), the red path is the modern road system, and the gray path represents the route followed by the road in 1886

(Courtesy Michael De Smith)


The vertical profiles of all three routes, with elevation plotted against the distance traveled horizontally in each case.

1 ft = 0.3048 m, 1 yd = 0.9144 m.

(Courtesy Michael De Smith)


Question: how to determine the true (3D) length of a line?
This used to be a complex process, but you can now achieve this result in two easy steps. You need to have a linear feature and a DEM or TIN.

1. Use the Interpolate Shape tool (3D Analyst) to add the Z values to the line.
2. Use the Add Z Information tool (3D Analyst) to add fields to the linear feature’s attribute table.
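
To see why the 3D length differs from the planimetric length, the idea can be sketched without ArcGIS. This is not the 3D Analyst implementation: it simply assumes each vertex already carries an interpolated Z value and sums the segment lengths with and without it. The function names and the sample line are invented.

```python
import math

def planimetric_length(vertices):
    """Sum of 2D segment lengths; vertices are (x, y, z) tuples in metres."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1, _), (x2, y2, _) in zip(vertices, vertices[1:])
    )

def surface_length(vertices):
    """Sum of 3D segment lengths, so climbs and descents add to the total."""
    return sum(
        math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2)
        for (x1, y1, z1), (x2, y2, z2) in zip(vertices, vertices[1:])
    )

# A flat first segment followed by a 300 m climb over 400 m of horizontal travel.
line = [(0, 0, 100), (300, 400, 100), (300, 800, 400)]
print(planimetric_length(line), surface_length(line))
```

The 3D length can never be shorter than the planimetric one, and the gap grows with the steepness of the terrain.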


Identifying areas of influence: Buffers
• Buffering is a commonly applied distance-based analysis

[Figure: buffers (dilations) of constant width drawn around a point, a polyline, and a polygon]


Buffers representing 1⁄2-mile exclusion zones around all schools in part of Los Angeles
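
A buffer query reduces to a distance test: a location falls inside the buffer if it lies within the buffer width of the feature. The sketch below checks this for a polyline in projected coordinates; the helper names, the road geometry, and the 25 m width are hypothetical.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b (all (x, y) tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:                      # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment ends.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def in_buffer(p, polyline, width):
    """True if p lies within `width` of any segment of the polyline."""
    return any(
        distance_to_segment(p, a, b) <= width
        for a, b in zip(polyline, polyline[1:])
    )

road = [(0, 0), (100, 0), (100, 100)]
print(in_buffer((50, 20), road, 25), in_buffer((50, 30), road, 25))
```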


Step 2b: Preparing for analysis: getting our neighbours correct

• Many spatial techniques require informative data on spatial relationships (usually 1 to n values).
• How to formally define the spatial relationships between points, polygons or grids on the surface of analysis?
• We would like to quantify ‘nearness’ in some fashion.
• How do we want to quantify that nearness (distance, adjacency)?
• Many approaches require a weight matrix.
• Matrices function like maps that guide our analysis.


Weight Matrices
We can use different types of weight matrices to see if there are different types of spatial relationships.

Two broad types of matrices:
• Distance-based (obviously useful for point features, but also used for polygonal features; usually a cut-off distance is defined [e.g., distance between features < 1000 m]). A network can also be used to determine the distance.
• Contiguity-based (a key attribute of polygonal features: do they share a common edge?)

See ArcMap’s help file for generating spatial weights.
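
A distance-based weight matrix with a cut-off can be sketched as follows. This is a plain-Python illustration, not ArcMap's procedure; it assumes straight-line distances in projected metres and binary weights, and the points are invented.

```python
import math

def distance_band_weights(points, cutoff):
    """Binary weights: w[i][j] = 1 if i != j and distance(i, j) < cutoff, else 0."""
    n = len(points)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < cutoff:
                w[i][j] = w[j][i] = 1       # distance is symmetric, so is the matrix
    return w

pts = [(0, 0), (600, 0), (600, 900), (5000, 5000)]
w = distance_band_weights(pts, cutoff=1000)
for row in w:
    print(row)
```

The last point has no neighbours within 1000 m, which is a common side effect of a fixed cut-off in sparsely sampled areas.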


Weight Matrices: Distance
• Distance-based weights create bands around the points (perhaps 1000 m), or around the centroids of polygons, to identify neighbours.
• K-nearest-neighbour weights count or ‘mark’ the k closest neighbours. This is a relative distance measure, since in some areas the k points may be very close while in other areas they may be much further away. (Think of the k nearest points or polygon centroids, or the k adjacent polygons.)

[Figure: k-nearest neighbours with K = 3]
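
The k-nearest-neighbour idea can be sketched in plain Python. The function name and the sample points are made up; k = 3 mirrors the figure.

```python
import math

def k_nearest_neighbours(points, k):
    """For each point, return the indices of its k closest other points."""
    neighbours = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbours.append(others[:k])
    return neighbours

# Three points near the origin and an isolated pair far away.
pts = [(0, 0), (1, 0), (0, 2), (10, 10), (11, 10)]
knn = k_nearest_neighbours(pts, k=3)
print(knn[0], knn[3])
```

Unlike a distance band, the relationship need not be symmetric: point 3 is one of point 0's three nearest neighbours, but point 0 is not one of point 3's.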


Weight Matrices: Contiguity-based weights
• Rook: counts only edge adjacencies
• Queen: counts edges and vertices
• For rasters, very easy to visualize.
• For polygons, the resulting size of the neighbourhoods can vary widely.

[Figure: rook and queen neighbourhoods]
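
On a raster the two rules are easy to express as row and column offsets. The sketch below is illustrative only; `grid_neighbours` is a hypothetical helper.

```python
def grid_neighbours(row, col, n_rows, n_cols, rule="rook"):
    """Cells adjacent to (row, col) on an n_rows x n_cols raster."""
    rook = [(-1, 0), (1, 0), (0, -1), (0, 1)]          # shared edges
    diagonal = [(-1, -1), (-1, 1), (1, -1), (1, 1)]    # shared corners only
    offsets = rook if rule == "rook" else rook + diagonal
    return [
        (row + dr, col + dc)
        for dr, dc in offsets
        if 0 <= row + dr < n_rows and 0 <= col + dc < n_cols   # stay on the grid
    ]

# Centre cell of a 3 x 3 raster: 4 rook neighbours, 8 queen neighbours.
print(len(grid_neighbours(1, 1, 3, 3, "rook")), len(grid_neighbours(1, 1, 3, 3, "queen")))
```

Cells on the edge of the raster have fewer neighbours under either rule, which is worth remembering when interpreting results near the study boundary.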


Weight Matrices
An example of a weight matrix for polygons where a vertex is not counted as adjacent (2 is not adjacent to 6). Note that polygons are not considered adjacent to themselves.

1 = adjacent, 0 = not adjacent

[Figure: six polygons labelled 1 to 6]

      1  2  3  4  5  6
  1   0  1  0  0  1  0
  2   1  0  1  1  1  0
  3   0  1  0  1  0  0
  4   0  1  1  0  0  1
  5   1  1  0  0  0  1
  6   0  0  0  1  1  0
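
The matrix above can be written as a nested list and checked for the properties the slide describes. Row standardization, shown at the end, is not covered on the slide; it is included as an assumption about a common follow-up step in which each row is rescaled to sum to 1.

```python
# Rows and columns are polygons 1..6, in order.
W = [
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

n = len(W)
# Adjacency is mutual, so the matrix should equal its transpose.
is_symmetric = all(W[i][j] == W[j][i] for i in range(n) for j in range(n))
# Polygons are not adjacent to themselves, so the diagonal is all zeros.
zero_diagonal = all(W[i][i] == 0 for i in range(n))
# Number of neighbours per polygon.
neighbour_counts = [sum(row) for row in W]

# Row standardization (an assumed extra step): divide each row by its sum.
W_std = [[w / sum(row) for w in row] for row in W]

print(is_symmetric, zero_diagonal, neighbour_counts)
```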


Weight Matrices

A weight matrix is often contained in a weights file. We can establish such files in ArcGIS and in programs like GeoDa.

For example, in GeoDa weights files include:
• .gal for contiguity-based weights
• .gwt for distance-based weights

In other programs weights might be defined in the GUI rather than through a separate file.


Step 3: Spatial pattern analysis

Identification of how objects cluster is often important in many different fields:
• Archaeology
• Criminology
• Ecology
• Epidemiology

Point patterns can be identified as clustered, dispersed, or random. The kinds of processes responsible for point patterns are:
• First-order processes, which involve points being located independently (rain drops)
• Second-order processes, which involve interaction between points (acorns from oaks)

The K function is an example of a descriptive statistic of pattern.

Looking at the distribution of spatial objects without considering their attributes.


Point pattern of individual tree locations. A, B, and C identify the individual trees analyzed in the following graphs

Here the points represent trees, but they could represent crime incidents, locations of people with a disease, store locations, etc.

(Source: Getis A. and Franklin J. 1987. Second-order neighborhood analysis of mapped point patterns. Ecology 68(3): 473–477).

Point pattern analysis


Point pattern analysis: Ripley’s K
Summarizes spatial autocorrelation (point feature clustering or feature dispersion) over a range of distances.

This is used when you want to see how changing the spatial distance impacts nearest-neighbour counts. It can help identify an appropriate window size.

In many pattern analysis studies, the selection of an appropriate scale of analysis is required. For example, a distance threshold or distance band is often needed for the analysis (e.g., kernel density analysis). When exploring spatial patterns at multiple distances and spatial scales, patterns change, often reflecting the dominance of particular spatial processes at work. Ripley's K function illustrates how the spatial clustering or dispersion of feature centroids changes when the neighborhood size changes.

A local measure, but one that can look at all distances.

ESRI’s description of Ripley’s K


(Source: Getis A. and Franklin J. 1987. Second-order neighborhood analysis of mapped point patterns. Ecology 68(3): 473–477).

Ripley's K Function

[Figure labels: Clustered; Overdispersion]

Terms in the function:
• A = area (e.g., bounding box)
• N = number of points
• d = distance (classes)
• k(i, j) is the weight, which is 1 when the distance between i and j is less than or equal to d, and 0 when the distance between i and j is greater than d.

With the L(d) transformation, the expected value is equal to distance.
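
The formula itself was shown as an image and is not reproduced in the text. A commonly used form that matches the terms defined above is L(d) = sqrt(A * sum over i != j of k(i, j) / (pi * N * (N - 1))); the sketch below assumes that form and applies no edge correction, so treat it as an approximation of what the software computes. The function name and sample points are invented.

```python
import math

def ripley_l(points, d, area):
    """L(d) = sqrt(area * sum_{i != j} k(i, j) / (pi * N * (N - 1))), no edge correction."""
    n = len(points)
    # k(i, j) = 1 when the two points are within d of each other, else 0.
    pairs_within_d = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= d
    )
    return math.sqrt(area * pairs_within_d / (math.pi * n * (n - 1)))

area = 100 * 100                                        # bounding box of the study area
clustered = [(0, 0), (1, 0), (0, 1), (1, 1)]            # four points packed together
dispersed = [(0, 0), (100, 0), (0, 100), (100, 100)]    # four points at the corners

d = 2
print(ripley_l(clustered, d, area) > d)   # observed L(d) above d suggests clustering
print(ripley_l(dispersed, d, area) < d)   # observed L(d) below d suggests dispersion
```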

What does overdispersion mean?


Pine trees are represented by green dots and other tree species are represented by red dots. The function counts the number of neighboring pine trees found within a given distance from each individual pine tree (Xm).

The number of observed neighboring pine trees is then traditionally compared to the number of pine trees one would expect to find based on a completely spatially random point pattern.

If the number of pines found within a given distance of each individual pine is greater than that for a random distribution, the distribution is clustered. If the number is smaller, the distribution is dispersed.


Permutations (over and over again)
Spatial autocorrelation measures such as Ripley’s K or Moran’s I usually compare your data to a theoretical random data set (whether polygon or point) in order to get a p-value.

To determine whether the existing spatial data are statistically different from the null hypothesis of ‘complete spatial randomness’ (CSR), we need to simulate a spatially random probability distribution (this can’t be done mathematically [e.g., by looking up a value in a table] since each study is unique).

Monte Carlo simulation produces a number of (e.g., 99) random simulations (or permutations) that the software then compares against your observed data.

This can be used to develop a pseudo p-value: the probability of obtaining a result at least as extreme as the observed one by chance alone.


Permutations
Each observation is given a set of randomly generated coordinates (selected using a uniform random number generator, not a ‘normal’ or Gaussian random distribution), which is used to relocate each observation in space. To generate a random reference distribution of Moran's I (or Ripley’s K), the statistic is computed each time with a different arrangement of points, for the number of permutations specified (e.g., 99).

You can then compare this reference distribution to your observed Moran's I value to determine where it falls in comparison. The upper and lower confidence bands are derived from the random permutations.
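
The permutation procedure can be sketched in plain Python. To keep the example short, the mean nearest-neighbour distance is swapped in as a stand-in statistic instead of Moran's I or Ripley's K, and the test is one-sided (towards clustering). The study area, the 'observed' points, and the seed are all invented, and the pseudo p-value uses the common (r + 1) / (m + 1) form, which is an assumption about what the software reports.

```python
import math
import random

def mean_nn_distance(points):
    """Mean distance from each point to its nearest neighbour."""
    return sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / len(points)

def csr_points(n, width, height, rng):
    """n points with uniformly distributed coordinates (complete spatial randomness)."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]

def pseudo_p_value(observed, simulated):
    """(r + 1) / (m + 1), where r simulations are at least as small as the observed value."""
    as_extreme = sum(1 for s in simulated if s <= observed)
    return (as_extreme + 1) / (len(simulated) + 1)

rng = random.Random(42)   # fixed seed so the run is repeatable

# A made-up 'observed' pattern: 30 points packed into a 4 x 4 patch of a 100 x 100 area.
observed_pts = [(50 + rng.uniform(-2, 2), 50 + rng.uniform(-2, 2)) for _ in range(30)]
observed = mean_nn_distance(observed_pts)

# Reference distribution from 99 spatially random arrangements of 30 points.
simulated = [mean_nn_distance(csr_points(30, 100, 100, rng)) for _ in range(99)]

p = pseudo_p_value(observed, simulated)
lower, upper = min(simulated), max(simulated)   # a simple envelope from the permutations
print(p, observed < lower)
```

With 99 permutations the smallest pseudo p-value that can be reported is 1/100 = 0.01, which is why the number of permutations limits the significance level that can be claimed.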

A uniform distribution for the roll of a single die:


Statistical Significance

[Figure: a spatially random distribution (one of many simulations) beside the observed distribution]

Our p-value answers the question: what is the probability that the observed distribution could have occurred by chance?


Permutations to compute confidence envelope.


Car thefts in Vancouver after 8:00 pm

A clustered distribution


Step 4: Kernel density analysis

Kernel density analysis calculates the density of features in a neighborhood around each feature. It can be calculated for both point and line features.

While the inputs are either point or line features, the output is a raster, since a field output is being created.

Possible uses include calculating the density of houses, crime reports, roads, or utility lines, and using that density in a regression analysis, for example.

You can use a ‘population field’ to weight some features more heavily than others, depending on their meaning, or to allow one point to represent several observations.

Still looking only at the spatial objects.


[Figure: (A) a collection of point objects; (B) a kernel function]

The kernel’s shape depends on a distance parameter: increasing the value of the parameter results in a broader and lower kernel, and reducing it results in a narrower and sharper kernel. When each point is replaced by a kernel and the kernels are added, the result is a density surface whose smoothness depends on the value of the distance parameter.
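
The 'replace each point by a kernel and add the kernels' idea can be sketched on a small raster. The slides do not state which kernel function is used, so a quartic (biweight) kernel is assumed here; the points, the grid extent, and the function names are invented, while the 50 m cell size and 500 m radius echo the Vancouver example.

```python
import math

def quartic_kernel(dist, radius):
    """Quartic (biweight) kernel: highest at the point, zero at and beyond the radius."""
    if dist >= radius:
        return 0.0
    u = dist / radius
    # The 3 / (pi * radius^2) factor makes each kernel integrate to 1 over the plane.
    return (3 / (math.pi * radius ** 2)) * (1 - u ** 2) ** 2

def density_surface(points, radius, cell_size, n_rows, n_cols):
    """Sum the kernels of all points at the centre of every raster cell."""
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            centre = ((c + 0.5) * cell_size, (r + 0.5) * cell_size)
            row.append(sum(quartic_kernel(math.dist(centre, p), radius) for p in points))
        grid.append(row)
    return grid

# Two nearby events and one isolated event in a 1000 m x 1000 m study area.
points = [(250, 250), (260, 250), (900, 900)]
grid = density_surface(points, radius=500, cell_size=50, n_rows=20, n_cols=20)

# Density is highest near the pair, lower near the single point, zero far from all.
print(grid[5][5] > grid[17][17] > grid[0][19])
```

A larger radius spreads each kernel over more cells and produces a smoother, flatter surface; a smaller radius leaves isolated bumps around individual points.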


Density estimation using two different distance parameters in the respective kernel functions.

(A) The surface shows the density of ozone-monitoring stations in California, using a kernel radius of 150 km

(B) Zoomed to an area of Southern California, a kernel radius of 16 km is too small for this dataset, as it leaves each kernel isolated from its neighbors


Car thefts in Vancouver after 8:00 pm: 50 m cell size and a 500 m neighbourhood


Step 5: Spatial patterns analyses+

Cluster and Outlier Analysis identifies spatial clusters of features with high or low values, as well as spatial outliers (formally: Anselin’s Local Moran's I).

This is just one example of the different ways we can analyze spatial patterns. ESRI provides helpful information about this and other methods on their Spatial Statistics Resources page.

We are now including both the spatial object and its attribute.


Spatial Patterns of Obesity and Associated Risk Factors in the Conterminous U.S

Source


Some notes on the lab.


With regard to Lab 3 and the interpretation of Moran’s I.

What do these values represent?
• A z-score of ±1.96 leaves 5% of the curve in the tails (p = 0.05, two-tailed)
• A z-score of ±2.58 leaves 1% of the curve in the tails (p = 0.01, two-tailed)

The Moran’s Index in and of itself isn’t important; it is the z-score and the p-value that tell the tale.
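
The link between a z-score and its two-tailed p-value can be checked with the standard normal distribution. This is a small sketch using only the Python standard library; the function name is illustrative.

```python
import math

def two_tailed_p(z):
    """Probability that a standard normal value lies at least as far from 0 as |z|."""
    return math.erfc(abs(z) / math.sqrt(2))

for z in (1.96, 2.58):
    print(z, round(two_tailed_p(z), 3))
```

The sign of z does not matter for a two-tailed test, so -1.96 and 1.96 give the same p-value.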


Test Statistic for Normal Frequency Distribution

[Figure: normal curve of the test statistic centred on 0, with 2.5% of the area in each tail beyond -1.96 and 1.96 (reject null at 5%) and a 1% rejection region marked at 2.58 (reject null at 1%)]

Null hypothesis: no spatial autocorrelation
• Moran’s I = 0 (technically –1/(n-1))

Alternative hypothesis: spatial autocorrelation exists (clustering and/or dispersion)
• Moran’s I > 0 (clustering) or I < 0 (dispersion)

Reject the null hypothesis if the z-score is greater than or equal to 1.96 (or less than or equal to -1.96).

Interpretation: there is less than a 5% chance that the spatial autocorrelation (dispersion) found is random; we are 95% confident that spatial autocorrelation (dispersion) exists.
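
For reference, the global Moran's I behind these hypotheses can be computed by hand for a tiny example. The sketch uses the standard formula I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where S0 is the sum of all weights. The four-cell strip, its rook weights, and the attribute values are invented; the z-score and p-value that the lab tool reports are not reproduced here.

```python
def morans_i(values, W):
    """Global Moran's I for a list of values and a square weight matrix W."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in W)   # sum of all weights
    cross = sum(W[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)

def expected_i(n):
    """Expected value of Moran's I under the null hypothesis: -1 / (n - 1)."""
    return -1.0 / (n - 1)

# Four cells in a row; each cell is adjacent to the cells beside it.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

trend = [1, 2, 3, 4]         # similar values sit side by side
alternating = [1, 4, 1, 4]   # neighbouring values are as different as possible

print(morans_i(trend, W), morans_i(alternating, W), expected_i(4))
```

The trend gives a positive index (clustering of similar values) and the alternating pattern a negative one (dispersion), with the null expectation slightly below zero.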


Summary
• These are just a few of the methods available to analyse spatial data.
• You should explore ArcMap’s Spatial Analyst toolbox as well as the Spatial Statistics toolbox, since within those toolboxes you can find many additional methods that might be of use in your projects.