Topic 4 – Geographical Data Analysis

GEOG 60 – Introduction to Geographic Information Systems

Professor: Dr. Jean-Paul Rodrigue


A – The Nature of Spatial Analysis
B – Basic Spatial Analysis

The Nature of Spatial Analysis

■ 1. Spatial Analysis and its Purpose
■ 2. Spatial Location and Reference
■ 3. Spatial Patterns
■ 4. Topological Relationships

Spatial Analysis and its Purpose

■ Conceptual framework
• Search for order amid disorder.
• Organize information into categories.

■ Method
• Inducing or deducing conclusions from spatially related information.
• Deduction: deriving a conclusion from a model or a rule.
• Induction: learning new concepts from examples.
• Spatial analysis as a decision-making tool.
• Helps the user make better decisions.
• Often involves the allocation of resources.


■ Requirements
• 1) Information to be analyzed must be encoded in some way.
• 2) Encoding implicitly requires a spatial language.
• 3) Some media to support the encoded information.
• 4) Qualitative and/or quantitative methods to perform operations over encoded information.
• 5) Ways to present the results in an explicit message.

[Figure: Information, Encoding, Media, Methods, Message]


[Figure: Spatial analysis in relation to human geography (historical, political, economic, behavioral, population), physical geography (geomorphology, climatology, biogeography, soils) and geographic techniques (GIS, cartography, quantitative methods, remote sensing)]

Mapping Deaths from Cholera, London, 1854 (Snow Study)


■ Data Retrieval
• Browsing; windowing (zoom-in & zoom-out).
• Query window generation (retrieval of selected features).
• Multiple map sheets observation.
• Boolean logic functions (meeting specific rules).

■ Map Generalization
• Line coordinate thinning of nodes.
• Polygon coordinate thinning of nodes.
• Edge-matching.



■ Map Abstraction
• Calculation of centroids.
• Visual editing & checking.
• Automatic contouring from randomly spaced points.
• Generation of Thiessen / proximity polygons.
• Reclassification of polygons.
• Raster to vector/vector to raster conversion.

■ Map Sheet Manipulation
• Changing scales.
• Distortion removal/rectification.
• Changing projections.
• Rotation of coordinates.
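The calculation of centroids listed above can be sketched as follows. This is a minimal illustration, assuming a simple (non self-intersecting) polygon given as an ordered list of (x, y) vertices; the function name `polygon_centroid` and the sample rectangle are hypothetical, not taken from any GIS package.

```python
def polygon_centroid(vertices):
    """Return the area-weighted centroid (x, y) of a simple polygon.

    `vertices` lists the corners in order, without repeating the first one.
    """
    twice_area = 0.0  # twice the signed area of the polygon
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # The standard formula divides by 6 * area, i.e. 3 * twice_area.
    return cx / (3 * twice_area), cy / (3 * twice_area)


# Hypothetical 4 x 2 rectangle: its centroid is its geometric center.
rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
centroid = polygon_centroid(rectangle)
```

For the rectangle, the centroid falls at (2, 1). For irregular polygons the centroid may fall outside the polygon itself.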



■ Buffer Generation
• Generation of zones around certain objects.

■ Geoprocessing
• Polygon overlay.
• Polygon dissolve.
• “Cookie cutting”.

■ Measurements
• Points - total number or number within an area.
• Lines - distance along a straight or curvilinear line.
• Polygons - area or perimeter.
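The three kinds of measurement above can be sketched in a few lines. The example assumes planar (projected) coordinates and simple polygons; the function names and sample coordinates are hypothetical. The point count uses a circular zone around a center, which is the simplest form of buffer.

```python
import math


def path_length(points):
    """Distance along a line made of straight segments."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def polygon_area(vertices):
    """Area of a simple polygon (shoelace formula)."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


def polygon_perimeter(vertices):
    """Length of the closed boundary of a polygon."""
    return path_length(list(vertices) + [vertices[0]])


def count_within(points, center, radius):
    """Number of points inside a circular zone around `center`."""
    return sum(1 for p in points if math.dist(p, center) <= radius)


rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]   # hypothetical polygon
road = [(0, 0), (3, 4), (3, 10)]               # hypothetical line
wells = [(0, 0), (1, 1), (5, 5)]               # hypothetical points

area = polygon_area(rectangle)
perimeter = polygon_perimeter(rectangle)
length = path_length(road)
nearby = count_within(wells, center=(0, 0), radius=2)
```

A curvilinear line is approximated by its vertices, so the measured length grows slightly as more vertices are used to describe the same curve.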



■ Raster / Grid Analysis
• Grid cell overlay.
• Area calculation.
• Search radius.
• Distance calculations.

■ Digital Terrain Analysis
• Visibility analysis of viewing points.
• Insolation intensity.
• Grid interpolation.
• Cross-sectional viewing.
• Slope/aspect analysis.
• Watershed calculation.
• Contour generation.


Spatial Patterns

■ Relativity of objects
• Definition of an object in view of another.
• Create spatial patterns.

■ Main patterns
• Size.
• Distribution/spacing: uniform, random and clustered.
• Proximity.
• Density: dense and dispersed.
• Shape.
• Orientation.
• Scale.

[Figure: Examples of spatial patterns: size, form, orientation, scale, proximity]

Spatial Patterns

■ Spatial autocorrelation
• Set of objects that are spatially associated.
• Relationship in the process affecting the object.
• Negative autocorrelation.
• Positive autocorrelation.

[Figure: Uniform, clustered and random distributions; positive autocorrelation]
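One common way to quantify spatial autocorrelation is Moran's I, which the slides do not name; it is used here only as an illustration. The sketch assumes observations arranged along a line where each location is a neighbor of the ones immediately before and after it (the helper `chain_weights` is hypothetical). Values near +1 indicate positive autocorrelation (similar values sit side by side), values near -1 indicate negative autocorrelation (neighbors differ).

```python
def morans_i(values, weights):
    """Moran's I for `values` given a square matrix of spatial weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    spread = sum(d * d for d in dev)
    return (n / total_weight) * (cross / spread)


def chain_weights(n):
    """Weights for n locations in a row: 1 for immediate neighbors, else 0."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = chain_weights(6)
clustered = morans_i([1, 1, 1, 5, 5, 5], w)    # similar values are adjacent
alternating = morans_i([1, 5, 1, 5, 1, 5], w)  # neighbors always differ
```

With these hypothetical values the clustered series yields a positive index and the alternating series a negative one.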

Topological Relations

■ Proximity
• Qualitative expression of distance.
• Link spatial objects by their mutual locations.
• Nearest neighbors.
• Buffer around a point or a line.

■ Directionality


■ Adjacency
• Link contiguous entities.
• Share at least one common boundary.

■ Intersection

■ Containment
• Link entities to a higher order set.

[Figure: City A and City B]


■ Connectivity
• Adjacency applied to a network.
• Must follow a path, which is a set of linked nodes.
• Shortest path.
• All possible paths.

[Figure: Network of six nodes, numbered 1 to 6]
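Connectivity can be illustrated on a small network. The links below are an assumption made for this example (six nodes laid out as a 2 x 3 grid); the links of the original figure are not recoverable. The shortest path is counted in number of links, using a breadth-first search, and all possible paths are enumerated without revisiting a node.

```python
from collections import deque

# Hypothetical network: node -> list of adjacent nodes.
network = {
    1: [2, 4], 2: [1, 3, 5], 3: [2, 6],
    4: [1, 5], 5: [2, 4, 6], 6: [3, 5],
}


def shortest_path(graph, start, goal):
    """Path with the fewest links between two nodes, or None if unconnected."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


def all_paths(graph, start, goal, visited=None):
    """Every path from start to goal that never visits a node twice."""
    visited = (visited or []) + [start]
    if start == goal:
        return [visited]
    paths = []
    for neighbor in graph[start]:
        if neighbor not in visited:
            paths.extend(all_paths(graph, neighbor, goal, visited))
    return paths


best = shortest_path(network, 1, 6)
every = all_paths(network, 1, 6)
```

The number of possible paths grows very quickly with the size of the network, which is why full enumeration is only practical for small networks.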


■ Intersection
• What two geographical objects have in common.

■ Union
• Summation of two geographical objects.

■ Complementarity
• What is outside of the geographical object.

[Figure: Land, arable land, flat land and non arable land; land suitable for agriculture is both arable and flat]
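Intersection, union and complementarity map directly onto set operations. In this sketch each geographical object is reduced to a set of parcel identifiers; the parcel names and their assignment to each category are hypothetical.

```python
# Hypothetical parcels making up the study area.
land = {"p1", "p2", "p3", "p4", "p5", "p6"}
arable = {"p1", "p2", "p3"}
flat = {"p2", "p3", "p4"}

# Intersection: what the two objects have in common.
suitable_for_agriculture = arable & flat

# Union: summation of the two objects.
arable_or_flat = arable | flat

# Complementarity: what is outside of the object, within the study area.
non_arable = land - arable
```

In a GIS the same operations are applied to geometries rather than identifiers, but the logic is identical.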

Elementary Spatial Analysis

■ 1. Statistical Generalization
■ 2. Data Distribution
■ 3. Spatial Inference

Statistical Generalization

■ Maps and statistical information
• It is important to display the underlying distribution of the data accurately.
• Data is generalized to search for a spatial pattern.
• If the data is not properly generalized, the message may be obscured.
• A balance is needed between remaining true to the data and generalizing enough to identify spatial patterns.
• Thematic maps are a good example of the issue of statistical generalization.


[Figure: Data, classification (0-30, 31-65, 65-) and resulting spatial pattern]


■ Number of classes
• Too few classes: the contours of the data distribution are obscured.
• Too many classes: the map becomes confusing.
• Most thematic maps have between 3 and 7 classes.
• 8 shades of gray are generally the maximum that can be told apart.


■ Classification methods
• Thematic maps developed from the same data and with the same number of classes will convey a different message if the ranging method is different.
• Each ranging method suits a particular data distribution.

Data Distribution

■ Histogram
• The first step in producing a thematic map.
• See how data is distributed.
• Use of basic statistics such as mean and standard deviation.
• A histogram plots values against their frequency.

[Figure: Histograms (frequency against value) of uniform, normal and exponential distributions]
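A histogram and its basic statistics can be computed with the standard library alone. The values and the bin width below are hypothetical, and the standard deviation is the population form (`statistics.pstdev`), which is an assumption since the slides do not specify which form to use.

```python
from collections import Counter
import statistics


def histogram(values, bin_width):
    """Map the lower bound of each bin to the number of values it contains."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


values = [12, 15, 22, 25, 27, 31, 34, 38, 45, 71]  # hypothetical data
frequencies = histogram(values, bin_width=10)
mean = statistics.mean(values)
std = statistics.pstdev(values)

# Text rendering: one row per bin, one mark per observation.
for lower, count in frequencies.items():
    print(f"{lower:>3}-{lower + 9:<3} {'#' * count}")
```

Empty bins (here 50 to 69) are simply absent from the result; the isolated value in the 70 bin is the kind of feature that should be spotted before choosing a classification method.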

Data Distribution

■ Equal interval
• Each class has an equal range of values.
• Class width: difference between the lowest and the highest value divided by the number of categories.
• (H-L)/C
• Easy to interpret.
• Good for uniform distributions and continuous data.
• Inappropriate if data is clustered around a few values.

[Figure: Histogram (frequency against value) divided between L and H into four classes of equal width, C1 to C4]
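The (H-L)/C rule translates into a short function. This is a sketch with hypothetical data; the convention that a value equal to a class limit falls into the lower class is an assumption, as the slides do not say how limits are handled.

```python
from bisect import bisect_left


def equal_interval_breaks(values, n_classes):
    """Return the n_classes - 1 internal class limits."""
    low, high = min(values), max(values)
    width = (high - low) / n_classes  # (H - L) / C
    return [low + width * k for k in range(1, n_classes)]


def class_of(value, breaks):
    """1-based class number; a value equal to a limit goes to the lower class."""
    return bisect_left(breaks, value) + 1


values = [0, 10, 40, 100]  # hypothetical observations
breaks = equal_interval_breaks(values, 4)
classes = [class_of(v, breaks) for v in values]
```

With these values the limits are 25, 50 and 75, and the third class (50 to 75) receives no observation, which shows why the method is a poor fit when data is clustered around a few values.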

Data Distribution

■ Quantiles
• Equal number of observations in each category.
• n(C1) = n(C2) = n(C3) = n(C4).
• Relevant for evenly distributed data.
• Features with similar values may end up in different categories.

■ Equal area
• Classes divided to have a similar area per class.
• Similar to quantiles if the size of the units is the same.

[Figure: Histogram (frequency against value) divided into four classes, C1 to C4, holding equal numbers of observations n(C1) to n(C4)]
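A quantile classification only needs the observations in sorted order. The sketch below uses hypothetical values and makes no attempt to keep tied values in the same class, which a production tool would have to address.

```python
def quantile_classes(values, n_classes):
    """Split the sorted values into n_classes groups of (nearly) equal size."""
    ordered = sorted(values)
    n = len(ordered)
    groups = []
    for k in range(n_classes):
        start = k * n // n_classes
        end = (k + 1) * n // n_classes
        groups.append(ordered[start:end])
    return groups


values = [1, 2, 3, 4, 5, 6, 7, 100]  # hypothetical observations
groups = quantile_classes(values, 4)
```

Each class holds two observations. The values 6 and 7 are separated even though they are similar, while 7 and 100 share a class despite being very different.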

Data Distribution

■ Standard deviation
• The mean (X) and standard deviation (STD) are used to set cutpoints.
• Good when the distribution is normal.
• Displays features that are above and below average.
• Very different (abnormal) elements are shown.
• Does not show the values of the features, only their distance from the average.

[Figure: Histogram (frequency against value) divided into classes C1 to C4 with cutpoints at X - 1 STD, X and X + 1 STD]
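The cutpoints can be derived directly from the mean and the standard deviation. This sketch assumes four classes with limits at X - 1 STD, X and X + 1 STD, and uses the population standard deviation; both are assumptions, and the data is hypothetical.

```python
from bisect import bisect_left
import statistics


def std_breaks(values):
    """Class limits at mean - 1 STD, mean and mean + 1 STD."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population form, an assumption
    return [mean - std, mean, mean + std]


def class_of(value, breaks):
    """1-based class number; a value equal to a limit goes to the lower class."""
    return bisect_left(breaks, value) + 1


values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
breaks = std_breaks(values)
classes = [class_of(v, breaks) for v in values]
```

Here the mean is 5 and the standard deviation 2, so the limits are 3, 5 and 7. Only the values 2 and 9 fall in the outer classes, which is how the method singles out unusual features.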

Data Distribution

■ Arithmetic and geometric progressions
• The width of the class intervals increases at a non-linear rate.
• Good for J-shaped distributions.

[Figure: Histogram (frequency against value) divided into four classes of increasing width, C1 to C4]
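Both progressions can be produced by one helper that takes the relative width of each class. The helper name, the ranges and the chosen ratios are hypothetical: widths of 1, 2, 3, 4 give an arithmetic progression and widths of 1, 2, 4, 8 a geometric one.

```python
def progression_breaks(low, high, relative_widths):
    """Internal class limits for classes whose widths follow relative_widths."""
    unit = (high - low) / sum(relative_widths)
    breaks, upper = [], low
    for width in relative_widths[:-1]:  # the last class ends at `high`
        upper += unit * width
        breaks.append(upper)
    return breaks


# Arithmetic progression: each class is one unit wider than the previous.
arithmetic = progression_breaks(0, 100, [1, 2, 3, 4])

# Geometric progression: each class is twice as wide as the previous.
geometric = progression_breaks(0, 150, [1, 2, 4, 8])
```

The narrow first classes separate the many small values of a J-shaped distribution, while the wide last class absorbs the few large ones.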

Data Distribution

■ Natural breaks
• Complex optimization method.
• Minimize the sum of the variance in each class.
• Good for data that is not evenly distributed.
• Statistically sound.
• Difficult to compare with other classifications.
• Difficult to choose the appropriate number of classes.

[Figure: Histogram (frequency against value) divided into four classes, C1 to C4, at the natural breaks of the distribution]
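The idea behind natural breaks can be shown with an exhaustive search over every way of cutting the sorted values. This is a teaching sketch, not the optimized algorithm used by GIS software: it measures within-class variation as the sum of squared deviations from each class mean (an assumption about what the slides mean by variance), and it is only practical for small data sets. The values are hypothetical.

```python
from itertools import combinations


def squared_deviation(group):
    """Sum of squared deviations of a group from its own mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)


def natural_breaks(values, n_classes):
    """Grouping of the sorted values with the least within-class variation."""
    ordered = sorted(values)
    n = len(ordered)
    best_cost, best_groups = None, None
    # Try every choice of n_classes - 1 cut positions.
    for cuts in combinations(range(1, n), n_classes - 1):
        bounds = (0,) + cuts + (n,)
        groups = [ordered[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(squared_deviation(g) for g in groups)
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups


values = [1, 2, 3, 20, 21, 22, 50, 51]  # hypothetical observations
groups = natural_breaks(values, 3)
```

The search places the class limits in the gaps of the distribution, so the three clusters of values end up in three separate classes.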

Data Distribution

■ User defined
• The user is free to select the class intervals that best fit the data distribution.
• Last resort method, because the choice of intervals is conceptually difficult to justify.
• Analysts with experience are able to make a good choice.
• Also used to get round numbers after using another type of classification method.
• $5,000 - $10,000 instead of $4,982 - $10,123.

■ Using classification
• Classification can be used to deliberately confuse or hide a message.

Data Distribution

“no problems” - Equal steps

“there is a problem” - Quantiles


Data Distribution

“everything is within standards” - standard deviation


Spatial Inference

■ Filling the gaps
• Sampling shortens the time necessary to collect data.
• Requires methods to “fill the gaps”.

■ Interpolation and extrapolation
• Data at non-sampled locations can be predicted from sampled locations.
• Interpolation: predict missing values when bounding values are known.
• Extrapolation: predict missing values outside the bounding area.
• Only one side is known.
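The distinction between interpolation and extrapolation can be illustrated with straight-line estimates between samples. Linear estimation is an assumption made for this sketch, as are the function name and the sample values; it also assumes at least two samples with distinct locations.

```python
def estimate(samples, x):
    """Estimate the value at x from (location, value) samples.

    Returns the estimate and whether it is an interpolation or an extrapolation.
    """
    pts = sorted(samples)
    if len(pts) < 2:
        raise ValueError("at least two samples are required")
    if x < pts[0][0]:
        # Outside the sampled range: only one side is known.
        (x0, y0), (x1, y1) = pts[0], pts[1]
        kind = "extrapolation"
    elif x > pts[-1][0]:
        (x0, y0), (x1, y1) = pts[-2], pts[-1]
        kind = "extrapolation"
    else:
        # Inside the sampled range: find the two bounding samples.
        kind = "interpolation"
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                break
    value = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return value, kind


samples = [(0, 100), (10, 120), (20, 160)]  # hypothetical (location, height)
inside = estimate(samples, 5)
outside = estimate(samples, 25)
```

The estimate at location 25 simply prolongs the last known trend, which is why extrapolated values are less reliable than interpolated ones.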

Spatial Inference: Interpolation and Extrapolation

[Figure: Two panels showing samples with interpolation and extrapolation lines. Axes: height against location; delay at the traffic light against number of vehicles]

Spatial Inference: Best Fit

[Figure: Sex ratio against longitude (-130 to -60), with best-fit line y = 0.1408x + 116.69, R² = 0.6779]
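A best-fit line of this kind is usually obtained by ordinary least squares, which is assumed here; the slide does not state how its line was computed. The first function fits a line and reports R²; the second applies the coefficients printed on the figure. The data passed to `least_squares` in the example is hypothetical.

```python
def least_squares(xs, ys):
    """Return slope, intercept and R² of the least-squares line through the points."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - residual / total


def sex_ratio_from_figure(longitude):
    """Apply the line shown on the figure: y = 0.1408x + 116.69."""
    return 0.1408 * longitude + 116.69


slope, intercept, r2 = least_squares([1, 2, 3, 4], [3, 5, 7, 9])
predicted = sex_ratio_from_figure(-100)
```

At a longitude of -100 the figure's line gives a sex ratio of about 102.6. With R² = 0.6779, longitude accounts for roughly two thirds of the variation in the plotted values, so individual predictions remain approximate.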

Spatial Inference

■ Aggregation
• Data within a boundary can be aggregated.
• Often to form a new class.

■ Conversion
• Data from a sample set can be converted for a different sample set.
• Changing the scale of the geographical unit.
• Switching from one set of geographical units to another.

Spatial Inference: Aggregation and Conversion

[Figure: Aggregation: pine trees and poplar trees form the class boreal forest. Conversion: District A, District B, District B1 and District B2]
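Both operations can be sketched with dictionaries. The counts, areas and population figure are hypothetical. The conversion step redistributes a value in proportion to area (areal weighting), which is one possible method chosen for illustration; it assumes the data is evenly spread within the original unit.

```python
def aggregate(values_by_unit, membership):
    """Sum the values of the units that belong to the same higher-level class."""
    totals = {}
    for unit, value in values_by_unit.items():
        group = membership[unit]
        totals[group] = totals.get(group, 0) + value
    return totals


def areal_weighting(value, target_areas):
    """Split a value among target units in proportion to their area."""
    total_area = sum(target_areas.values())
    return {unit: value * area / total_area for unit, area in target_areas.items()}


# Aggregation: two tree classes form the new class "Boreal Forest".
hectares = {"Pine Trees": 120, "Poplar Trees": 80}
classes = {"Pine Trees": "Boreal Forest", "Poplar Trees": "Boreal Forest"}
forest = aggregate(hectares, classes)

# Conversion: a value known for District B is estimated for B1 and B2.
split = areal_weighting(1000, {"District B1": 30, "District B2": 10})
```

Aggregation is exact, since it only sums known values. Conversion to smaller units is an estimate whose quality depends on how well the even-spread assumption holds.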
