Spatial Analysis: Implementation in ArcGIS

1/3/2008 GISC 6382 Applied GIS Briggs UT-Dallas

ArcGIS

• Most common software used for GIS
• From ESRI, Inc. (Environmental Systems Research Institute, Inc.) in Redlands, CA
  – Founded and owned by Jack Dangermond
• Released ArcInfo in 1982, the first commercial vector-based GIS system
• Three main modules
  – ArcMap for map production and analysis
  – ArcCatalog for data management
  – ArcToolbox contains all the tools in ArcMap and ArcCatalog, plus many more

ArcGIS Modules or Components

• ArcMap – for map production and analysis

• ArcCatalog – for data management

• ArcToolbox – contains all the tools in ArcMap and ArcCatalog, plus many more

Each has a different user interface

The Levels of ArcGIS

Each level costs more and does more than the one before:

• ArcView – map production and analysis
• ArcEditor – data creation
• ArcInfo – high-level analysis

Each has the same user interface


Implementing Spatial Analysis in ArcGIS 9

Primarily carried out in ArcMap:

• via Selection/Select by Location
  – this selects features of one layer (or layers) which relate in some specified spatial manner to the features in another layer
  – if desired, selected features may be saved later as a new layer using Data/Export Data
  – geographic features are not themselves modified
• via Spatial Join (right click layer in Table of Contents, select Joins and Relates>Join, then click the down arrow in the first line of the Join Data window)
  – Use for:
      points in polygon (identifies polygon in which point is located)
      lines in polygon (identifies polygons crossed by line)
      points on lines (to calculate distance to nearest line)
      points on points (to calculate distance to nearest point)
  – operates on tables and normally creates a new table with additional variables, but again does not modify geographic features
• via ArcToolbox
  – Generally these tools modify geographic features, thus they create a new layer (e.g. shape file)
  – Tools are organized into multiple categories
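As a rough illustration of what a points-in-polygon spatial join computes, the Python sketch below tags each point with the polygon that contains it and leaves the geometry untouched. The ray-casting test, the function names, and the sample coordinates are assumptions made for illustration; this is not ArcGIS code.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) falls inside the polygon ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(points, polygons):
    """Return one row per point naming the containing polygon (or None)."""
    rows = []
    for pid, (x, y) in points.items():
        match = None
        for name, ring in polygons.items():
            if point_in_polygon(x, y, ring):
                match = name
                break
        rows.append({"point": pid, "polygon": match})
    return rows


# Hypothetical sample data: two square polygons and three points.
polygons = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
points = {"p1": (5, 5), "p2": (15, 2), "p3": (25, 5)}
joined = spatial_join(points, polygons)
```

Every point appears in the output table, including p3, which lies in no polygon; only an attribute column is added.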


Differences

• Selection: simply selects spatial features
  – Spatial features are not modified
  – Selected features are highlighted on map and in the table
  – No new output file saved unless you use Data/Export Data
• Joins: operate on tables and normally add additional fields or variables (columns), but again do not modify actual spatial features (rows)
  – Normally, adds attribute variables (columns) to one layer’s table from another layer’s table
  – All geographic features are “output” and no features are modified
  – No new output file saved unless you use Data/Export Data
• Analysis Toolbox (and others) in ArcToolbox
  – Often these modify or create spatial features, thus they output new spatial files

Different approaches can be used, in some cases, to produce the same results.


Analysis Tools in ArcToolbox

ArcToolbox, particularly the Analysis Tools toolbox, contains

• Extract toolset, including
  – Clip, which limits one layer to the exact outer boundary of another layer (e.g. limit a Texas road layer to Dallas county only)
• Overlay toolset, including
  – Intersect, which combines two polygon layers--with output limited to common area
  – Union, which combines two polygon layers--with output covering full extent of both layers
• Proximity toolset, including
  – Buffer, for creating buffer polygons at a specified distance around points, lines or polygons
  – Point Distance, for calculating distances between points within a specified radius
• Statistics toolset, including
  – Frequency, which gives you counts of attribute value combinations
  – Summary Statistics, which gives you summary descriptive statistics for columns in a table, including sum, mean, min, max, etc.

Tools useful for analysis of vector data are located in other toolsets as well! For example:
  – Data Management Tools>Generalization contains
      • Dissolve, which removes boundaries between polygons
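To make the two Statistics tools concrete, here is a small Python sketch of what Frequency and Summary Statistics report for an attribute table. The table contents and field names are hypothetical, and the code only mimics the kind of output these tools produce; it does not call ArcToolbox.

```python
from collections import Counter
from statistics import mean

# Hypothetical attribute table: one dict per row.
rows = [
    {"county": "Dallas", "type": "road", "length": 2.0},
    {"county": "Dallas", "type": "road", "length": 4.0},
    {"county": "Dallas", "type": "rail", "length": 3.0},
    {"county": "Collin", "type": "road", "length": 5.0},
]

# Frequency: count each combination of attribute values.
frequency = Counter((r["county"], r["type"]) for r in rows)

# Summary Statistics: descriptive statistics for one column.
lengths = [r["length"] for r in rows]
summary = {
    "sum": sum(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
    "max": max(lengths),
}
```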

Objective

• The objective of the research is to answer the question:

Are certain groups (the poor, or racial and ethnic minorities) more exposed to pollution than the population as a whole?

Method 1: all schools

• We have data for Dallas, TX on:
  – Toxic emission sites
  – Demographic characteristics of schools

• We will create a 1 mile buffer around each toxic site

• We will compare the demographic characteristics of schools within the buffer to those outside the buffer

Hypothesis 1

• Null Hypothesis: there will be the same percentage for each demographic group beyond and within the buffer

• Alternative hypotheses: there will be a higher percentage of Black, Low Income, Hispanic, and Asian students, and a lower percentage of white students within the buffer.

Method 2: schools within buffer only

• We have data for Dallas, TX on:
  – Toxic emission sites
  – Demographic characteristics of schools

• We will create a toxicity score for each school within 1 mile of a toxic site based on distance to the toxic site and the toxicity of the site

• We will compare the demographic characteristics of the ten (10) schools with the highest scores to the rest of the schools within the buffer.

Hypothesis 2

• Null Hypothesis: there will be the same percentage for each group in the Top Ten schools compared with the rest of the buffer schools

• Alternative hypotheses: there will be a higher percentage of Black, Low Income, Hispanic, and Asian students and a lower percentage of white students in the Top Ten toxic schools.

1. Bring in the data files (C:\Users\briggs\Documents\china\lectures\project-armap\data)

Dal_toxic_SPCS.shp contains the toxic sites and their toxicity scores
Dal_school_SPCS.shp contains the location of the schools
Highways_NCTCOG_SPCS.shp contains major roads in the Dallas/Fort Worth area
County_NCTCOG.shp contains the county outlines for the counties in the Dallas/Fort Worth area
Dal_sch_stats.dbf contains the demographic statistics

2. Identify schools within 1 mile of toxic sites

--use Selection>Select by Location to select all schools within a distance of 1 mile (5280 feet) from Dal_toxic_SPCS
    --select features from Dal_school_SPCS
    --that “are within distance”
    --Dal_toxic_SPCS

--add variable (e.g. called buffer) to Dal_school_SPCS and code for inside (=1) and outside (=0)
    --open attribute table for Dal_school_SPCS
    --click Options and Add Field (short integer) called Buffer
    --right click column heading for buffer field (Dal_school_SPCS.buffer)
    --select Field Calculator
    --calculate Dal_school_SPCS.buffer = 1
    --be sure there is a check mark in Calculate selected records only
    --Clear Selected Features after doing this (important!)
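The selection and coding logic of this step can be sketched in plain Python as follows. This is only an illustration under stated assumptions: straight-line distances on projected coordinates in feet, made-up school and site locations, and a hypothetical helper named code_buffer. ArcMap performs the equivalent work through Select by Location and the Field Calculator.

```python
import math

BUFFER_FEET = 5280  # 1 mile, as in the step above


def code_buffer(schools, toxic_sites, radius=BUFFER_FEET):
    """Return {school_id: 1 or 0}; 1 if any toxic site lies within radius."""
    flags = {}
    for sid, (sx, sy) in schools.items():
        near = any(
            math.hypot(sx - tx, sy - ty) <= radius for tx, ty in toxic_sites
        )
        flags[sid] = 1 if near else 0
    return flags


# Hypothetical coordinates in feet.
schools = {"S1": (0, 0), "S2": (10000, 0), "S3": (3000, 4000)}
toxic_sites = [(1000, 0), (3000, 9000)]
buffer_flag = code_buffer(schools, toxic_sites)
```

Schools that match no toxic site get 0 here, which corresponds to the rows left unselected when the Field Calculator is run on selected records only.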

3. Obtain student count totals within and beyond buffer.

--Join Dal_school_SPCS layer with Dal_sch_stats table
    --right click on Dal_school_SPCS and select Joins and Relates>Join
    --Box 1 ORG_NUM (for Dal_school_SPCS)
    --Box 2 Dal_sch_stats
    --Box 3 CAMPUS (for Dal_sch_stats in Box 2)
--open the attribute table for Dal_school_SPCS
    --be sure the join is correct!
    --click on heading for Dal_school_SPCS.buffer, select Summarize and Sum the six Dal_sch_stats demographic fields (Dal_sch_stats.CPETALLC, CPETBLAC, CPETECOC, CPETHISC, CPETPACC, CPETWHIC)
    --click expansion box, then check SUM
    --name the output table Sum_buffer
--the output table should have two rows
    --1 = within buffer
    --0 = outside buffer
--it is saved as a .dbf file which can be read by Excel
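A minimal Python sketch of the join-then-summarize logic in this step is shown below. The field names ORG_NUM, CAMPUS, buffer, CPETALLC and CPETBLAC come from the steps above; the row values are invented, and only two of the six count fields are included to keep the example short.

```python
from collections import defaultdict

# Hypothetical rows; field names follow the steps above.
schools = [
    {"ORG_NUM": 101, "buffer": 1},
    {"ORG_NUM": 102, "buffer": 0},
    {"ORG_NUM": 103, "buffer": 1},
]
stats = [
    {"CAMPUS": 101, "CPETALLC": 500, "CPETBLAC": 100},
    {"CAMPUS": 102, "CPETALLC": 800, "CPETBLAC": 300},
    {"CAMPUS": 103, "CPETALLC": 200, "CPETBLAC": 50},
]

# Join: look up each school's statistics row by ORG_NUM = CAMPUS.
stats_by_campus = {row["CAMPUS"]: row for row in stats}

# Summarize: sum the count fields for each value of buffer.
sum_buffer = defaultdict(lambda: defaultdict(int))
for school in schools:
    match = stats_by_campus.get(school["ORG_NUM"])
    if match is None:
        continue  # no matching statistics row for this school
    for field in ("CPETALLC", "CPETBLAC"):
        sum_buffer[school["buffer"]]["Sum_" + field] += match[field]
```

The result has one entry per buffer value (1 and 0), mirroring the two rows expected in Sum_buffer.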

4. Use Excel to open Sum_buffer.dbf and calculate the percentages for within and beyond buffer
--you may need to close ArcMap
--you should calculate percentages relative to total within buffer and beyond buffer
--this is the row sum given by the variable Sum_CPETAL

Null Hypothesis: there will be the same percentage for each group beyond and within the buffer
Alternative hypotheses: there will be a higher percentage of Black, Low Income, Hispanic, and Asian students and a lower percentage of white students within the buffer.

buffer                Count  Total    Black-AfAm  Low Income  Hispanic  Asian   White
Beyond (0)            378    270,742  79,994      143,326     102,226   10,304  77,020
Within (1)            95     61,750   11,948      38,428      33,199    2,328   13,997

Percentages relative to total within and total beyond (row sum)

                                      Black-AfAm  Low Income  Hispanic  Asian   White
beyond                                29.5        52.9        37.8      3.8     28.4
within                                19.3        62.2        53.8      3.8     22.7
Supports hypothesis?                  No          Yes         Yes       No      Yes
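The spreadsheet arithmetic can also be reproduced with a few lines of Python. The counts are copied from the table above; the function name row_percentages and the dictionary layout are illustrative choices, not part of the original workflow.

```python
# Student counts copied from the table above.
counts = {
    "beyond": {"Total": 270742, "Black-AfAm": 79994, "Low Income": 143326,
               "Hispanic": 102226, "Asian": 10304, "White": 77020},
    "within": {"Total": 61750, "Black-AfAm": 11948, "Low Income": 38428,
               "Hispanic": 33199, "Asian": 2328, "White": 13997},
}


def row_percentages(row):
    """Percentage of each group relative to the row's Total."""
    total = row["Total"]
    return {g: 100.0 * n / total for g, n in row.items() if g != "Total"}


percents = {name: row_percentages(row) for name, row in counts.items()}

for name, row in percents.items():
    print(name, " ".join(f"{g}: {p:.1f}" for g, p in row.items()))
```

Each percentage is relative to the row total, so the groups are compared as shares of all students within the buffer and of all students beyond it.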

5. We need to calculate a toxicity score for each school based on distance to toxic sites and the toxicity score of the site.

The ArcToolbox tool called POINTDISTANCE will calculate distance to all points within a given radius. However, it is only available in ArcInfo, the “top level” version of ArcGIS. To run POINTDISTANCE (if you have ArcInfo), go to ArcToolbox>Analysis>Proximity

Input features: dal_schools
Near features: dal_tox
Output table: sch_tox_dis
Search radius: 5280 feet

The sch_tox_dis table is available in the data folder if you do not have ArcInfo.
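For readers without ArcInfo, the following Python sketch shows the kind of table a point-distance calculation produces: one row for every input/near pair within the search radius. The column names INPUT_FID, NEAR_FID and DISTANCE follow the sch_tox_dis table described in this walkthrough; the function itself and the coordinates are assumptions for illustration, not the ArcToolbox implementation.

```python
import math


def point_distance(inputs, nears, radius):
    """Return rows (INPUT_FID, NEAR_FID, DISTANCE) for all pairs within radius."""
    table = []
    for in_fid, (ix, iy) in enumerate(inputs):
        for near_fid, (nx, ny) in enumerate(nears):
            d = math.hypot(ix - nx, iy - ny)
            if d <= radius:
                table.append(
                    {"INPUT_FID": in_fid, "NEAR_FID": near_fid, "DISTANCE": d}
                )
    return table


# Hypothetical coordinates in feet.
schools = [(0, 0), (20000, 0)]
toxic_sites = [(3000, 4000), (0, 1000), (20000, 6000)]
sch_tox_dis = point_distance(schools, toxic_sites, radius=5280)
```

A school with several toxic sites inside the radius appears on several rows, and a school with none does not appear at all.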

6. Add sch_tox_dis.dbf table to ArcMap and open it (right click on layer name and select Open)
--INPUT_FID is the feature identification (ID) number for schools
--NEAR_FID is the ID for toxic sites within 1 mile (5280 feet) of a school

Sort by INPUT_FID (right click column name and select Sort Ascending)
--note how some schools are listed multiple times (e.g. #5)
--there are multiple toxic sites within 1 mile

Right click on sch_tox_dis (join table) and join with Dal_toxic_SPCS (target table) using:
    Box 1 NEAR_FID
    Box 2 Dal_toxic_SPCS
    Box 3 FID
This adds toxic scores for each toxic site to the sch_tox_dis table. It is a “one to many” join.

7. Weight toxic score by distance from school to toxic site.

Add variable (field): open table, click Options and then Add Field Score_Dist as type Float (floating point)
--it's added “in the middle” of the columns as sch_tox_dis.Score_Dist

To calculate values: right click on this name and select Field Calculator
--click Yes to calculate outside an edit session
--build the expression:
    sch_tox_dis.Score_Dist = [Dal_toxic.SCORE] / [sch_tox_dis.DISTANCE]

Remove the join by right clicking on the table name (dal_tox_SPCS) and selecting Joins and Relates>Remove Join(s)
--Table has 5 variables and 135 observations (OID 0-134)
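The Field Calculator expression divides each site's toxicity score by its distance from the school. A small Python sketch of the same calculation, using invented rows with the field names from this step, might look like this:

```python
# Hypothetical rows of the joined table: one row per school/toxic-site pair.
sch_tox_dis = [
    {"INPUT_FID": 5, "NEAR_FID": 2, "DISTANCE": 1000.0, "SCORE": 50.0},
    {"INPUT_FID": 5, "NEAR_FID": 7, "DISTANCE": 4000.0, "SCORE": 20.0},
    {"INPUT_FID": 34, "NEAR_FID": 7, "DISTANCE": 2500.0, "SCORE": 20.0},
]

for row in sch_tox_dis:
    # Closer sites and more toxic sites both raise the weighted score.
    # A DISTANCE of zero would raise ZeroDivisionError; the source does not
    # say how coincident points are handled, so this sketch does not guess.
    row["Score_Dist"] = row["SCORE"] / row["DISTANCE"]
```

Dividing by distance means a site 1,000 feet away counts four times as much as an equally toxic site 4,000 feet away.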

8. Aggregate observations in sch_tox_dis table by school

Sort sch_tox_dis by INPUT_FID (schools) to see the need for aggregation by school (multiple toxic sites per school; note #5 and #34)

Right click on Score_Dist column heading and select Summarize
    1. Field to summarize: INPUT_FID (school)
    2. Summary statistics: click expansion box for Score_Dist and select SUM
    3. Output table is: sch_tox_score

Output table has four variables and 95 observations (rows): OID, INPUT_FID (school ID), frequency count, Sum_Score_Dist
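The Summarize operation here is a group-by-and-sum on INPUT_FID. The Python sketch below shows the idea with invented scores; the output keys Count and Sum_Score_Dist are stand-ins for the frequency count and summed score columns of sch_tox_score.

```python
from collections import defaultdict

# Hypothetical weighted scores, one row per school/toxic-site pair.
sch_tox_dis = [
    {"INPUT_FID": 5, "Score_Dist": 0.05},
    {"INPUT_FID": 5, "Score_Dist": 0.005},
    {"INPUT_FID": 34, "Score_Dist": 0.008},
]

totals = defaultdict(lambda: {"Count": 0, "Sum_Score_Dist": 0.0})
for row in sch_tox_dis:
    entry = totals[row["INPUT_FID"]]
    entry["Count"] += 1
    entry["Sum_Score_Dist"] += row["Score_Dist"]

# One output row per school, like the sch_tox_score table.
sch_tox_score = [{"INPUT_FID": fid, **vals} for fid, vals in sorted(totals.items())]
```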

9. Identify “Top 10” schools with highest toxicity scores

Find top ten:
--open sch_tox_score table
--right click on the heading Sum_Score_Dist and select Sort Descending
--select the top 10 schools by dragging mouse pointer down the grey boxes on the left side of the table (selected rows are highlighted in blue)

Create variable to identify Top 10:
--open the sch_tox_score table (if not already open)
--click Options (at bottom of table) and Add Field (type: short integer) called TopTen
--right click column heading for TopTen
--select Field Calculator
--calculate TopTen = 1
--calculation is applied to the selected records only
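Sorting, selecting the first ten rows, and coding TopTen can be sketched in Python as below. The helper flag_top_n and the twelve made-up scores are assumptions for illustration; rows outside the top ten are coded 0 here so that a later summary by TopTen yields two groups.

```python
def flag_top_n(rows, n=10, key="Sum_Score_Dist"):
    """Set TopTen = 1 on the n rows with the highest key value, 0 on the rest."""
    ranked = sorted(rows, key=lambda r: r[key], reverse=True)
    top_ids = {r["INPUT_FID"] for r in ranked[:n]}
    for r in rows:
        r["TopTen"] = 1 if r["INPUT_FID"] in top_ids else 0
    return rows


# Hypothetical scores for 12 schools: school i has score i / 100.
sch_tox_score = [{"INPUT_FID": i, "Sum_Score_Dist": i / 100} for i in range(12)]
flag_top_n(sch_tox_score)
```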

10. Map the top 10 toxic schools using Proportional Symbols.

Join sch_tox_score with spatial layer:
(sch_tox_score is a table and not a spatial layer. We must first join it to the original school layer (Dal_school_SPCS) for mapping.)
Right-click on Dal_school_SPCS and select Joins and Relates>Join
    Box 1 Dal_school_SPCS.FID
    Box 2 sch_tox_score
    Box 3 INPUT_FID
Important: In the Join Options box, check the option that keeps only matching records. This will keep only the schools within the 1 mile buffer.

Map the schools:
Right click Dal_school_SPCS and go to Properties>Symbology>Quantities>Proportional
--in Value box, select Sum_Score_Dist
--click Exclude, and then build the query sch_tox_score.TopTen < 1 (this excludes all values less than 1 from the map, so you just get the top 10)
--to change the size of the symbols, click Min Value box and change Size to 25 (or whatever value you think looks best)

11. Obtain student totals for ten schools with highest toxic score (Top 10).

Obtain student demographic data: (not necessary if you are continuing from Part I and have not removed any joins)
--Join Dal_sch_stats table to Dal_school_SPCS
--Right-click on Dal_school_SPCS and select Joins and Relates>Join
    Box 1 Dal_school_SPCS.ORG_NUM (ID of join field for Dal_school_SPCS)
    Box 2 Dal_sch_stats
    Box 3 CAMPUS (ID of join field for Dal_sch_stats)

Sum demographic data:
--open the Dal_school_SPCS table
--right click on variable name TopTen and select Summarize
    Box 1 sch_tox_score.TopTen
    Box 2 click expansion box next to student count variables and select Sum (Dal_sch_stats.CPETALLC, CPETBLAC, CPETECOC, CPETHISC, CPETPACC, CPETWHIC)
    Box 3 name the output table Sum_TopTen
--output table Sum_TopTen will have two rows
    TopTen = 1 are totals for top ten
    TopTen = 0 are totals for rest within the buffer

12. Copy the two rows from Sum_TopTen.dbf into the spreadsheet
--calculate percentages relative to row sum (given by CPETALLC, labeled Total below)

Counts for TopTen schools compared with all others within the buffer

                       Count  Total   Black-AfAm  Low Income  Hispanic  Asian  White
TopTen                 10     5,626   1,623       2,979       2,124     204    1,645
rest in buffer                56,124  10,325      35,449      31,075    2,124  12,352

Percents for TopTen schools compared with all others within the buffer (relative to row sum)

                                      Black-AfAm  Low Income  Hispanic  Asian  White
TopTen                                28.8        53.0        37.8      3.6    29.2
rest in buffer                        18.4        63.2        55.4      3.8    22.0
Supports hypothesis?                  yes         no          no        no     no