ESRI Developer Meetup June 7 2011 - Conflation, Data Quality and Record-Level Metadata
04/12/2023 U.S. Environmental Protection Agency
Conflation, Data Quality and MADness
ESRI Developer Meetup
June 7th, 2011
USEPA Office of Environmental Information
David G Smith PE PLS 202-566-0797
[email protected]:@DruidSmith
Metadata??
FRS Overview
• Facility Registry System
• FRS is a data aggregator
• FRS performs integration, validation and QA across more than 30 federal databases and more than 50 state, territory and tribal databases
• FRS contains information on nearly 2.8 million facilities
• More than 80% of facilities have lat/long information
• FRS improves program facility data validity from 40% to 95% by selecting the best contact and location information from multiple data sources
• Allows EPA, public, academic, and investment communities to evaluate compliance with environmental regulations
• Provides a robust, complete view of facility information, facilitating cross-media analyses:
– Community-based initiatives
– Environmental justice analyses
– NEPA assessments
– Emergency response
– Other mission needs (TMDL program, climate change analysis, etc.)
What FRS Does
FRS Features
• Provides a more complete, holistic, cross-media view of key facility information
– through verification and
– data management procedures
• Incorporates layers of quality control: the FRS record is checked for completeness, consistency, and validity and is owned by FRS
• Integrates information from program national systems, state master facility records, tribal partners, and other federal agencies
• Supported by a network of data stewards covering
– both geographic and
– programmatic areas of expertise
• Fully integrated with the Locational Data and the Integrated Error Correction Process (IECP)
FRS Features
• Provides essential support for applications that rely on integrated views of facilities
– GIS applications (EnviroMapper, MyEnvironment)
– Public access applications (Envirofacts, Cleanups in My Community (CIMC))
– Enforcement systems and applications (IDEA, OTIS, ECHO, ICIS)
• Offers specialized services to applications in need of accurate facility information
– Emergency Response
– TRI-ME web
– DMR Loadings Tool
• Provides web services, enabling data exchanges with state partners on the Environmental Exchange Network
FRS Scope
Major Programs Represented in FRS
http://www.epa.gov/enviro/html/frs_demo/new_crosswalks.html
• Air: AFS, AQS, CAMDBS, EGRID, NEI, RBLC, RFS (Ethanol)
• Water: PCS, ICIS-NPDES, SDWIS, CWNS
• Chemical Releases: TRIS, RMP, TSCA, SSTS, FRP, BRAC
• Hazardous Waste: ACRES, CERCLIS, RCRAINFO, RADINFO
• Enforcement/Compliance: ICIS, ECRM, NCDB
• Schools: NCES, GNIS, BIA INDIAN SCHOOL
• Other: LANDFILL
FRS Data Model
[Diagram: High Level Data Model. Entities shown: Facility/Site, Organization, Individual, Mailing Address, Affiliation, Environmental Interest, Industrial Classification, Supplemental Interest, Alternative Name, Geospatial]
FRS Data Pipeline
Clean & Validate
• Geo-codes & parses addresses
Integrate & Match
• Assigns a unique ID to each facility record
Select Best Pick
• Uses business rules to select the best contact/address & location
QA Process
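The Integrate & Match stage can be illustrated with a toy sketch. This is not FRS's actual matching logic. The normalization rule, the record fields, and the `assign_facility_ids` helper are all hypothetical, and real conflation would rely on much more robust address parsing and geocoding.

```python
import re
from itertools import count

def normalize_address(address: str) -> str:
    """Crude, hypothetical cleanup: uppercase, drop punctuation, collapse spaces."""
    cleaned = re.sub(r"[^A-Z0-9 ]", "", address.upper())
    return " ".join(cleaned.split())

def assign_facility_ids(records):
    """Give records that share a normalized address the same facility ID."""
    next_id = count(1)
    ids_by_address = {}
    matched = []
    for record in records:
        key = normalize_address(record["address"])
        if key not in ids_by_address:
            ids_by_address[key] = next(next_id)
        matched.append({**record, "facility_id": ids_by_address[key]})
    return matched

# Made-up program records, for illustration only.
records = [
    {"program": "AIR", "address": "100 Main St., Springfield"},
    {"program": "WATER", "address": "100 main st  springfield"},
    {"program": "WASTE", "address": "42 River Rd, Springfield"},
]
matched = assign_facility_ids(records)
```

Here the air and water records collapse to one facility ID because their addresses normalize to the same string, while the waste record gets its own.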
[Diagram: process steps: Format Addresses, Geocode Addresses, Standardize and Validate Geo Coordinates, Determine Facility Best Coordinate. Data flows: Facility Addresses, Standard Format Addresses, FRS Facility Geocoded Coordinates, Program and State Geo Coordinates, Validated Geo Coordinates, Best Geographic Coordinates]
Integration?
Air Permit Coordinate
Water Permit Coordinate
Toxics Permit Coordinate
Best Facility Coordinate?
Locational Data Accuracy and Best Pick
• FRS utilizes the EPA Lat/Long Data Standard
• Locational Reference Tables (LRT)
• Method Accuracy Description (MAD)
• Best Pick
LRT Record ID | Conveyor of Record | Program System Name | Program System ID | Program System Subentity ID | Program Latitude | Program Longitude | Best Value | Collection Method | Accuracy Value | Scale | MOD Score | Reference Point | Insertion Date | Coordinate Source | Map Coordinate
12135178 | CEDS | CEDS | 200000072141 | | 37.4511 | -77.4339 | N | | | | 932.3551 | | 39105 | | MAP
12651135 | NEI | NEI | NEIVA2561 | | 37.450939 | -77.434273 | N | UNKNOWN | | | 932.3551 | AIR RELEASE STACK | 39105 | | MAP
14542018 | RCRIS | RCRAINFO | VAD009305137 | | 37.451667 | -77.433333 | N | | | | 1137.0184 | REGULATED ENTITY | 39112 | | MAP
15512736 | RMP | RMP | 1E+11 | | 37.451917 | -77.433361 | N | | | | 2.37 | CENTER OF FACILITY | 39967 | | MAP
15727233 | PCS | PCS | VA0004669 | 001N9 | 37.448888 | -77.423888 | N | | | | 898.2445 | WATER RELEASE PIPE | 39234 | | MAP
15727234 | PCS | PCS | VA0004669 | 101R9 | 37.448888 | -77.423888 | N | | | | 1.58 | WATER RELEASE PIPE | 39234 | | MAP
15727235 | PCS | PCS | VA0004669 | 101N9 | 37.448888 | -77.423888 | N | | | | 1.58 | WATER RELEASE PIPE | 39234 | | MAP
15727236 | PCS | PCS | VA0004669 | 101B9 | 37.448888 | -77.423888 | N | | | | 898.2445 | WATER RELEASE PIPE | 39234 | | MAP
15727237 | PCS | PCS | VA0004669 | 101A9 | 37.448888 | -77.423888 | N | | | | 1.58 | WATER RELEASE PIPE | 39234 | | MAP
15727238 | PCS | PCS | VA0004669 | 102N9 | 37.448888 | -77.423888 | N | | | | 1.58 | WATER RELEASE PIPE | 39234 | | MAP
15727239 | PCS | PCS | VA0004669 | | 37.451111 | -77.433889 | N | INTERPOLATION-MAP | 50 | 24000 | 2.37 | FACILITY CENTROID | 39819 | | MAP
15727240 | PCS | PCS | VA0004669 | 003A9 | 37.454166 | -77.3875 | N | | | | 1.94 | WATER RELEASE PIPE | 39234 | | MAP
15727241 | PCS | PCS | VA0004669 | 002A9 | 37.454166 | -77.3875 | N | | | | 1.94 | WATER RELEASE PIPE | 39234 | | MAP
15727242 | PCS | PCS | VA0004669 | 103N9 | 37.448888 | -77.423888 | N | | | | 1.58 | WATER RELEASE PIPE | 39234 | | MAP
16137349 | AIRS/AFS | AIRS/AFS | 5104100001 | | 37.451111 | -77.433889 | N | | | | 932.3551 | | 39819 | | MAP
16137350 | AIRS/AFS | AIRS/AFS | 5104100001 | 17 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137351 | AIRS/AFS | AIRS/AFS | 5104100001 | 16 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137352 | AIRS/AFS | AIRS/AFS | 5104100001 | 15 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137353 | AIRS/AFS | AIRS/AFS | 5104100001 | 14 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137354 | AIRS/AFS | AIRS/AFS | 5104100001 | 12 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137355 | AIRS/AFS | AIRS/AFS | 5104100001 | 10 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137356 | AIRS/AFS | AIRS/AFS | 5104100001 | 9 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137357 | AIRS/AFS | AIRS/AFS | 5104100001 | 8 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137358 | AIRS/AFS | AIRS/AFS | 5104100001 | 6 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137359 | AIRS/AFS | AIRS/AFS | 5104100001 | 19 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137360 | AIRS/AFS | AIRS/AFS | 5104100001 | 18 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137361 | AIRS/AFS | AIRS/AFS | 5104100001 | 2 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137362 | AIRS/AFS | AIRS/AFS | 5104100001 | 4 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137363 | AIRS/AFS | AIRS/AFS | 5104100001 | 1 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16137364 | AIRS/AFS | AIRS/AFS | 5104100001 | 5 | 37.451111 | -77.434167 | N | | | | 932.3551 | | 39819 | | MAP
16446261 | TRIS-PREFERRED | TRIS | 23234DPNTSUSHIG | | 37.451667 | -77.435 | N | UNKNOWN | | | 28.6397 | UNKNOWN | 39323 | | MAP
16446262 | TRIS-REPORTED | TRIS | 23234DPNTSUSHIG | | 37.451666 | -77.435 | N | | | | 898.2445 | | 40265 | | MAP
17937134 | PCS | PCS | VA0004669 | 101O9 | 37.445833 | -77.429167 | N | | | | 898.2445 | WATER RELEASE PIPE | 39819 | | MAP
All underlying information from programs is retained, including locational data.
For any given facility, multiple individual locations may have been gathered, e.g. an associated air stack location, water outfall location, front gate location, et cetera.
http://www.epa.gov/enviro/html/locational/lrt_viewer.html
MAD Codes help us assess how to handle locational data quality and understand what each location represents.
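To get a feel for how far apart the locations recorded for one facility can be, the sketch below computes great-circle distances between three coordinates taken from the Locational Reference Table rows for this facility. The haversine formula on a spherical Earth is used here purely for illustration. It is not described in the slides as part of FRS processing, and the `haversine_m` helper name and the Earth radius value are assumptions.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in meters between two lat/long points (spherical Earth)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

# Coordinates copied from the LRT rows for this facility.
air_stack = (37.450939, -77.434273)   # NEI, AIR RELEASE STACK
water_pipe = (37.454166, -77.3875)    # PCS 003A9, WATER RELEASE PIPE
centroid = (37.451111, -77.433889)    # PCS, FACILITY CENTROID

stack_to_centroid = haversine_m(*air_stack, *centroid)
pipe_to_centroid = haversine_m(*water_pipe, *centroid)
```

The air stack sits within tens of meters of the facility centroid, while the water release pipe is several kilometers away, which is why knowing what each coordinate represents matters before choosing one to stand for the facility.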
Locational Reference Table
MAD Codes
• MAD Codes help us assess how to handle locational data quality
• They also help us understand what each location represents
MAD Codes
http://www.exchangenetwork.net/standards/Lat_Long_Standard_08_11_2006_Final.pdf
• FRS maintains a database table of manual verifications in the LRT.
– EPA/Regional verifications trump State verifications.
– Manually verified locations trump all the rest regardless of calculated accuracy or QA checks.
• In automated processing, Superfund NPL Site locations trump everything.
• Our “normal” process is based on supplied or implied accuracy and QA checks performed (MAD codes).
– EPA Latitude/Longitude Data Standard (http://www.exchangenetwork.net/standards/Lat_Long_Standard_08_11_2006_Final.pdf)
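The precedence rules above can be sketched as a sort key. This is a minimal illustration under stated assumptions, not the FRS implementation. The `Location` fields, the `"EPA"`/`"STATE"` codes, and the rule that a smaller accuracy value in meters wins within the normal tier are all hypothetical stand-ins for the MAD-code logic.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Location:
    source: str
    latitude: float
    longitude: float
    accuracy_m: Optional[float] = None  # supplied or implied accuracy; None if unknown
    verified_by: Optional[str] = None   # "EPA", "STATE", or None (hypothetical codes)
    is_npl_site: bool = False           # Superfund NPL site location

def precedence(loc: Location):
    """Sort key where the lowest tuple wins, following the listed precedence."""
    if loc.verified_by == "EPA":
        tier = 0  # EPA/Regional manual verification
    elif loc.verified_by == "STATE":
        tier = 1  # State manual verification
    elif loc.is_npl_site:
        tier = 2  # NPL site location, best among automated candidates
    else:
        tier = 3  # "normal" accuracy-based process
    # Assumption: within a tier, a smaller accuracy value (meters) is better.
    accuracy = loc.accuracy_m if loc.accuracy_m is not None else float("inf")
    return (tier, accuracy)

def best_pick(candidates):
    """Return the candidate location with the highest precedence."""
    return min(candidates, key=precedence)

# Made-up candidates, for illustration only.
candidates = [
    Location("AIR", 37.4511, -77.4339, accuracy_m=5.0),
    Location("WATER", 37.4489, -77.4239, accuracy_m=50.0, verified_by="STATE"),
    Location("WASTE", 37.4517, -77.4333),
]
chosen = best_pick(candidates)
```

In this example the state-verified water location is chosen even though the air location claims better accuracy, mirroring the rule that manual verifications outrank calculated accuracy.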
Select the “Best Pick” Information
• Users benefit from high-quality, integrated facility locational data for enforcement, compliance, analysis, assessment and community impact work
• The approach helps assess and manage large amounts of data of varying quality, e.g. volunteered geographic information (VGI)
Business Case
Thank You - URLs
Topic | URL
FRS Home Site | http://www.epa.gov/enviro/html/fii/
FRS Geodata Download | http://www.epa.gov/enviro/geo_data.html
My Environment | http://www.epa.gov/myenvironment/
EPA Geospatial Program | http://www.epa.gov/geospatial/index.html
EPA Geodata Gateway | https://geogateway.epa.gov/geoportal/catalog/main/home.page
EPA Geo Metadata | https://geogateway.epa.gov/EME/