GIS Data Quality
Producing better data quality through robust business processes
BrightStar TRAINING
Kim Ollivier

Schedule Day 2

Suggested times for sessions and breaks:

• Start: 9:00
• Session 1 (90 min)
• Morning tea: 10:30 to 10:45
• Session 2 (105 min)
• Lunch: 12:30 to 1:30
• Session 3 (90 min)
• Afternoon tea: 3:00 to 3:15
• Session 4 (105 min)
• Finish: 5:00

Each session will have an exercise or interactive discussion.

Topics

•Metadata

•Designing rules

•Data warehouse and ETL

•Feature maintenance

Metadata

• Data model
• Business rules, relations, state
• Subclasses (lookup tables)
• GIS Metadata NZGLS and ISO XML
• Readme.txt or readme.html

Metadata

• Which standard?
• ISO 19115, NZGMS
• Australia: asdd.ga.gov.au

Examine Metadata

• Geospatial metadata
• Benefit to users or producer?
• How do we collect it?
• Standardisation or not?
• metadata\topo250k_metadata.html
• metadata\DCW_DQ_Project.htm
• metadata\meta.html

Morning Tea

Data Quality Rules

• Attribute domain constraints
• Relational integrity rules
• Rules for historical data
• Rules for state-dependent objects
• General dependency rules
• Spatial feature rules

A GIS Data Quality System

[Diagram: components of a data quality system, with the headings Assess, Improve, Prevent, Recognise and Monitor. Component labels: Data Quality Assessment; Data Profiling; Data Cleaning; Monitoring Data Integration Interfaces; Ensuring Quality of Data Conversion and Consolidation; Building Data Quality Metadata Warehouse; Recurrent Data Quality Assessment.]

Assessing Quality

• Project steps
• Required roles
• Defining the objectives
• Designing rules
• Scorecard and Metadata
• Frequency of assessment

Building Rules

• Data profiling
• Interview users
• Examine data model
• Data Gazing
• Application v data matrix
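Data profiling, the first item above, can be tried on a small extract before any rules are written. The following is a minimal sketch only: it assumes the data is already loaded as a list of dictionaries with values of one type per column, and the function name `profile` and the sample columns are hypothetical.

```python
def profile(rows):
    """Summarise each column of a list of dict rows.

    Assumes every column holds values of a single comparable type.
    """
    summary = {}
    columns = {name for row in rows for name in row}
    for name in sorted(columns):
        values = [row.get(name) for row in rows]
        # Treat both None and the empty string as missing.
        present = [v for v in values if v is not None and v != ""]
        summary[name] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "min": min(present) if present else None,
            "max": max(present) if present else None,
        }
    return summary


# Hypothetical sample rows, for illustration only.
sample = [
    {"code": "RES", "area": 1.5},
    {"code": "", "area": 3.0},
    {"code": "RES", "area": None},
]
report = profile(sample)
```

Columns with many missing values, a single distinct value, or an implausible minimum or maximum are natural candidates for a rule.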

Attribute Domain Constraints

• Lookup tables
• Numeric ranges
• Null values
• Blank values
• Format constraints
• Precision
• Complex domain constraints
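Several of these constraint types can be expressed as simple per-record checks. The sketch below is illustrative only: the field names, the lookup codes, the identifier format and the area limit are all assumptions, not taken from the course dataset.

```python
import re

# Hypothetical lookup table and identifier format, for illustration only.
LAND_USE_CODES = {"RES", "COM", "IND", "RUR"}
VALUATION_ID = re.compile(r"^\d{5}/\d{3}$")  # assumed format, e.g. 12345/678


def check_domain(record):
    """Return a list of domain-constraint violations for one record (a dict)."""
    errors = []
    # Null and blank values are different failures and are reported separately.
    for field in ("land_use", "valuation_id"):
        value = record.get(field)
        if value is None:
            errors.append(f"{field}: null")
        elif value.strip() == "":
            errors.append(f"{field}: blank")
    # Lookup table: the code must be one of the permitted values.
    land_use = record.get("land_use")
    if land_use and land_use.strip() and land_use not in LAND_USE_CODES:
        errors.append(f"land_use: '{land_use}' not in lookup table")
    # Format constraint: the identifier must match the expected pattern.
    val_id = record.get("valuation_id")
    if val_id and val_id.strip() and not VALUATION_ID.match(val_id):
        errors.append(f"valuation_id: '{val_id}' has wrong format")
    # Numeric range: area must be positive and below an assumed upper bound.
    area = record.get("area_ha")
    if area is None:
        errors.append("area_ha: null")
    elif not 0 < area <= 100000:
        errors.append(f"area_ha: {area} out of range")
    return errors


good = check_domain({"land_use": "RES", "valuation_id": "12345/678", "area_ha": 1.5})
bad = check_domain({"land_use": "XXX", "valuation_id": " ", "area_ha": -1})
```

Returning a list of messages rather than a single pass/fail flag makes it easy to count violations per rule later, for example in a scorecard.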

Relational Integrity Rules

• Identity rule
• Reference rules
• Cardinal rules
• Inheritance rules
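The first three rule types can be checked without a database once the tables are in memory. This is a rough sketch under stated assumptions: tables are lists of dictionaries, and the table and column names used in the example (parcels, owners, `parcel_id`) are hypothetical.

```python
from collections import Counter


def identity_violations(rows, key):
    """Identity rule: every row has a non-null key and no key is repeated.

    Returns (set of duplicated keys, number of rows with a null key).
    """
    counts = Counter(row.get(key) for row in rows)
    duplicates = {k for k, n in counts.items() if k is not None and n > 1}
    return duplicates, counts.get(None, 0)


def reference_violations(children, fk, parents, pk):
    """Reference rule: every foreign key value points at an existing parent."""
    parent_keys = {row[pk] for row in parents}
    return [row for row in children if row[fk] not in parent_keys]


def cardinality_violations(children, fk, parents, pk, minimum=1, maximum=None):
    """Cardinal rule: each parent has between minimum and maximum children."""
    counts = Counter(row[fk] for row in children)
    bad = []
    for parent in parents:
        n = counts[parent[pk]]  # a Counter returns 0 for unseen keys
        if n < minimum or (maximum is not None and n > maximum):
            bad.append((parent[pk], n))
    return bad


# Hypothetical tables, for illustration only.
raw_parcels = [{"parcel_id": 1}, {"parcel_id": 2}, {"parcel_id": 2}, {"parcel_id": None}]
parcels = [{"parcel_id": 1}, {"parcel_id": 3}]
owners = [{"owner": "A", "parcel_id": 1}, {"owner": "B", "parcel_id": 9}]
```

Here `identity_violations(raw_parcels, "parcel_id")` reports the repeated key and the null key, `reference_violations(owners, "parcel_id", parcels, "parcel_id")` finds the orphaned owner, and `cardinality_violations(owners, "parcel_id", parcels, "parcel_id")` finds the parcel with no owner.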

Historical Data

• Time dependent attribute
• Value constraints
• Rates of change
• Volatility
• Continuity
• Granularity
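Continuity and rate-of-change rules can be sketched as a single pass over a time series. The example assumes one value per period, already sorted by period; the step of one period and the 50% change threshold are arbitrary illustrative choices.

```python
def history_violations(series, step=1, max_change=0.5):
    """Check a time series given as (period, value) tuples sorted by period.

    Returns a list of ("gap" | "jump", earlier_period, later_period).
    """
    problems = []
    for (p0, v0), (p1, v1) in zip(series, series[1:]):
        # Continuity / granularity: successive periods must be one step apart.
        if p1 - p0 != step:
            problems.append(("gap", p0, p1))
        # Rate of change: flag a relative change larger than max_change.
        # A zero earlier value is skipped to avoid dividing by zero.
        if v0 and abs(v1 - v0) / abs(v0) > max_change:
            problems.append(("jump", p0, p1))
    return problems


# Hypothetical yearly values, for illustration only.
values = [(2001, 100), (2002, 110), (2004, 300)]
issues = history_violations(values)
```

In this sample, the missing year 2003 is reported as a gap and the rise from 110 to 300 as a jump.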

State-dependent Objects

• State-transition models
• States, terminators
• Actions

[State-transition diagram: a start node and the states Active (A), On Leave (L), Terminated (T), Retired (R) and Deceased (D).]
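A state-transition model can be turned into a rule by listing the allowed moves and checking each recorded history against them. The state codes below come from the slide and look like employment statuses, but the arrows of the diagram are not recoverable here, so the transition table is an assumption made purely for illustration.

```python
# Assumed transitions: illustrative only, not taken from the slide's diagram.
ALLOWED = {
    "start": {"A"},
    "A": {"L", "T", "R", "D"},
    "L": {"A", "T", "R", "D"},
    "T": {"A"},   # assumption: a terminated person can become active again
    "R": {"D"},
    "D": set(),   # terminator: no further states are allowed
}


def invalid_transitions(history):
    """Return (position, from_state, to_state) for each disallowed step."""
    bad = []
    previous = "start"
    for i, state in enumerate(history):
        if state not in ALLOWED.get(previous, set()):
            bad.append((i, previous, state))
        previous = state
    return bad


ok = invalid_transitions(["A", "L", "A", "R", "D"])
broken = invalid_transitions(["A", "D", "A"])
```

The second history is rejected because it leaves the terminator state D.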

Event Histories

• An object may have many events
• Event Overlaps
• Event Frequencies
• Event Conditions
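An event-overlap rule can be sketched as follows, assuming each event of an object is stored as a (start, end) pair of dates and that an event ending on the day the next one starts does not count as an overlap. Both assumptions, and the sample dates, are illustrative.

```python
from datetime import date


def overlapping_events(events):
    """Return pairs of (start, end) events whose periods overlap."""
    ordered = sorted(events)
    overlaps = []
    for i, current in enumerate(ordered):
        for later in ordered[i + 1:]:
            if later[0] >= current[1]:
                # Sorted by start date, so no later event can overlap current.
                break
            overlaps.append((current, later))
    return overlaps


# Hypothetical events for one object, for illustration only.
e1 = (date(2020, 1, 1), date(2020, 6, 30))
e2 = (date(2020, 6, 1), date(2020, 12, 31))
e3 = (date(2021, 1, 1), date(2021, 12, 31))
found = overlapping_events([e3, e1, e2])
```

Sorting first keeps the check cheap for long histories, because the inner loop stops at the first event that starts after the current one ends.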

Spatial Rules

• Projection, units
• Dimensions 2D, 3D, M, Z
• Point, line, poly
• Precision
• Topology
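Some of these rules can be checked on raw coordinates without a GIS library. The sketch below is deliberately minimal and rests on assumptions: a polygon is given as a list of coordinate tuples for its outer ring, the extent stands in for the valid range of the projection, and the only topology check is ring closure. The default extent and precision are hypothetical values.

```python
def spatial_violations(ring, dims=2, extent=(0.0, 0.0, 1000.0, 1000.0), decimals=2):
    """Check one polygon ring given as a list of coordinate tuples."""
    xmin, ymin, xmax, ymax = extent
    problems = []
    # Topology (minimal): a ring needs at least 4 points and must be closed.
    if len(ring) < 4 or ring[0] != ring[-1]:
        problems.append("ring has fewer than 4 points or is not closed")
    for i, pt in enumerate(ring):
        # Dimensions: every vertex must have the expected number of ordinates.
        if len(pt) != dims:
            problems.append(f"vertex {i}: expected {dims} dimensions")
            continue
        # Projection / units: coordinates must fall inside the assumed extent.
        x, y = pt[0], pt[1]
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            problems.append(f"vertex {i}: outside extent")
        # Precision: no more decimal places than the data specification allows.
        if any(round(c, decimals) != c for c in pt):
            problems.append(f"vertex {i}: more than {decimals} decimal places")
    return problems


# Hypothetical rings, for illustration only.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0), (0.0, 0.0)]
faulty = [(0.0, 0.0), (2000.0, 0.0), (3.14159, 10.0), (0.0, 10.0)]
```

Real topology rules (overlaps, gaps, self-intersections) need a geometry engine; this sketch only covers what can be tested vertex by vertex.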

Valuation Roll

• Legacy structure, 50 years old
• Variable maintenance standard
• Valuer General audit (DQ spec)

Rules Exercise

• Split into pairs
• Examine sample DVR dataset
• Devise some rules for each category
• Verbal discussion with class

Lunch

Data Warehouse & ETL

• Why not direct access to online DB?
• Staging Area
• Scripting tools
• Trade-offs
• KPI for project
  • better quality than source
  • better quality than target

ETL Extract

• Extract

ETL Transform

• The importance of primary keys
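The slide does not elaborate, but one likely reason primary keys matter in the transform step is that a stable key lets each extract be compared with what is already in the warehouse. The following sketch assumes that reading; the function name `diff_by_key` and the sample rows are hypothetical, and keys are assumed to be unique and sortable.

```python
def diff_by_key(source, target, key):
    """Compare two lists of dict rows on a primary key.

    Returns (inserts, updates, deletes) needed to make target match source.
    """
    src = {row[key]: row for row in source}
    tgt = {row[key]: row for row in target}
    inserts = [src[k] for k in sorted(src.keys() - tgt.keys())]
    deletes = [tgt[k] for k in sorted(tgt.keys() - src.keys())]
    updates = [src[k] for k in sorted(src.keys() & tgt.keys()) if src[k] != tgt[k]]
    return inserts, updates, deletes


# Hypothetical source extract and warehouse table, for illustration only.
extract = [{"id": 1, "use": "RES"}, {"id": 2, "use": "COM"}, {"id": 4, "use": "IND"}]
warehouse = [{"id": 1, "use": "RES"}, {"id": 2, "use": "RES"}, {"id": 3, "use": "RUR"}]
inserts, updates, deletes = diff_by_key(extract, warehouse, "id")
```

Without a reliable key, the same comparison has to fall back on matching attributes or geometry, which is slower and far more error-prone.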

ETL Load

• Batch offline most common
• Daily status usually enough

Safe Software FME

• Examples

Afternoon Tea

Data Quality Team

[Diagram: IT, DQ Team, Users]

Maintenance of features

• Time series important
• Line/polygon features are not atomic
• Splitting loses inheritance
• Calculating depreciation
• Direct editing bypasses business rules

Maintenance of the Quality

• Gardening, not mountain climbing
• Discussion of course topics

References

• Data Quality: The Accuracy Dimension – Jack E. Olson
• The Data Warehouse ETL Toolkit – Ralph Kimball

Please fill in evaluation forms

Finish