EEA Data Quality Management supporting INSPIRE implementation

Daniela Cristiana Docan | 6th Sept. | INSPIRE Conference 2017, Strasbourg



Page 2

Data Quality in INSPIRE

INSPIRE Technical Guidelines use ISO 19157 Geographic Information – Data Quality

Page 3

•Data Quality Elements mentioned in INSPIRE TG

ISO 19157 Geographic Information – Data Quality

Page 4

Data Quality in INSPIRE

INSPIRE TGs cover:

1. Data quality elements/sub-elements
2. Data quality measures (tests to be applied on dataset)
3. Minimum data quality requirements/conformance quality level

ISO 19157 Geographic Information – Data Quality

Page 5

•Data Quality in INSPIRE : Protected Sites TG

Recommendations:

• DQ elements and sub-elements
• Corresponding DQ measures
• Minimum data quality requirements

Page 6

acceptance criteria or conformance quality level

•Data Quality in INSPIRE : Protected Sites TG

Page 7

Item: fields, records, value, features, relationships, files in the dataset package

•Data Quality in INSPIRE : Protected Sites TG

Page 8

INSPIRE data quality requirements

INSPIRE: Completeness/Omission

INSPIRE: Rate of missing items

INSPIRE: No recommendation/constraints

Page 9

EEA’s Data Flow stages

•Source: EEA Common Workspace/Generic QA/QC

• Guidelines for Reporting Obligations, XML schema, Database schema, Quality control checks
• A priori DQ requirements (absolute positional accuracy: 10 m)

“what we want”

• Automatic and manual quality checks
• Conformance test (minimum data quality requirements)
• Metadata and/or standalone data quality report
• A posteriori DQ results/values

“what we get”

• Run automatic quality checks [QA scripts – XQuery]
• Automatic QA report for MS

• ETL (Extract Transform Load) tools
• Automatic and manual quality checks

Page 10

•EEA’s Automatic quality report

•Source: Eionet Central Data Repository (CDR)

equivalent to measures in ISO standard

Page 11

•How will EEA’s quality checks connect with ISO elements & measures?

Requirements in INSPIRE:

• Element: Completeness-Omission
• Standardised Measure: Rate of missing items
• Minimum data quality requirements: None

EEA’s CDF’s quality checks:

• ✓ Mandatory values
• [Error rate] (e.g. real, percentage, ratio)
• ✓ User defined data quality measure: All records must have the SITE_CODE field filled
• Conformance test: acceptance criterion is [0%] errors in the dataset

Source: Eionet Central Data repository (CDR)
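
The mapping on this slide can be illustrated with a minimal Python sketch. It computes the "Rate of missing items" as an error rate for the SITE_CODE field and applies the [0%] acceptance criterion as the conformance test. The function name, the result type and the sample records are assumptions made for illustration only; the actual EEA checks are QA scripts run in the CDR and are not shown in the slides.

```python
from dataclasses import dataclass


@dataclass
class OmissionResult:
    total: int         # number of records checked
    missing: int       # records with an empty mandatory field
    error_rate: float  # missing / total, as a fraction between 0 and 1
    conforms: bool     # True when error_rate is within the acceptance criterion


def rate_of_missing_items(records, field="SITE_CODE", max_error_rate=0.0):
    """Completeness-Omission: share of records whose mandatory field is empty."""
    total = len(records)
    # A value counts as missing when it is absent, None, or only whitespace.
    missing = sum(1 for r in records if not str(r.get(field) or "").strip())
    rate = missing / total if total else 0.0
    return OmissionResult(total, missing, rate, rate <= max_error_rate)


# Hypothetical records; the site codes and names are made up.
records = [
    {"SITE_CODE": "AT0001", "SITE_NAME": "Site A"},
    {"SITE_CODE": "", "SITE_NAME": "Site B"},
    {"SITE_CODE": None, "SITE_NAME": "Site C"},
    {"SITE_CODE": "AT0004", "SITE_NAME": "Site D"},
]

result = rate_of_missing_items(records)
print(f"{result.missing}/{result.total} missing "
      f"({result.error_rate:.0%}), conforms: {result.conforms}")
```

With an acceptance criterion of 0%, a single empty SITE_CODE is enough to fail the conformance test; a more tolerant data flow could pass a larger `max_error_rate`.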

Page 12

Data Quality Rule Registry (DQRR)

• [catalogue of standardized and user defined data quality measures]

= Measures in ISO 19157

= Elements in ISO 19157
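
A catalogue of this kind could be sketched as a small data structure in which every check is linked to an ISO 19157 element, sub-element and measure. The sketch below is hypothetical: the class, the field names and the rule identifiers are assumptions, not the DQRR schema, and only two ISO 19157 elements are listed. The assignment of the decimal-places check to Format consistency is chosen only for illustration, since the slides leave that classification as an open question.

```python
from dataclasses import dataclass

# Subset of ISO 19157 data quality elements and their sub-elements.
ISO_19157_ELEMENTS = {
    "Completeness": {"Commission", "Omission"},
    "Logical consistency": {
        "Conceptual consistency",
        "Domain consistency",
        "Format consistency",
        "Topological consistency",
    },
}


@dataclass(frozen=True)
class QualityRule:
    rule_id: str        # hypothetical identifier
    check: str          # what the QA script verifies
    element: str        # ISO 19157 element
    sub_element: str    # ISO 19157 sub-element
    measure: str        # linked measure name
    user_defined: bool  # False for a standardized measure

    def __post_init__(self):
        # Reject rules that are not linked to a known element/sub-element pair.
        allowed = ISO_19157_ELEMENTS.get(self.element, set())
        if self.sub_element not in allowed:
            raise ValueError(
                f"{self.sub_element!r} is not a sub-element of {self.element!r}"
            )


def by_element(rules):
    """Group rule identifiers by (element, sub-element)."""
    grouped = {}
    for rule in rules:
        key = (rule.element, rule.sub_element)
        grouped.setdefault(key, []).append(rule.rule_id)
    return grouped


registry = [
    QualityRule("DQ-001", "All records must have the SITE_CODE field filled",
                "Completeness", "Omission", "Rate of missing items",
                user_defined=True),
    # Illustrative classification only; see the lead-in.
    QualityRule("DQ-002", "Each coordinate is reported to 4 decimal places",
                "Logical consistency", "Format consistency", "(none assigned)",
                user_defined=True),
]

print(by_element(registry))
```

Validating the element and sub-element at construction time means a check cannot enter the catalogue without an ISO 19157 category, which is the linkage the registry is meant to provide.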

Page 13

Data Quality Rule Registry (DQRR)

Page 14

•Data quality checks – different points of view

1.“Minimal mapping unit”

[the smallest size of area allowed to be represented in a given data set]

Topological consistency or conceptual consistency?

2. “C34 – Coordinate accuracy”

“These are required to be in format the ETRS89 (2D)-EPSG:4258 coordinate reference system, with a 10m accuracy. Hence a check is required to ensure that, when coordinates are reported, each coordinate is to 4 decimal places, adhering to the 10m accuracy required.”

CDF’s guideline

The number of decimal places for decimal degrees coordinates

☺ Precision or resolution, not absolute positional accuracy

Source: www.ncetm.org.uk
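
The distinction can be made concrete with a short Python sketch that converts a number of decimal places into the approximate ground distance of one unit in the last decimal place. It assumes a spherical approximation of about 111.32 km per degree of latitude; the function names and the sample coordinate are illustrative and do not come from the CDF guideline.

```python
import math

# Approximate length of one degree of latitude in metres (spherical assumption).
METRES_PER_DEGREE = 111_320.0


def ground_resolution_m(decimal_places, latitude_deg=0.0):
    """Ground distance of one unit in the last decimal place of a decimal-degree
    coordinate, returned as (north_south, east_west) in metres."""
    step_deg = 10 ** -decimal_places
    north_south = step_deg * METRES_PER_DEGREE
    # Meridians converge towards the poles, so the east-west step shrinks.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


def decimal_places(text):
    """Count the digits after the decimal point in a coordinate string."""
    _, _, fraction = text.partition(".")
    return len(fraction)


# Hypothetical reported latitude, roughly at the latitude of Strasbourg.
reported = "48.5839"
ns, ew = ground_resolution_m(decimal_places(reported), latitude_deg=float(reported))
print(f"{decimal_places(reported)} decimal places: "
      f"about {ns:.1f} m north-south, {ew:.1f} m east-west")
```

Four decimal places correspond to a resolution of roughly 11 m north-south, which is where the link to "10m" comes from. A coordinate can still pass this check and be hundreds of metres away from the true position, so the check says nothing about absolute positional accuracy.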

Logical consistency: Conceptual consistency or Format consistency?

The error vector for a single point (Source: Weir et al., 2001, p.413)
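
For contrast with the decimal-places check, absolute positional accuracy compares a reported position against a reference position of higher accuracy. The sketch below uses the usual planar definition of the error vector, which is an assumption here because the cited figure is not reproduced in this text. Coordinates are taken to be in metres in a projected CRS (for example ETRS89-LAEA, EPSG:3035), and the sample values are invented.

```python
import math


def error_vector(measured, reference):
    """Return (dx, dy, length) of the error vector between a measured point
    and its reference position, both given as (x, y) in metres."""
    dx = measured[0] - reference[0]
    dy = measured[1] - reference[1]
    return dx, dy, math.hypot(dx, dy)


def within_tolerance(measured, reference, tolerance_m=10.0):
    """Check a single point against an a priori accuracy requirement."""
    return error_vector(measured, reference)[2] <= tolerance_m


# Hypothetical point pair in a projected CRS.
measured = (4321006.0, 3210008.0)
reference = (4321000.0, 3210000.0)

dx, dy, length = error_vector(measured, reference)
print(f"dx={dx} m, dy={dy} m, error={length} m, "
      f"within 10 m: {within_tolerance(measured, reference)}")
```

Unlike counting decimal places, this test needs reference data, which is why it is a different kind of check from a format or resolution rule.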

Page 15

Conclusions

INSPIRE Data Specification requirements/recommendations

Data Specifications are not restrictive on the data quality (e.g. elements to be covered, measures (tests) to be applied, or minimum data quality requirements)

Consistency in defining data quality elements and measures (tests) across different annexes and/or themes

(e.g. Annex III - Topological consistency and Temporal consistency and validity)

Page 16

•Conclusions

EEA’s QA/QC workflow

• Fulfils the INSPIRE requirements on data quality
• New components of the data quality will be covered/improved (e.g. absolute positional accuracy)
• The Data Quality Rule Registry (DQRR) project promotes the “interoperability of the data quality” by proposing:

• Common criteria to categorise data quality checks across EEA’s production stages
• Harmonise DQ terminology across different Core Data Flows (e.g. record uniqueness, duplicate elements, duplicates entries, duplicate value, duplicities, and uniqueness of primary key)
• Assign/link the existing DQ checks to ISO quality elements and sub-elements
• Harmonise the Standalone Data Quality Reports

Page 17

•Q&A

Thank you,

Daniela Cristiana Docan
