co-funded by the European Community eContentplus programme The NATURE-SDIplus Validation methodology


Page 1:

co-funded by the European Community eContentplus programme

The NATURE-SDIplus Validation methodology

Page 2:

PRESENTATION OUTLINE

Overall testing and validation approach
Validation of data specification and data encoding
Data accessibility and usability testing
Data Quality evaluation
Data generalisation

Page 3:

NATURE-SDIplus MAIN OUTCOMES

Harmonised DS & MD (PS, BR, HB, SD)
GEOPORTAL (network services)
Data Models for 3 Annex III themes (BR, HB, SD)
NatSDI MD profile(s)

Page 4:

NATURE-SDIplus DATASETS AND METADATA

Before Task 4.1: DS + MD (PS, BR, HB, SD) before harmonisation
After Task 4.1: Harmonised DS + MD (PS, BR, HB, SD) after harmonisation

Page 5:

WP5: TASKS AND INTER-RELATIONSHIPS

T5.1: INSPIRE validation
T5.2: Test on data accessibility & usability
T5.3: Quality evaluation and dataset generalisation

Page 6:

NATURE-SDIplus VALIDATION METHODOLOGY

Generic validation process

Covers both:
Validation of specification encoding
Validation of data encoding

NATURE-SDIplus specifications and test data as examples

Page 7:

VALIDATION OF SPECIFICATION ENCODING

The required steps:
Validate Schema
Check transposition of specification
Check validatability

The process: [diagram]

Page 8:

METADATA VALIDATION

The required steps:
Syntactic validation
Semantic validation

The process: [diagram]

Page 9:

DATA VALIDATION

The required steps:
Syntactic validation
Semantic validation

The process: [diagram]
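The two data validation steps can be illustrated with a minimal sketch, assuming a GML 3.2 document. The syntactic step below only checks well-formedness and use of the GML 3.2 namespace; it does not replace XML Schema validation, which needs a schema-aware validator. The semantic step stands in for a Schematron-style rule. The rule itself (gml:id values must be non-empty and unique) and all function names are hypothetical and are not taken from the NATURE-SDIplus Validation Briefcase.

```python
import xml.etree.ElementTree as ET

GML_32_NS = "http://www.opengis.net/gml/3.2"  # namespace of GML 3.2.x


def syntactic_check(xml_text):
    """Return (root, errors): well-formedness and GML 3.2 namespace use only."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return None, [f"not well-formed: {exc}"]
    prefix = "{" + GML_32_NS + "}"
    uses_gml32 = any(
        el.tag.startswith(prefix) or any(k.startswith(prefix) for k in el.attrib)
        for el in root.iter()
    )
    errors = [] if uses_gml32 else ["no GML 3.2 elements or attributes found"]
    return root, errors


def semantic_check(root):
    """Hypothetical rule: gml:id attributes must be non-empty and unique."""
    id_attr = "{" + GML_32_NS + "}id"
    seen, errors = set(), []
    for el in root.iter():
        gml_id = el.attrib.get(id_attr)
        if gml_id is None:
            continue
        if not gml_id.strip():
            errors.append(f"empty gml:id on {el.tag}")
        elif gml_id in seen:
            errors.append(f"duplicate gml:id '{gml_id}'")
        seen.add(gml_id)
    return errors


def validate(xml_text):
    """Run the syntactic step first; semantic rules need a parsed document."""
    root, errors = syntactic_check(xml_text)
    if root is None:
        return errors
    return errors + semantic_check(root)
```

An empty list returned by `validate` means that neither step reported a problem under these simplified rules.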

Page 10:

VALIDATION BRIEFCASE OVERVIEW

Page 11:

RESULTS OF THE USE OF THE VALIDATION BRIEFCASE

Harmonisation status:
Harmonisation completed; 215; 92%
Harmonisation in progress; 19; 8%

Validation reports:
Validation Reports uploaded; 93; 43%
Validation Reports not uploaded; 122; 57%

Validation status:
Validation Completed; 63; 29%
Validation Not Completed; 14; 7%
Validation Not Applicable; 138; 64%

Reasons for Validation Not Applicable:
Invalid file format (GML, not version 3.2.1); 79; 57.2%
Invalid file format (shp); 46; 33.3%
Other reasons (GML not uploaded, etc.); 13; 9.4%

Validation Completed:
with Non Conformities; 31; 49.2%
without Non Conformities; 32; 50.8%

Non Conformities (NCs):
only Schema Validation NCs; 23; 74%
only INSPIRE theme Schematron Validation NCs; 4; 13%
both Schema Validation and INSPIRE theme Schematron Validation NCs; 4; 13%
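The percentages on this slide can be re-derived from the reported counts. The following is a minimal sketch, assuming that each group of slices forms one chart and that the grouping used here is correct; the grouping is inferred from the totals (for example 93 + 122 = 215) and is not stated on the slide. Dictionary keys are shorthand labels, not the original legend text.

```python
def shares(counts, digits=0):
    """Percentage of the group total for each label, rounded to `digits`."""
    total = sum(counts.values())
    return {label: round(100 * n / total, digits) for label, n in counts.items()}


# Counts as reported on the slide; the grouping into charts is an assumption.
harmonisation = {"completed": 215, "in progress": 19}
reports = {"uploaded": 93, "not uploaded": 122}
validation = {"completed": 63, "not completed": 14, "not applicable": 138}
not_applicable = {"GML not 3.2.1": 79, "shp": 46, "other": 13}
conformity = {"with NCs": 31, "without NCs": 32}
nc_kind = {"schema only": 23, "Schematron only": 4, "both": 4}

# Consistency checks: each sub-chart should sum to its parent slice.
assert sum(reports.values()) == harmonisation["completed"]
assert sum(validation.values()) == harmonisation["completed"]
assert sum(not_applicable.values()) == validation["not applicable"]
assert sum(conformity.values()) == validation["completed"]
assert sum(nc_kind.values()) == conformity["with NCs"]

for name, group, digits in [
    ("harmonisation", harmonisation, 0),
    ("reports", reports, 0),
    ("validation", validation, 0),
    ("not applicable", not_applicable, 1),
    ("conformity", conformity, 1),
    ("NC kind", nc_kind, 0),
]:
    print(name, shares(group, digits))
```

Under this grouping every sub-chart sums to its parent slice, which supports reading the 215 harmonised datasets as the base for both the report and the validation charts.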

Page 12:

STATISTICS OF THE REMODELLING

Tools used to harmonise the 63 datasets for which the validation has been completed:
HALE; 10; 15.9%
Geoconverter; 8; 12.7%
FME, XMLSpy, deegree; 1; 1.6%
GO Publisher; 30; 47.6%
ArcGIS Desktop 9.3, Quantum GIS, OGR tools, Altova MapForce, Oxygen XML Editor (Humboldt tools); 14; 22.2%

Page 13:

ASSESSING DATA ACCESSIBILITY

Level I Criteria: Discoverability
Level II Criteria: Data Search, Multilingualism and Semantic search, Explore Metadata Details, Panning, Zooming and Exploring Feature Information
Test Item: Geoportal functionalities enabling search of metadata and access to harmonised datasets
Test Method: Evaluate using the test criterion in section 3.1

Level I Criteria: Retrievability
Level II Criteria: Retrieving Spatial Features, Downloading GML
Test Item: Geoportal functionalities enabling the download of GML as zip files
Test Method: Evaluate using the test criterion in section 3.1

Level I Criteria: Exploitability
Level II Criteria: Performance, Availability, Reliability, Compliance, Security
Test Item: Portal and download services offered by the geoportal
Test Method: Evaluate using the test criterion in section 3.1

Page 14:

ASSESSING DATA USABILITY

STEP 1: Design of online questionnaires

STEP 2: Distribution and survey

STEP 3: Result gathering and analysis

STEP 4: Reporting

Page 15:

DATA USABILITY ON-LINE QUESTIONNAIRE (1/2)

First part to collect info about the user:
extent of the geographical AOI used
group of stakeholders the user belongs to
type of professional activity / field of expertise
data theme assessed
keywords used during data search

Page 16:

DATA USABILITY ON-LINE QUESTIONNAIRE (2/2)

Second part to collect info about how data relevant to a given theme are usable:
within the geoportal (using its functionalities)
outside the geoportal (downloading the data via the geoportal and using them inside your application, and/or consuming the WMS/WFS directly in your application).

The user is asked to rate as poor, moderate, good or excellent her/his level of satisfaction in using:
the overall Geoportal functionalities
the specific search functionalities
the data within the Geoportal
the data outside the Geoportal

Built using Google docs tools
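Processing the four-level satisfaction ratings can be sketched as a simple tally. This is a minimal illustration, assuming the answers are exported as plain strings; the function name and the sample answers are hypothetical and do not reproduce the project's survey results.

```python
from collections import Counter

SCALE = ("poor", "moderate", "good", "excellent")


def tally(answers):
    """Count answers per rating, keeping the scale order and zero counts."""
    counts = Counter(a.strip().lower() for a in answers)
    unknown = set(counts) - set(SCALE)
    if unknown:
        raise ValueError(f"unexpected ratings: {sorted(unknown)}")
    return {rating: counts.get(rating, 0) for rating in SCALE}


# Hypothetical answers to one of the four satisfaction questions
answers = ["good", "Excellent", "good", "moderate", "good"]
summary = tally(answers)
print(summary)
```

Keeping zero counts in the result makes the four questions directly comparable, even when a rating was never chosen.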

Page 17:

28/06/2011, INSPIRE Conference 2011

QUESTIONNAIRES PROCESSING (1/4)

Page 18:

QUESTIONNAIRES PROCESSING (2/4)

Page 19:

QUESTIONNAIRES PROCESSING (3/4)

Page 20:

QUESTIONNAIRES PROCESSING (4/4)

Page 21:

DATA QUALITY EVALUATION

Main objective of Task 5.3 in terms of quality evaluation: to assess the quality of the harmonised vs. the source datasets

A four-step methodology has been developed and applied

Page 22:

DATA QUALITY EVALUATION METHODOLOGY

Step 1: Deep analysis of the background documentation:
the international standards EN ISO 19113, 19114, ISO/TS 19138
the data quality issues covered by INSPIRE
the NatureSDIplus Metadata profile
from which the data quality elements and subelements, together with the corresponding measures and their reporting, have been extracted

Step 2: Elaboration of a set of guidelines enabling the quality evaluation of spatial datasets belonging to the four INSPIRE themes covered by NatureSDIplus (PS, BR, HB, SD)

Step 3: Adaptation of the step 2 guidelines in order to use the selected data quality elements and subelements to assess the quality of the NatureSDIplus harmonised vs. source datasets

Step 4: Application of the step 3 guidelines to 4 harmonised datasets (1 harmonised dataset for each of the four INSPIRE themes – PS, BR, HB, SD) and reporting of the quality evaluation results

Page 23:

QUALITY OF DS & MD

EN ISO 19113 Geographic Information – Quality principles
EN ISO 19114 Geographic Information – Quality evaluation procedures
ISO/TS 19138 Geographic information – Data quality measures
EN ISO 19115 Geographic Information – Metadata

DQ MD

INSPIRE DS Req’s and Rec’s

Page 24:

METHODOLOGY FOLLOWED TO DEVELOP THE GUIDELINES FOR DQ EVALUATION

Select the DQ elements and sub-elements (cross-checking INSPIRE PS Data Specifications, NatureSDIplus MD profile and EN ISO 19113)

For each sub-element define a DQ measure (in adherence to ISO/TS 19138)

For each sub-element define a DQ reporting (in adherence to EN ISO 19114 and EN ISO 19115)

For each sub-element provide an example of DQ evaluation

Page 25:

DATA QUALITY ELEMENTS AND SUB-ELEMENTS

Data quality element | Data quality sub-element | Covered by INSPIRE specification for Protected sites | Covered by NatureSDI+ Metadata profile
Completeness | Commission | Optional - PS | Optional - PS
Completeness | Omission | Optional - PS | Optional - PS, BR, HB, SD
Positional accuracy | Absolute or external accuracy | Optional - PS | Optional - PS
Temporal accuracy | Accuracy of a time measurement | - | Optional - BR, HB, SD
Temporal accuracy | Temporal consistency | - | Optional - BR, HB, SD
Thematic accuracy | Classification correctness | - | Optional - BR, HB, SD
Thematic accuracy | Quantitative attribute correctness | - | Optional - BR, HB, SD

Page 26:

DATA QUALITY EVALUATION REPORTING

Data quality scope: All items of PS datasets of CountryX

Data quality element: Completeness
Data quality subelement: Commission
Data quality measure
  Data quality measure description: Rate of excess items
  Data quality basic measure: Error rate
  Data quality measure identification code: 3 (ISO/TS 19138)
Data quality evaluation method
  Data quality evaluation method type: External
  Data quality evaluation method description: Number of excess items in the dataset in relation to the number of items that should have been present
Data quality result
  Data quality value type: Percentage
  Data quality value: 0%
  Data quality value unit: -
  Data quality date: 2011-02-01
  Conformance quality level: Zero items
  Dataset parameters: 0 excess items are present in the harmonised dataset; 480 items are present in the dataset.
  Quality result meaning: Dataset passes. No excess items exist.

Data quality element: Completeness
Data quality subelement: Omission
Data quality measure
  Data quality measure description: Rate of missing items
  Data quality basic measure: Error rate
  Data quality measure identification code: 7 (ISO/TS 19138)
Data quality evaluation method
  Data quality evaluation method type: External
  Data quality evaluation method description: Number of missing items in the dataset in relation to the number of items that should have been present
Data quality result
  Data quality value type: Ratio
  Data quality value: 20:500
  Data quality value unit: -
  Data quality date: 2011-02-02
  Conformance quality level: Zero items
  Dataset parameters: 480 items in dataset are within the data quality scope; 500 items in the universe of discourse are within the scope.
  Quality result meaning: Dataset fails. The number of missing items in the dataset exceeds the data quality conformance quality level.
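The two completeness results reported on this slide can be reproduced with a short calculation. This is a minimal sketch using the slide's figures (0 excess items, 480 items present, 500 items expected); the function names are hypothetical, and deriving the missing items as 500 - 480 assumes that none of the 480 items is an excess item, which holds here.

```python
from fractions import Fraction


def rate(count, expected):
    """Error rate: affected items relative to items that should be present."""
    return Fraction(count, expected)


def conforms(count, max_allowed=0):
    """Conformance quality level 'Zero items': no affected item is tolerated."""
    return count <= max_allowed


expected_items = 500   # items in the universe of discourse within the scope
present_items = 480    # items in the dataset within the data quality scope
excess_items = 0       # commission
missing_items = expected_items - present_items   # omission

commission_rate = rate(excess_items, expected_items)
omission_rate = rate(missing_items, expected_items)

print(f"commission: {float(commission_rate):.0%}, pass={conforms(excess_items)}")
print(f"omission: {missing_items}:{expected_items}, pass={conforms(missing_items)}")
```

The omission ratio 20:500 corresponds to 4% of the expected items, so the dataset fails a conformance level that tolerates no missing items.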

Page 27:

DATA QUALITY EVALUATION ADDITIONAL RESULTS

The Data Quality elements and subelements have been structured according to the EN ISO 19115 formalisms, enabling their future encoding as metadata according to CEN ISO/TS 19139

The results achieved can also be easily applied to the other data themes, therefore providing a basis for addressing Data Quality issues in the INSPIRE context

Page 28:

DATASETS GENERALISATION

Main objective: to assess issues related to datasets generalisation from the local level to the national/European level

Method: design of an off-line questionnaire to collect the feedback of the NatureSDIplus Data Providers (DPs) about the usability of the PS, BR, HB and SD Data Models and of the NatureSDIplus Metadata Profile when harmonising data and metadata at local level and aiming at generalising them from the local to the national/European level.

Page 29:

DATASETS GENERALISATION QUESTIONNAIRE

In particular, the feedback focused on two main aspects:

whether DPs noticed the need/opportunity to extend/modify the target data models, in order to better take into account local aspects

whether DPs noticed the need/opportunity to extend/modify the source data models, in order to facilitate INSPIRE compliance.

The first aspect is consistent with Annex F (Example for an extension to an INSPIRE application schema) of the INSPIRE Data Specification D2.5 Generic Conceptual Model, according to which the INSPIRE data specifications can be modified at local level, in terms of data model, in order to take into account local aspects.

The feedback collected on the second aspect can support local communities engaged in implementing INSPIRE.

Page 30:

DATASETS GENERALISATION QUESTIONNAIRE MAIN RESULTS

19 questionnaires filled in by 19 different DPs, replies analysed and results processed

Need/opportunity to extend/modify the target data models, in order to better take into account local aspects:
Yes, I noticed the need/opportunity; 7; 37%
No; 12; 63%

Need/opportunity to extend/modify the source data models, in order to facilitate INSPIRE compliance:
Yes, I noticed the need/opportunity; 16; 84%
No; 3; 16%

Page 31:

DATASETS GENERALISATION FEEDBACK (1/2)

Some feedback about the need/opportunity to extend/modify the target data models, in order to better take into account local aspects:

“The information contained in the source PS dataset site Protection Classification is divided to 20 values. In NATURE-SDIplus data model 7 values. It is difficult to select the suitable one”

“We noticed that the Habitats and Biotopes target data Model doesn’t cover the whole information we have describing each habitats. Our information relates several habitats to one geographical feature and the target data model expects only the relation 1 – 1. Other ways, like duplicating geographical information could be taken into account but from our point of view is not the best solution in the future.”

“Metadata: I would leave out Data quality - Thematic and temporal accuracy and Acquisition method. The latter because it can be described already in the lineage part. Data models: BGR: I would leave out the detailed class description parameters such as temperature, rainfall, etc… HB: Also here I would leave out a number of attribute such as elevation, activities and impacts, development Stage, monitoring Assessment.”
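The Habitats and Biotopes comment on this slide describes a source feature related to several habitats, while the target model expects a 1 - 1 relation. The workaround it mentions, duplicating the geographical information, can be sketched as follows. All class names, field names, identifiers and habitat codes are hypothetical and do not come from the NATURE-SDIplus data models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceFeature:
    feature_id: str
    geometry_wkt: str
    habitats: tuple   # several habitat codes per geographical feature


@dataclass(frozen=True)
class TargetFeature:
    feature_id: str
    geometry_wkt: str
    habitat: str      # exactly one habitat per feature (1 - 1)


def flatten(source):
    """Emit one target feature per habitat, repeating the geometry."""
    return [
        TargetFeature(f"{source.feature_id}_{i}", source.geometry_wkt, code)
        for i, code in enumerate(source.habitats, start=1)
    ]


# Hypothetical source feature carrying three habitat codes
src = SourceFeature("F1", "POLYGON((0 0, 1 0, 1 1, 0 0))", ("H-A", "H-B", "H-C"))
targets = flatten(src)
for t in targets:
    print(t)
```

The geometry is stored once per habitat, which is the redundancy the data provider considers a poor long-term solution.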

Page 32:

DATASETS GENERALISATION FEEDBACK (2/2)

Some feedback about the need/opportunity to extend/modify the source data models, in order to facilitate INSPIRE compliance:

“Some source datasets are missing a lot of mandatory information to be INSPIRE compliant. The need to restructure the datasets into a database (and not a collection of flat files) is crucial for some data providers”

“The information contained in the source dataset is not sufficient to populate the corresponding attributes of the target data model. E.g.: The attribute ‘MANAGPL’ of the source dataset contains, for some sites, information about the type of site management, whilst it should contain the URL or citation of a document describing the site management plans. Moreover, for other sites, the attribute contain references to many documents.”

“Our datasets are simple shapefiles with attributes.”
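The ‘MANAGPL’ comment on this slide suggests a triage step before mapping such an attribute to the target model. The following is a minimal sketch, assuming that multiple references are separated by semicolons or line breaks; that separator, the category names and the sample values in the usage note are hypothetical. A citation cannot be reliably told apart from free text automatically, so such values are only flagged for manual review.

```python
import re
from urllib.parse import urlparse


def is_url(value):
    """True for http(s) URLs with a host part."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def triage(value):
    """Classify a raw attribute value: 'empty', 'url', 'multiple' or 'review'."""
    if value is None or not value.strip():
        return "empty"
    # Assumed separators between several document references
    pieces = [p for p in re.split(r"[;\n]+", value) if p.strip()]
    if len(pieces) > 1:
        return "multiple"   # several documents referenced in one field
    if is_url(pieces[0]):
        return "url"
    return "review"         # free text: citation or type of management?
```

For example, `triage("http://example.org/plan.pdf")` returns `"url"`, while `triage("doc A; doc B")` returns `"multiple"`.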