A standard information transfer model scopes the ontologies required for observations

Simon Cox, Laurent Lefort TDWG, Fremantle, 2008-09-30

CSIRO Cox/TDWG 2008

Science relies on observations

• Provides evidence & validation
• Involves sampling

• A domain-independent terminology and information-model
• Supports data discovery and integration across discipline boundaries

• Scopes ontology developments

What is “an Observation”?

• An observation act involves a procedure applied at a specific time
• The result of an observation is an estimate of some property
• The observation domain is a feature of interest at some time

• After Fowler & Odell ca. 1997

Examples

• The 7th banana weighed 270 g on the kitchen scales this morning

• The attitude of the foliation at outcrop 321 of the Leederville Formation was 63/085, measured using a Brunton on 2006-08-08

• Specimen H69 was identified on 1999-01-14 by Amy Bachrach as Eucalyptus Caesia

• The image of Camp Iota was obtained by Aster in 2003

• Sample WMC997t collected at Empire Dam on 1996-03-30 was found to have 5.6 g/T Au as measured by ICPMS at ABC Labs on 1996-05-31

• The X-Z Geobarometer determined that the ore-body was at depth 3.5 km at 1.75 Ga

• The GCM simulation run today using CMIP3 indicated that the pressure field in the atmosphere tomorrow will be as given in pf999_20081020_1

In “pictures”

O7 :Observation (samplingTime="this morning")
  featureOfInterest: 7th Banana :AnyFeature
  procedure: KitchenScales :Process
  observedProperty: weight :PropertyType
  result: 270g :Measure

O-LF :Observation (samplingTime="2006-08-08")
  featureOfInterest: Leederville Formation :AnyFeature
  procedure: Brunton :Process
  observedProperty: foliation orientation :PropertyType
  result: 63/085 :Record

O_SAB :Observation (samplingTime="1999-01-14")
  featureOfInterest: H69 :Specimen
  procedure: Amy Bachrach :Process
  observedProperty: species :PropertyType
  result: Eucalyptus Caesia :AnyDefinition

O_I345 :Observation (samplingTime="2003")
  featureOfInterest: Camp Iota :AnyFeature
  procedure: Aster :Process
  observedProperty: IR Radiance :PropertyType
  result: ASgh67c :Image

O_Assay689 :Observation (samplingTime="1996-03-30", resultTime="1996-04-16")
  featureOfInterest: WMC997t :Specimen
  sampledFeature: Empire Dam :GeologicUnit
  location: ABC Labs :AnyFeature
  procedure: ICPMS :Process
  observedProperty: gold concentration :PropertyType
  result: 5.6 ppm :Measure

O_Depth :Observation (samplingTime="-1.75Ga")
  featureOfInterest: BF Ore Body :GeologicUnit
  procedure: X-Z geobarometer :Process
  observedProperty: depth :PropertyType
  result: 3.5km :Measure

Run999_20081020_1 :Observation (samplingTime="tomorrow", resultTime="today")
  featureOfInterest: Grid999 :SamplingSolid
  sampledFeature: EarthAtmosphere
  procedure: CMIP3 :Process
  observedProperty: pressure :PropertyType
  result: pf999_20081020_1 :CV_DiscreteGridPointCoverage

Generic pattern for observation data

An Observation is an action whose result is an estimate of the value of some property of the feature-of-interest, obtained using a specified procedure

Where’s the “observation location”?

In the feature-of-interest - this reconciles remote, lab, and in-situ observations

OM_Observation
  + samplingTime
  + resultTime [0..1]
  + procedureOperator [0..1]
  + parameter [0..*]
  + resultQuality [0..1]

Associations from OM_Observation:
  result → Any {n}
  observedProperty [1] → OF_PropertyType
  featureOfInterest [1] → OF_AnyFeature (inverse role: propertyValueProvider [0..*])
  procedure [1] → OM_Process (inverse role: generatedObservation [0..*])

(Diagram “Overview: Observation”, «Application Schema» OM, Observation Core, version 1.0, author Simon Cox)

Conformant with ISO 19100 CSL and meta-model
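
The generic pattern can be sketched in a few lines of Python. This is only an illustration of the shape of the model, not the normative schema: the class name, the snake_case field names, and the use of plain strings for features, properties, and procedures are all assumptions made for brevity.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Observation:
    """Plain-Python stand-in for OM_Observation (illustrative only)."""
    feature_of_interest: str                   # featureOfInterest [1]
    observed_property: str                     # observedProperty [1]
    procedure: str                             # procedure [1]
    result: Any                                # result: any type
    sampling_time: str                         # samplingTime
    result_time: Optional[str] = None          # resultTime [0..1]
    parameters: dict = field(default_factory=dict)  # parameter [0..*]


def describe(obs: Observation) -> str:
    """Render the generic pattern as one sentence."""
    return (f"{obs.observed_property} of {obs.feature_of_interest} = {obs.result} "
            f"(procedure: {obs.procedure}, sampled: {obs.sampling_time})")


# The banana example from the earlier slide.
banana = Observation(
    feature_of_interest="7th Banana",
    observed_property="weight",
    procedure="KitchenScales",
    result=(270, "g"),
    sampling_time="this morning",
)
print(describe(banana))
```

Note that there is no separate "location" field: as the slide says, location belongs to the feature of interest.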

O&M vs. OBOE

An Observation is an action whose result is an estimate of the value of some property of the feature-of-interest, obtained using a specified procedure

NCEAS OBOE: An Observation is the Measurement of the Value of a Characteristic of some Entity in a particular Context

TDWG examples

O2008_786s :Observation
  featureOfInterest: OOc2008_786 :OrganismOccurence
  procedure: FlipDibner :Process
  observedProperty: Taxon :PropertyType
  result: E. Caesia :ScopedName

O2008_786t :Observation
  featureOfInterest: OOc2008_786 :OrganismOccurence
  procedure: FlipDibnersWatch :Process
  observedProperty: Time :PropertyType
  result: 2008-10-18T18:17:00.00+08:00 :DateTime

O2008_786x :Observation
  featureOfInterest: OOc2008_786 :OrganismOccurence
  procedure: FlipDibnersGPS :Process
  observedProperty: Location :PropertyType
  result: -31.5 115.5 :DirectPosition

Survey_2008_996 :Observation (samplingTime="2008-10-18")
  featureOfInterest: Plot_996 :SamplingSurface
  sampledFeature: Ecosystem_abt417h
  procedure: FlipDibner :Process
  observedProperty: OrganismCount :PropertyType
  taxon: E. Caesia :ScopedName
  result: 26 :Integer

Sampling strategies and relationships

SamplingFeature
  + samplingTime [0..1]
  + parameter [0..*]

Associations from SamplingFeature:
  sampledFeature → AnyFeature (the domain feature type; role annotated “Intention”)
  relatedObservation [0..*] → Observation
  relatedSamplingFeature [0..*] → SamplingFeature (annotated “Complex”, 0..*)

Specializations of SamplingFeature:
  SamplingPoint
  Specimen
    + materialClass
    + samplingMethod [0..1]
    + samplingLocation [0..1]
    + size [0..1]
    + currentLocation [0..1]
  SpatiallyExtensiveSamplingFeature
    SamplingCurve
      + length [0..1]
    SamplingSurface
      + area [0..1]
    SamplingSolid
      + volume [0..1]

Example domain specializations shown: Station, Section, MapHorizon, Plot, Mine, Traverse, Borehole
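
A minimal Python sketch of this hierarchy follows, assuming plain strings stand in for features and observations. The field names are snake_case renderings of the UML attributes; the defaults and the material class value "rock" are illustrative assumptions, not part of the model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SamplingFeature:
    name: str
    sampled_feature: str    # sampledFeature: the domain feature being sampled
    related_observations: List[str] = field(default_factory=list)  # relatedObservation [0..*]


@dataclass
class Specimen(SamplingFeature):
    # materialClass is mandatory in the UML; a default is used here only
    # because dataclass inheritance requires one after defaulted fields.
    material_class: str = "unknown"
    current_location: Optional[str] = None   # currentLocation [0..1]


@dataclass
class SamplingSurface(SamplingFeature):
    area: Optional[float] = None             # area [0..1]


# Names taken from the example slides; "rock" is a hypothetical value.
plot = SamplingSurface("Plot_996", sampled_feature="Ecosystem_abt417h")
plot.related_observations.append("Survey_2008_996")
specimen = Specimen("WMC997t", sampled_feature="Empire Dam", material_class="rock")

for sf in (plot, specimen):
    print(type(sf).__name__, sf.name, "samples", sf.sampled_feature)
```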


Specimens and “outcrops”


Ontologies for observations

Obrst 2006 - Ontology Spectrum: One View

[Figure: the ontology spectrum, running from weak semantics to strong semantics, i.e. from less to more expressive]

Levels, from weakest to strongest semantics:
• Taxonomy (“is sub-classification of”)
• Thesaurus (“has narrower meaning than”)
• Conceptual Model (“is subclass of”)
• Logical Theory (“is disjoint subclass of, with transitivity property”)

Technologies placed along the spectrum: Relational Model, XML; ER; Extended ER; DB Schemas, XML Schema; RDF/S; XTM; UML; Description Logic; DAML+OIL, OWL; First Order Logic; Modal Logic

Interoperability levels: Syntactic Interoperability, Structural Interoperability, Semantic Interoperability

Annotations (Problem / Semantic Expressivity): Local / Low; General / Medium; General / High; Very General / Very High

What’s this got to do with Ontologies?

• UML is a formal language
• UML vs. OWL … similarly expressive
  • Especially if UML profile and «stereotype» used
• ISO 19103 profile models may be transformed into OWL without too much difficulty
  • ISO 19150 will define a UML→OWL rule
• … but converting the O&M model just gives you an OWL representation of the schema
  • Is this useful for reasoning?
• O&M model scopes the (discipline specific) ontologies required for observational data
  • and describes the relationships between them
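
To make the point about schema conversion concrete, here is a rough Python sketch that turns a hand-written fragment of the O&M class model into OWL (Turtle syntax). It does not implement the ISO 19150 rules; the `om:` namespace URI, the `UML_CLASSES` dictionary, and the mapping of each association role to an `owl:ObjectProperty` are assumptions chosen for illustration.

```python
# Hypothetical, hand-written description of part of the O&M UML model:
# class name -> {association role: target class}.
UML_CLASSES = {
    "OM_Observation": {
        "featureOfInterest": "OF_AnyFeature",
        "observedProperty": "OF_PropertyType",
        "procedure": "OM_Process",
    },
    "OF_AnyFeature": {},
    "OF_PropertyType": {},
    "OM_Process": {},
}

PREFIXES = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix om: <http://example.org/om#> .\n"   # placeholder namespace
)


def uml_to_owl_turtle(classes: dict) -> str:
    """Emit one owl:Class per UML class and one owl:ObjectProperty per role."""
    lines = [PREFIXES]
    for cls, associations in classes.items():
        lines.append(f"om:{cls} a owl:Class .")
        for role, target in associations.items():
            lines.append(
                f"om:{role} a owl:ObjectProperty ; "
                f"rdfs:domain om:{cls} ; rdfs:range om:{target} ."
            )
    return "\n".join(lines)


turtle = uml_to_owl_turtle(UML_CLASSES)
print(turtle)
```

The output contains only the skeleton (classes and roles). No taxa, units, or procedures appear in it, which is why the discipline-specific vocabularies still have to be supplied separately.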

Discipline or community profile

• feature of interest
  • Types define a domain-model (e.g. Plot, Ecosystem, OrganismOccurence)

• observed property
  • Belongs to the type of the feature-of-interest (e.g. organism count, taxon, time, location)

• procedure
  • Standard procedures, suitable for the property-type

• result
  • Standard scales suitable for the property-type (e.g. taxonomy)

Ontology enabled profiles

• Step one: align ontology and O&M skeleton

• Step two: round trip transformation
  • Transform UML model into OWL (done)
  • Use OWL to develop vocabularies on top of O&M skeleton
  • Use extended UML-based MDA process to generate XML schemas

• Motivations
  • Better quality vocabularies
  • Greater consistency of the conceptual model

Vocabulary dependencies in O&M

[Figure: dependency diagram of the vocabularies needed around the O&M model, organised by the questions WHAT, IN WHAT, WHERE, WHEN, WHO, HOW and VALUE]

O&M concepts shown: Observation, Sampling Feature, Sampled Feature, Observed property, Procedure, Result, Metadata, Time*

Vocabularies shown: Generic type, Geometrical types, Temporal types, Geometry, Coord. Sys, Vertical Coord. Sys, Units, Quantities, Taxa, Chemistry, Medium, Fraction, Procedures, Processing chain type, Processing & interpolation, Validation & quality flag, Sensor (Instrument), Station, Platform, Site, Water Feature, Result type, Institution and project, System and author, Field & Lab methods, Security classif., Transaction type, Gauge/weir layout/profile, Missing data, Observation codes, Feature codes, Feature property, Feature-dependent parameters, Feature-independent parameters, Parameters, Survey type, Process, Action, Event, Action codes, Event codes

Legend: Multi-dependent concepts; Feature-dep. parameters; Feature-indep. parameters; Abstract concepts; Semi-abstract concepts; Semi-primitive concepts; Primitive concepts; O&M and GFM stereotypes; Simple classes; Classes w/ ident. instances; Onto category to be defined; Feature types; Vocabularies; UML defs

Time*: two O&M stereotypes (sampling time and result time)

Wrap-up

Development and validation of “O&M”

• Developed in the context of
  • Geochemistry/Assay data
  • OGC Sensor Web Enablement – environmental and remote sensing

• Subsequently applied in
  • Water resources/water quality
  • Oceans & Atmospheres
  • Natural resources
  • Taxonomic data
  • Geology field data

Scopes the ontologies for domain observations

• Feature types (feature of interest, sampling features)
• Observed properties
• Observation procedures, instruments, algorithms
• Scales, taxonomies

O&M Status

• OGC Standard 2007
• ISO 19156 – upcoming

• Key aspect of GeoSciML
• Basis for WaterML v2
• Basis for Climate Science ML

Motivation for developing a common model

• Cross-domain data discovery and fusion
• Re-usable service interfaces

Thank you

Exploration & Mining
Simon Cox, Research Scientist

Phone: +61 8 6436 8639
Email: [email protected]
Web: www.csiro.au/em

Contact Us
Phone: 1300 363 400 or +61 3 9545 2176
Email: [email protected]
Web: www.csiro.au

Generations of “standards” & integration complexity

[Figure: surface water & groundwater “standards” placed by generation and by the integration support each requires, with standard users and standard developers marked on the axis]

Generations and integration approaches shown: ASCII-based; DB-based; XML; Registries; Custom XSL transformations & web services; Distributed systems with same db schema; Model-driven generation of XML schemas; UML & XML schemas; XML schemas; Reusable XML schema stack; Master Data Management; OWL ontologies; Semantic integration

Standards shown: EPA STORET, EPA WQX, ODM, WaterML (CUAHSI), GWML, WDTF, WFD schemas, eWater (EU), SANDRE, SANDRE XML

Observations

Our Science is changing: scale

From small scale siloed studies to integration on a global scale

Atom, Molecule, Mineral, Rock, Outcrop, Section, Mountain, Continent, Planet

Source: Office of Integrative Activities NSF

Our Science is changing: interdisciplinary

Source: US Global Change Research Program

Ontological value of the Observations & Measurements standard

• Two user-managed class hierarchies in GFM-based specs:
  • Feature and FeaturesCollection: a Feature-type is characterized by a specific set of properties

• Up to five user-managed class hierarchies in O&M-based specs
  • Observation, SamplingFeature, PropertyType, Procedure and Result
  • An Observation is an Event whose result is an estimate of the value of some Property of the Feature-of-interest, obtained using a specified Procedure

• Stronger ontological value for O&M
  • More branches and separation of concern
  • Example: difference between Feature and SamplingFeature
    • Feature for the real world objects, e.g. an aquifer
    • SamplingFeature to characterise how a measure is done, e.g. along a borehole

Normalised ontology skeleton for water observation vocabularies

Define the right branches at the top

Isolate unambiguous primitives (e.g. units)

Use modules/namespace/URIs to position source-specific definitions against common ones

Observation data interface

• OGC “Sensor Observation Service”
  • http interface to sensor observations
  • c.f. WFS, WCS, WMS

• Request parameters scoped by O&M model
  • featureOfInterest
  • observedProperty
  • procedure

• Response is XML-encoded O&M
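
As a rough illustration of how the O&M roles become request parameters, the Python sketch below builds a key-value GetObservation URL. The endpoint is a placeholder, and the key-value encoding and exact parameter spellings are assumptions; a real service's capabilities document and the SOS version in use determine what is actually accepted.

```python
from urllib.parse import urlencode


def get_observation_url(endpoint, feature_of_interest, observed_property, procedure):
    """Build a GetObservation request whose filters mirror the O&M roles."""
    params = {
        "service": "SOS",
        "request": "GetObservation",
        "featureOfInterest": feature_of_interest,
        "observedProperty": observed_property,
        "procedure": procedure,
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint; identifiers reuse names from the TDWG examples.
url = get_observation_url(
    "http://example.org/sos", "Plot_996", "OrganismCount", "FlipDibner"
)
print(url)
```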

Sampling features

Proximate vs ultimate feature-of-interest

Ultimate (“project”) thing of interest often not directly or fully accessible

1. Proximate feature of interest embodies a sample design
  • Rock-specimen samples an ore-body or geologic unit
  • Well samples an aquifer
  • Profile samples an ocean/atmosphere column
  • Cross-section samples a rock-unit

2. Sensed property is a proxy
  • e.g. want land-cover, but observe colour

Some sampling designs are common across disciplines
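
The step from a proximate to an ultimate feature of interest amounts to following sampledFeature links. A small Python sketch, assuming the links are held in a plain dictionary (the dictionary and function names are illustrative; the feature names come from the examples in this deck):

```python
# sampledFeature links: proximate (sampling) feature -> feature it samples.
SAMPLED_FEATURE = {
    "WMC997t": "Empire Dam",          # specimen -> geologic unit
    "Grid999": "EarthAtmosphere",     # sampling solid -> atmosphere
    "Plot_996": "Ecosystem_abt417h",  # sampling surface -> ecosystem
}


def ultimate_feature(feature: str, links: dict) -> str:
    """Follow sampledFeature links until a feature that samples nothing else."""
    seen = set()
    while feature in links and feature not in seen:
        seen.add(feature)          # guard against accidental cycles
        feature = links[feature]
    return feature


print(ultimate_feature("WMC997t", SAMPLED_FEATURE))
print(ultimate_feature("7th Banana", SAMPLED_FEATURE))
```

A feature with no sampledFeature link, such as the banana, is its own ultimate feature of interest.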

Examples

Water quality of aquifers observed in wells

[Figure: UML object diagram “Well&Intervals” (Examples, version 1.0, author Simon Cox)]

Objects shown:
• EvertsWell, RobsWell, LeedervilleFormation, CottesloeWedge, GnangaraMound
• IntervalEW1 :SamplingCurve, IntervalEW2 :SamplingCurve
• EW/EW1 :SamplingFeatureRelation (role="intervalHost"), EW/EW2 :SamplingFeatureRelation (role="intervalHost")
• WQ2/1 :Observation, WQ2/2 :Observation, WQ3/1 :Observation (listed twice)
• WQ3/1/r :CV_DiscreteTimeInstantCoverage
• WaterQuality :PropertyType
• Fooglemeter 2000 :ObservationProcess, Farklemeter XP :ObservationProcess
• NWC2007/WQ :ObservationCollection

Association roles shown: sampledFeature, relatedSamplingFeature, target, featureOfInterest, relatedObservation, procedure, observedProperty, result, member

Water quality measured along a ferry track

[Figure: UML object diagram “FerrySampling-P” (Examples, version 1.0, author Simon Cox), with one observation per station]

Objects shown:
• FerryTrack 20071113 :SamplingCurve (shape=Curve987:GM_LineString)
• Albemarle-Pamlico Sound :AnyFeature
• Station1, Station2, Station3 :SamplingPoint (position=Point1, Point2, Point3 :GM_Point)
• FT/S1, FT/S2, FT/S3 :SamplingFeatureRelation (role="host track"), with the note “These stations must lie on this track”
• WQ200711113-1, WQ200711113-2, WQ200711113-3 :Observation
• WQ20071113/r1, WQ20071113/r2, WQ20071113/r3 :Record
• WQ/C-9 (Water Quality) :PropertyType
• Fooglemeter 2000 :ObservationProcess

Association roles shown: sampledFeature, relatedSamplingFeature, target, featureOfInterest, relatedObservation, procedure, observedProperty, result

[Figure: UML object diagram “FerrySampling-C” (Examples, version 1.0, author Simon Cox), with a single coverage-valued observation]

Objects shown:
• FerryTrack 20071113 :SamplingCurve (shape=Curve987:GM_LineString)
• Albemarle-Pamlico Sound :AnyFeature
• WQ200711113 :Observation
• WQ20071113/r :CV_DiscretePointCoverage
• WQ/C-9 (Water Quality) :PropertyType
• Fooglemeter 2000 :ObservationProcess

Association roles shown: sampledFeature, relatedObservation, featureOfInterest, procedure, observedProperty, result

Patterns?

• Much of the interest concerns
  • relations between sampling features,
  • associations with the domain (sampled) features

• i.e. sampling regimes are core

Governance

OGC Sensor Web Enablement

• OGC Web Services testbeds
  • OWS-1 2001 – OWS-5 2007

• Core elements of OGC SWE suite
  • SensorML – provider-centric information viewpoint
  • O&M – consumer-centric information viewpoint
  • SOS, SAS – http interface to observations
  • SPS – tasking interface
  • sweCommon – data-types & encodings, including coverage encoding
  • TML – low-level sensor streams

Accessing data using the “Observation” viewpoint

[Figure: SOS as the entry point, alongside the corresponding requests on other services]

• SOS: getObservation, getResult, describeSensor, getFeatureOfInterest
• WFS/Obs: getFeature, type=Observation
• WCS: getCoverage, getCoverage(result)
• Sensor Register: getRecordById
• WFS: getFeature

e.g. SOS::getResult == “convenience” interface for WCS

Accessing data using the “Sampling Feature Service” viewpoint

[Figure: WFS/SFS as the entry point to a common data source, alongside the corresponding requests on other services]

Services shown: WFS/SFS, WFS (getFeature), WCS (getCoverage), SOS (getObservation), Sensor Register, common data source

Requests shown: getFeature(sampling Feature), getFeature(coverage property value), getFeature(relatedObservation), getFeature(featureOfInterest), getCoverage(property value), getCoverage(result), getObservation(relatedObs), getResult(property value), getRecordById (procedure)

Accessing data using the “Domain Feature” viewpoint

[Figure: WFS as the entry point, alongside the corresponding requests on other services]

• WFS: getFeature
• WCS: getCoverage(property value)
• SOS: getResult(property value)

The “George Percivall preferred™” viewpoint #1 – observations are property-value-providers for features

Accessing data using the “just the data” viewpoint

[Figure: WCS as the entry point, alongside the corresponding requests on other services]

• WCS: getCoverage
• WFS: getFeature/geometry (domain extent)
• SOS: getResult (lots of ’em) (range values)

The “George Percivall preferred™” viewpoint #2 – observations are range-value-providers for coverages

Application to other science disciplines

• Need information transfer standards for
  • Geochemistry
  • Geochronology
  • Geophysics
  • Geodesy
  • Seismology
  • Hydrogeology
  • Marine
  • Ecology
  • Biogeology

• But need to coordinate these standards (including ontologies) to avoid uncontrolled growth of YAML (Yet Another Markup Language)

http://www.datastrategyjournal.com/index.php?option=com_content&task=view&id=18&Itemid=1

Procedure vs. observedProperty

• observedProperty supports discovery by observation users
  • “show me all the observations of temperature and wind-speed”

• procedure provides strict definition
  • “how was that value obtained?”

• …or provider-centric discovery
  • “show me all the data collected by instrument X”
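
The two discovery styles can be sketched as two filters over the same records. This Python fragment assumes observations are held as plain dictionaries; the record layout and function names are illustrative, and the identifiers reuse the TDWG examples from earlier in the deck.

```python
observations = [
    {"id": "O2008_786s", "observedProperty": "Taxon", "procedure": "FlipDibner"},
    {"id": "O2008_786t", "observedProperty": "Time", "procedure": "FlipDibnersWatch"},
    {"id": "O2008_786x", "observedProperty": "Location", "procedure": "FlipDibnersGPS"},
    {"id": "Survey_2008_996", "observedProperty": "OrganismCount", "procedure": "FlipDibner"},
]


def by_property(obs, *properties):
    """User-centric discovery: all observations of the given properties."""
    wanted = set(properties)
    return [o["id"] for o in obs if o["observedProperty"] in wanted]


def by_procedure(obs, procedure):
    """Provider-centric discovery: everything obtained with one procedure."""
    return [o["id"] for o in obs if o["procedure"] == procedure]


print(by_property(observations, "Taxon", "Location"))
print(by_procedure(observations, "FlipDibner"))
```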

Some properties vary within a feature

• colour of a Scene or Swath varies with position
• shape of a Glacier varies with time
• flow at a Station varies with time
• rock density varies along a Borehole

• Variable values may be described as a Function on some axis of the feature

• Corresponding Observation/result is a Function
  • If domain is spatio-temporal, also known as coverage or map
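
A function-valued result can be sketched in Python as a set of (position, value) samples plus an evaluation rule. The class below is an illustration, not the ISO 19123 coverage model: the step-wise lookup between samples is an assumed interpolation choice, and the depths and densities are made-up numbers.

```python
from bisect import bisect_right


class DiscreteCoverage:
    """Maps positions along one axis of a feature to values (illustrative)."""

    def __init__(self, samples):
        # samples: iterable of (position, value) pairs
        self._samples = sorted(samples)
        self._positions = [p for p, _ in self._samples]

    def evaluate(self, position):
        """Return the value of the nearest sample at or before `position`."""
        i = bisect_right(self._positions, position)
        if i == 0:
            raise ValueError("position precedes the first sample")
        return self._samples[i - 1][1]


# Hypothetical rock densities (g/cm3) at depths (m) along a borehole.
density = DiscreteCoverage([(0.0, 2.1), (50.0, 2.4), (120.0, 2.7)])
print(density.evaluate(75.0))
```

The observation's result is then the whole `density` object rather than a single number, which matches the borehole example in the list above.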

Variable property, coverage-valued result

[Figure: UML class diagram of «FeatureType» Observation specializations whose result is a coverage]

• «FeatureType» Observation
• «FeatureType» DiscreteCoverageObservation, result: CV_DiscreteCoverage {n}
• «FeatureType» PointCoverageObservation, result: CV_DiscretePointCoverage {n}
• «FeatureType» TimeSeriesObservation, result: CV_DiscreteTimeInstantCoverage
• «FeatureType» ElementCoverageObservation, result: CV_DiscreteElementCoverage
• CV_Coverage (the general coverage type)