214
Publishing and Interlinking Linked Geospatial Data Tutorial in Conjunction with the 12th Extended Semantic Web Conference http:// event.cwi.nl/eswc2015-geo/

ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Embed Size (px)

Citation preview

Page 1: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Publishing and Interlinking Linked Geospatial Data

Tutorial in Conjunction with the

12th Extended Semantic Web Conference

http://event.cwi.nl/eswc2015-geo/

Page 2: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Tutorial organization

9:00-9:15 Introduction9:15-10:30 Background in geospatial data modeling, representinggeospatial information in the Semantic Web, and querying linked geospatial data.10:30-11:00 coffee break11:00-12:00 Publishing geospatial information as RDF graphs12:00-12:30 Discovering Spatial and Temporal Links among RDF graphs12:30-14:00 Lunch break14:00-14:30 Discovering Spatial and Temporal Links among RDF graphs14:30-15:30 Hands-on session: Publishing geospatial information as RDF graphs15:30-16:00 coffee break16:00-17:00 Hands-on session: Discovering Spatial and Temporal Links among RDF graphs17:00-17:10 Conclusions

http://event.cwi.nl/eswc2015-geo/

Page 3: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Part 1:

Background in geospatial data

modeling

ESWC 2015 Tutorial

Publishing and Interlinking Linked Geospatial Data

Dept. of Informatics and TelecommunicationsNational and Kapodistrian University of Athens

Page 4: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 2

Outline

• Basic GIS concepts and terminology

• Representing geometries

• Representing topological information

• Geospatial data standards

Page 5: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 3

Basic GIS Concepts and Terminology

• Theme: the information corresponding to a particular domain

that we want to model. A theme is a set of geographic

features.

• Example: the countries of Europe

Page 6: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 4

Basic GIS Concepts (cont’d)

• Geographic feature or geographic object: a domain entity

that can have various attributes that describe spatial and non-

spatial characteristics.

• Example: the country Greece with attributes

• Population

• Flag

• Capital

• Geographical area

• Coastline

• Bordering countries

Page 7: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 5

Basic GIS Concepts (cont’d)

• Geographic features can be atomic or complex.

• Example: According to the Kallikratis administrative reform of

2010, Greece consists of:

• 13 regions (e.g., Crete)

• Each region consists of regional units (e.g., Heraklion)

• Each regional unit consists of municipalities (e.g.,

Dimos Chersonisou)

• …

Page 8: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 6

Basic GIS Concepts (cont’d)

• The spatial characteristics of a feature can involve:

• Geometric information (location in the underlying

geographic space, shape etc.)

• Topological information (containment, adjacency etc.).

Municipalities of the regional unit of Heraklion:1. Dimos Irakliou2. Dimos Archanon-Asterousion3. Dimos Viannou4. Dimos Gortynas5. Dimos Maleviziou6. Dimos Minoa Pediadas7. Dimos Festou8. Dimos Chersonisou

Page 9: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 7

Geometric Information

• Geometric information can be captured by using geometric primitives

(points, lines, polygons, etc.) to approximate the spatial attributes of

the real world feature that we want to model.

• Geometries are associated with a coordinate reference system which

describes the coordinate space in which the geometry is defined.

Page 10: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 8

Encoding Geometries: Vector Representation

• In this encoding objects in space are represented using points as

primitives as follows:

• A point is represented by a tuple of coordinates.

• A line segment is represented by a pair with its beginning

and ending point.

• More complex objects such as arbitrary lines, curves,

surfaces etc. are built recursively by the basic primitives

using constructs such as lists, sets etc.

• This is the approach used in all GIS and other popular

systems today. It has also been standardized by various

international bodies.

Page 11: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 9

Example

[(1,2) (2,2) (5,3) (3,1) (2,1) (1,2)]

Page 12: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 10

Encoding Geometries: Constraint Representation

• In this case objects in space are represented by quantifier free

formulas in a constraint language (e.g., linear constraints).

)3

4

353()124()223(

xyxyyxxyyxxy

Page 13: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 11

Constraint Databases

• The constraint representation of spatial data was the focus of

much work in databases, logic programming and AI after the

paper by Kanellakis, Kuper and Revesz (PODS, 1991).

• The approach was very fruitful theoretically but was not adopted

in practice.

Page 14: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 12

Topological Information

• Topological information is inherently qualitative and it is

expressed in terms of topological relations (e.g., containment,

adjacency, overlap etc.).

• Topological information can be derived from geometric

information or it might be captured by asserting explicitly the

topological relations between features.

Page 15: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 13

Topological Relations

• The study of topological relations has produced

a lot of interesting results by researchers in:

• GIS

• Spatial databases

• Artificial Intelligence (qualitative reasoning

and knowledge representation)

Page 16: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 14

DE-9IM

• The dimensionally extended 9-intersection model

(DE-9IM) of Clementini and Felice.

• It is based on the point-set topology of R2.

• It deals with simple, closed and connected

geometries (areas, lines, points).

• It is an extension of earlier approaches: the 4-

intersection (4IM) and 9-intersection (9IM)

models by Egenhofer and colleagues.

Page 17: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 15

Topological Relations in DE-9IM

• It captures topological relationships between two

geometries a and b in R2 by considering the

dimensions of the intersections of the

boundaries, interiors and exteriors of the two

geometries:

• The dimension can be 2, 1, 0 and -1 (dimension of

the empty set).

Page 18: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 16

Example

I(C) B(C) E(C)

I(A) -1 -1 2

B(A) -1 -1 1

E(A) 2 1 2

A

C

Page 19: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 17

Topological Relations in DE-9IM

• The following five named relationships between two different

geometries can be distinguished: disjoint, touches, crosses,

within and overlaps.

• The named relationships have a reasonably intuitive meaning

for users. They are jointly exclusive and pairwise disjoint

(JEPD).

• The model can also be defined using an appropriate calculus of

geometries that uses these 5 binary relations and boundary

operators.

Page 20: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 18

Example: A disjoint C

I(C) B(C) E(C)

I(A) F F *

B(A) F F *

E(A) * * *

A

C

Notation: • T = { 0, 1, 2 }• F = -1 • * = don’t care = { -1, 0, 1, 2 }

Page 21: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 19

Example: A within C

I(C) B(C) E(C)

I(A) T * F

B(A) * * F

E(A) * * *

C

A

Notation equivalent to 3x3 matrix:

• String of 9 characters representing the above matrix in row major order.

• In this case: T*F**F***

Page 22: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 20

DE-9IM Relation Definitions

Page 23: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 21

The Region Connection Calculus (RCC)

• The primitives of the calculus are spatial regions. These are

non-empty, regular closed subsets of a topological space.

• The calculus is based on a single binary predicate C that

formalizes the “connectedness” relation.

• C(a,b) is true when the closure of a is connected to the

closure of b i.e., they have at least one point in common.

• It is axiomatized using first order logic.

Page 24: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 22

RCC-8

• This is a set of eight JEPD binary relations that can

be defined in terms of predicate C.

Page 25: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 23

RCC-5

• The RCC-5 subset has also been studied. The

granularity here is coarser. The boundary of a region is

not taken into consideration:

• No distinction among DC and EC, called just DR.

• No distinction among TPP and NTPP, called just

PP.

• RCC-8 and RCC-5 relations can also be defined

using point-set topology, and there are very close

connections to the models of Egenhofer and others.

Page 26: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 24

More Qualitative Spatial Relations

• Orientation/Cardinal directions (left of, right of,

north of, south of, northeast of etc.)

• Distance (close to, far from etc.). This information

can also be quantitative.

Page 27: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 25

Coordinate Systems

• Coordinate: one of n scalar values that determines the position

of a point in an n-dimensional space.

• Coordinate system: a set of mathematical rules for specifying

how coordinates are to be assigned to points.

• Example: the Cartesian coordinate system

Page 28: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 26

Coordinate Reference Systems

• Coordinate reference system: a coordinate system

that is related to an object (e.g., the Earth, a planar

projection of the Earth, a three dimensional

mathematical space such as R3) through a datum

which specifies its origin, scale, and orientation.

• The term spatial reference system is also used.

Page 29: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 27

Geographic Coordinate Reference Systems

• These are 3-dimensional coordinate systems that utilize latitude

(φ), longitude (λ) , and optionally geodetic height (i.e.,

elevation), to capture geographic locations on Earth.

Page 30: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 28

The World Geodetic System

• The World Geodetic System (WGS) is the most well-known

geographic coordinate reference system and its latest revision is

WGS84.

• Applications: cartography, geodesy, navigation (GPS), etc.

Page 31: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 29

Projected Coordinate Reference Systems

• Projected coordinate reference systems: they transform the

3-dimensional approximation of the Earth into a 2-dimensional

surface (distortions!)

• Example: the Universal Transverse Mercator (UTM) system

Page 32: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 30

Coordinate Reference Systems (cont’d)

• There are well-known ways to translate between co-

ordinate reference systems.

• See the list of coordinate reference systems of the

European Petroleum Survey Group: http://www.epsg-

registry.org/

Page 33: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 31

Geospatial Data Standards

• The Open Geospatial Consortium (OGC) and the

International Organization for Standardization (ISO) have

developed many geospatial data standards that are in wide use

today. In this tutorial we will cover:

• Well-Known Text

• Geography Markup Language

• OpenGIS Simple Features Access

Page 34: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 32

Well-Known Text (WKT)

• WKT is an OGC and ISO standard for representing geometries,

coordinate reference systems, and transformations between

coordinate reference systems.

• WKT is specified in OpenGIS Simple Feature Access - Part 1:

Common Architecture standard which is the same as the ISO 19125-1

standard. Download from

http://portal.opengeospatial.org/files/?artifact_id=25355 .

• This standard concentrates on simple features: features with all

spatial attributes described piecewise by a straight line or a

planar interpolation between sets of points.

Page 35: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 33

WKT Class Hierarchy

Page 36: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 34

Example

WKT representation:

GeometryCollection(

Point(5 35),

LineString(3 10,5 25,15 35,20 37,30 40),

Polygon((5 5,28 7,44 14,47 35,40 40,20 30,5 5),

(28 29,14.5 11,26.5 12,37.5 20,28 29))

)

Page 37: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 35

Geography Markup Language (GML)

• GML is an XML-based encoding standard for the

representation of geospatial data.

• GML provides XML schemas for defining a variety of concepts:

geographic features, geometry, coordinate reference

systems, topology, time and units of measurement.

• GML profiles are subsets of GML that target particular

applications.

• Examples: Point Profile, GML Simple Features Profile etc.

Page 38: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 36

GML Simple Features: Class Hierarchy

Page 39: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 37

Example

GML representation:

<gml:Polygon gml:id="p3" srsName="urn:ogc:def:crs:EPSG:6.6:4326”>

<gml:exterior>

<gml:LinearRing>

<gml:coordinates>

5,5 28,7 44,14 47,35 40,40 20,30 5,5

</gml:coordinates>

</gml:LinearRing>

</gml:exterior>

</gml:Polygon>

Page 40: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 38

OpenGIS Simple Features Access

• OGC has also specified a standard for the storage, retrieval,

query and update of sets of simple features using

relational DBMS and SQL.

• This standard is “OpenGIS Simple Feature Access - Part 2: SQL

Option” and it is the same as the ISO 19125-2 standard. Download from

http://portal.opengeospatial.org/files/?artifact_id=25354.

• Related standard: ISO 13249 SQL/MM - Part 3.

Page 41: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 39

OpenGIS Simple Features Access (cont’d)

• The standard covers two implementations options: (i) using only

the SQL predefined data types and (ii) using SQL with

geometry types.

• SQL with geometry types:

• We use the WKT geometry class hierarchy presented earlier

to define new geometric data types for SQL

• We define new SQL functions on those types.

Page 42: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 40

SQL with Geometry Types -Functions

• Functions that request or check properties of a geometry:

• ST_Dimension(A:Geometry):Integer

• ST_GeometryType(A:Geometry):Character Varying

• ST_AsText(A:Geometry): Character Large Object

• ST_AsBinary(A:Geometry): Binary Large Object

• ST_SRID(A:Geometry): Integer

• ST_IsEmpty(A:Geometry): Boolean

• ST_IsSimple(A:Geometry): Boolean

Page 43: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 41

SQL with Geometry Types –Functions (cont’d)

• Functions that test topological relations between two geometries

using the DE-9IM:

• ST_Equals(A:Geometry, B:Geometry):Boolean

• ST_Disjoint(A:Geometry, B:Geometry):Boolean

• ST_Intersects(A:Geometry, B:Geometry):Boolean

• ST_Touches(A:Geometry, B:Geometry):Boolean

• ST_Crosses(A:Geometry, B:Geometry):Boolean

• ST_Within(A:Geometry, B:Geometry):Boolean

• ST_Contains(A:Geometry, B:Geometry):Boolean

• ST_Overlaps(A:Geometry, B:Geometry):Boolean

• ST_Relate(A:Geometry, B:Geometry, Matrix: Char(9)):Boolean

Page 44: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 42

DE-9IM Relation Definitions

• A equals B can also be defined by the pattern TFFFTFFFT.

• A intersects B is the negation of A disjoint B

• A contains B is equivalent to B within A

Page 45: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 43

SQL with Geometry Types –Functions (cont’d)

• Functions for constructing new geometries out of existing

ones:

• ST_Boundary(A:Geometry):Geometry

• ST_Envelope(A:Geometry):Geometry

• ST_Intersection(A:Geometry, B:Geometry):Geometry

• ST_Union(A:Geometry, B:Geometry):Geometry

• ST_Difference(A:Geometry, B:Geometry):Geometry

• ST_SymDifference(A:Geometry, B:Geometry):Geometry

• ST_Buffer(A:Geometry, distance:Double):Geometry

Page 46: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 44

Geospatial Relational DBMS

• The OpenGIS Simple Features Access Standard is today been

used in all relational DBMS with a geospatial extension.

• The abstract data type mechanism of the DBMS allows

the representation of all kinds of geospatial data types

supported by the standard.

• The query language (SQL) offers the functions of the

standard for querying data of these types.

Page 47: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

• The book Geographic Information Systems and Science is a nice introduction to GIS. See: http://eu.wiley.com/WileyCDA/WileyTitle/productCd-EHEP001475.html

• The following papers present the DE-9IM model:

Eliseo Clementini, Paolino Di Felice and Peter van Oosterom.

A Small Set of Formal Topological Relationships Suitable for End-User Interaction. SSD 1993: 277-295

http://link.springer.com/chapter/10.1007%2F3-540-56869-7_16

E. Clementini and P. Felice. A Comparison of Methods for Representing Topological Relationships. Information Sciences 80 (1994), pp. 1-34.

http://www.sciencedirect.com/science/article/pii/106901159400033X The paper

• The paper below surveys a lot of interesting results on the RCC calculus:J. Renz, B. Nebel, Qualitative Spatial Reasoning using Constraint

Calculi, in: M. Aiello, I. Pratt-Hartmann and J. van Benthem (eds.),

Handbook of Spatial Logics, pp. 161–215, 2007, Springer.http://users.cecs.anu.edu.au/~jrenz/papers/renz-nebel-los.pdf

• The two OGC standards mentioned in the slides.

Readings

Page 48: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Part 2:

Spatial and Temporal Data in RDF:

stRDF/stSPARQL and GeoSPARQL

ESWC 2015 Tutorial

Publishing and Interlinking Linked Geospatial Data

Dept. of Informatics and TelecommunicationsNational and Kapodistrian University of Athens

Page 49: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 2

Common Approach

• The two proposals (stRDF/stSPARQL and GeoSPARQL) offer constructs for:o Developing ontologies for spatial

and temporal data.o Encoding spatial and temporal

data that use these ontologies in RDF.

o Extending SPARQL to query spatial and temporal data.

Page 50: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 3

Two Proposals

• stRDF/stSPARQL

• GeoSPARQL

Page 51: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 4

The data model stRDF

An extension of RDF for the representation of geospatial information that changes over time.

Geospatial dimension:

Spatial data types are introduced.

Geospatial information is representing using spatial literals of these datatypes.

OGC standards WKT and GML are used for the serialization of spatial literals.

Temporal dimension (later)

Proposed independently and around the same time as GeoSPARQL (starting with an ESWC 2010 paper by Koubarakis and Kyzirakos).

[ Kyzirakos, Karpathiotakis

& Koubarakis 2012 ]

Page 52: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 5

strdf:geometry rdf:type rdfs:Datatype;

rdfs:subClassOf rdfs:Literal.

strdf:WKT rdf:type rdfs:Datatype;

rdfs:subClassOf strdf:geometry.

strdf:GML rdf:type rdfs:Datatype;

rdfs:subClassOf strdf:geometry.

Spatial Datatypes

Page 53: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 6

Example Ontology: Administrative Geography of Greece

Geometry property

strdf:geometry

strdf:GML strdf:WKT

Page 54: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 7

Example Ontology: Administrative Geography of Greece

strdf:geometry

strdf:GML strdf:WKT

Geometry property

Page 55: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 8

Example Data in stRDF

gag:Olympia

gag:name "Ancient Olympia";

rdf:type gag:MunicipalCommunity .

Spatial data type

gag:Olympia gag:hasGeometry

"POLYGON((21.5 18.5, 23.5 18.5,

23.5 21, 21.5 21, 21.5 18.5));

<http://www.opengis.net/def/crs/EPSG/0/4326>"^^

strdf:WKT .

Spatial literal

Coordinate Reference

System

Geometry Property

Page 56: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

gag:Olympia

rdf:type gag:MunicipalCommunity;

gag:name "Ancient Olympia";

gag:population "184"^^xsd:int;

gag:hasGeometry "POLYGON

(((25.37 35.34,…)))"^^strdf:WKT.

gag:OlympiaMUnit

rdf:type gag:MunicipalityUnit;

gag:name "Municipality Unit of

Ancient Olympia".

gag:OlympiaMunicipality

rdf:type gag:Municipality;

gag:name "Municipality of

Ancient Olympia".

gag:Olympia gag:belongsTo gag:OlympiaMUnit .

gag:OlympiaMUnit gag:belongsTo gag:OlympiaMunicipality.

9

Example (cont’d)

Page 57: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 10

More Examples

Corine Land Use/Land Cover (http://www.eea.europa.eu/publications/COR0-landcover )

Burnt Area Products (project TELEIOS,

http://www.earthobservatory.eu/ )

Page 58: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 11

Corine Land Use/Land Cover

Page 59: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 12

Corine Land Use/Land Cover in stRDF(http://www.linkedopendata.gr )

clc:Area_24015134

rdf:type clc:Area ;

clc:hasCode "312"^^xsd:decimal;

clc:hasID "EU-203497"^^xsd:string;

clc:hasArea_ha "255.5807904"^^xsd:double;

clc:hasGeometry "POLYGON((15.53 62.54,

…))"^^strdf:WKT;

clc:hasLandUse clc:ConiferousForest .

Geometry Property

Page 60: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 13

Burnt Area Products (http://www.earthobservatory.eu/ontologies/noaOntology.owl)

Page 61: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 14

Burnt Area Products

noa:ba_15

rdf:type noa:BurntArea;

noa:isProducedByProcessingChain

"static thresholds"^^xsd:string;

noa:hasAcquisitionTime

"2010-08-24T13:00:00"^^xsd:dateTime;

noa:hasGeometry "MULTIPOLYGON(((

393801.42 4198827.92, ..., 393008 424131)));

<http://www.opengis.net/def/crs/

EPSG/0/2100>"^^strdf:WKT.

Geometry Property

Page 62: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 15

stSPARQL: Geospatial SPARQL 1.1

We define a SPARQL extension function for each function defined in the OpenGIS Simple Features Access standard

Basic functions

Get a property of a geometryxsd:int strdf:dimension(strdf:geometry A)

xsd:string strdf:geometryType(strdf:geometry A)

xsd:int strdf:srid(strdf:geometry A)

Get the desired representation of a geometryxsd:string strdf:asText(strdf:geometry A)

xsd:string strdf:asGML(strdf:geometry A)

Test whether a certain condition holdsxsd:boolean strdf:isEmpty(strdf:geometry A)

xsd:boolean strdf:isSimple(strdf:geometry A)

Page 63: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 16

stSPARQL: Geospatial SPARQL 1.1

Functions for testing topological spatial relationships

OGC Simple Features Access

xsd:boolean strdf:equals(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:disjoint(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:intersects(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:touches(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:crosses(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:within(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:contains(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:overlaps(strdf:geometry A, strdf:geometry B)

xsd:boolean strdf:relate(strdf:geometry A, strdf:geometry B,

xsd:string intersectionPatternMatrix)

Egenhofer

RCC-8

Page 64: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 17

stSPARQL: Geospatial SPARQL 1.1

Spatial analysis functions

Construct new geometric objects from existing geometric objects

strdf:geometry strdf:boundary(strdf:geometry A)

strdf:geometry strdf:envelope(strdf:geometry A)

strdf:geometry strdf:convexHull(strdf:geometry A)

strdf:geometry strdf:intersection(strdf:geometry A, strdf:geometry B)

strdf:geometry strdf:union(strdf:geometry A, strdf:geometry B)

strdf:geometry strdf:difference(strdf:geometry A, strdf:geometry B)

strdf:geometry strdf:symDifference(strdf:geometry A, strdf:geometry B)

strdf:geometry strdf:buffer(strdf:geometry A, xsd:double distance, xsd:anyURI units)

Spatial metric functions

xsd:float strdf:distance(strdf:geometry A, strdf:geometry B, xsd:anyURI units)

xsd:float strdf:area(strdf:geometry A)

Spatial aggregate functions

strdf:geometry strdf:union(set of strdf:geometry A)

strdf:geometry strdf:intersection(set of strdf:geometry A)

strdf:geometry strdf:extent(set of strdf:geometry A)

Page 65: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 18

stSPARQL: Geospatial SPARQL 1.1

Select clause

Construction of new geometries (e.g., strdf:buffer(?geo, 0.1, uom:metre))

Spatial aggregate functions (e.g., strdf:union(?geo))

Metric functions (e.g., strdf:area(?geo))

Filter clause

Functions for testing topological spatial relationships between spatial terms (e.g.,

strdf:contains(?G1, strdf:union(?G2, ?G3)))

Numeric expressions involving spatial metric functions

(e.g., strdf:area(?G1) ≤ 2*strdf:area(?G2)+1)

Boolean combinations

Having clause

Boolean expressions involving spatial aggregate functions and spatial metric

functions or functions testing for topological relationships between spatial terms

(e.g., strdf:area(strdf:union(?geo))>1)

Page 66: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 19

stSPARQL: An example (1/3)

SELECT ?name

WHERE {

?comm rdf:type gag:LocalCommunity;

gag:name ?name;

gag:hasGeometry ?commGeo .

?ba rdf:type noa:BurntArea;

noa:hasGeometry ?baGeo .

FILTER(strdf:overlaps(?commGeo,?baGeo))

}Spatial

Function

Return the names of local communities that have been affected by fires

Page 67: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 20

stSPARQL: An example (2/3)

SELECT ?ba ?baGeom

WHERE {

?r rdf:type clc:Region;

clc:hasGeometry ?rGeom;

clc:hasCorineLandUse ?f.

?f rdfs:subClassOf clc:Forest.

?c rdf:type gag:LocalCommunity;

gag:hasGeometry ?cGeom.

?ba rdf:type noa:BurntArea;

noa:hasGeometry ?baGeom.

FILTER( strdf:intersects(?rGeom,?baGeom) &&

strdf:distance(?baGeom,?cGeom,uom:metre) < 200)}

Spatial Functions

Find all burnt forests near local communities

Page 68: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

Spatial Function

21

SELECT ?burntArea

(strdf:intersection(?baGeom,

strdf:union(?fGeom))

AS ?burntForest)

WHERE {

?burntArea rdf:type noa:BurntArea;

noa:hasGeometry ?baGeom.

?forest rdf:type clc:Region;

clc:hasLandCover clc:ConiferousForest;

clc:hasGeometry ?fGeom.

FILTER(strdf:intersects(?baGeom,?fGeom))

}

GROUP BY ?burntArea ?baGeom

Compute the parts of burnt areas that lie in coniferous forests.

stSPARQL: An example (3/3)

Spatial Aggregate

Page 69: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

Time dimensions in Linked Data

User-defined time: A time value (literal) with no special semantics.

Valid time: The time when a fact (represented by a triple) is true in the modeled reality.

Transaction time: The time when the triple is current in the database.

Page 70: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

The time dimension of stRDF: The valid time of triples

The following extensions are introduced in stRDF:• Timeline: the (discrete) value space of the datatype xsd:dateTime of

XML-Schema

• Two kinds of time primitives are supported: time instants and time periods.• A time instant is an element of the time line.

• A time period is an expression of the form [B, E) or [B, E] or (B, E] or (B, E) where B and E

are time instants called the beginning and ending time of the period.

• The new datatype strdf:period is introduced.

23

rdfs:Literal

strdf:WKT strdf:GML

strdf:periodstrdf:geometry

Page 71: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

The time dimension of stRDF (cont’d)

• Triples are extended to quads.

• A temporal triple (quad) is an expression of the form s p o t.

where s p o. is an RDF triple and t is a time instant or time

period called the valid time of the triple.

• The temporal constants NOW and UC (“until changed”) are

introduced.

24

Page 72: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

An example with valid time

25

Forest

Page 73: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 26

Forest

clc:region1 clc:hasLandCover clc:Forest

"[2006-08-25T11:00:00+02, "UC")"^^strdf:period .

An example with valid time

Page 74: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

An example with valid time

27

Forest

clc:region1 clc:hasLandCover clc:Forest

"[2006-08-25T11:00:00+02, "UC")"^^strdf:period .

Burnt area

Page 75: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 28

Forest Burnt area

noa:ba1 rdf:type noa:BurntArea

"[2007-08-25T11:00:00+02, "UC")"^^strdf:period .

clc:region1 clc:hasLandCover clc:Forest

"[2006-08-25T11:00:00+02, "UC")"^^strdf:period .

An example with valid time

Page 76: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 29

Forest Burnt area

noa:ba1 rdf:type noa:BurntArea

"[2007-08-25T11:00:00+02, "UC"))"^^strdf:period .

clc:region1 clc:hasLandCover clc:Forest

"[2006-08-25T11:00:00+02,2007-08-25T11:00:00+02)"^^strdf:period .

An example with valid time

Page 77: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 30

Forest Burnt area Agricultural area

clc:region1 clc:hasLandCover clc:AgriculturalArea

"[2009-08-25T11:00:00+02, "UC")"^^strdf:period .

noa:ba1 rdf:type noa:BurntArea

"[2007-08-25T11:00:00+02,2009-08-25T11:00:00+02)"^^strdf:period .

clc:region1 clc:hasLandCover clc:Forest

"[2006-08-25T11:00:00+02,2007-08-25T11:00:00+02)"^^strdf:period .

An example with valid time

Page 78: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

The time dimension of stSPARQL

The following extensions are introduced:

• Triple patterns are extended to quad patterns (the last component is a temporal

term: variable or constant)

• Temporal extension functions are introduced:

• Allen's temporal relations (e.g., strdf:after)

• Period constructors (e.g., strdf:period_intersect)

• Temporal aggregates (e.g., strdf:maximalPeriod)

31

Page 79: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

• Find the current land cover of all areas in the dataset

SELECT ?clc

WHERE {

?R rdf:type clc:Region .

?R clc:hasLandCover ?clc ?t1 .

FILTER(strdf:during ("NOW", ?t1))

}

Temporal extension function

Temporal constant

Example Query

32

Quad Pattern

Page 80: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 33

Two Proposals

• stRDF/stSPARQL

• GeoSPARQL

Page 81: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 34

GeoSPARQL

GeoSPARQL is an OGC standard.

Functionalities similar to stRDF/stSPARQL:

Geometries are represented using literals of spatial datatypes.

Literals are serialized using WKT and GML.

The same families of functions are offered for querying geometries.

Functionalities beyond stSPARQL:

High level ontologies inspired from GIS terminology.

Topological relations can now be asserted as well so that reasoning and querying on them is possible.

A query rewriting mechanism.

Functionalities of stSPARQL that are not included in GeoSPARQL:

• Geospatial aggregate functions

• Temporal dimension

Page 82: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

GeoSPARQL Components

Core

Topology VocabularyExtension

- relation family

Geometry Extension- serialization- version

Geometry TopologyExtension

- serialization- version - relation family

Query RewriteExtension

- serialization- version - relation family

RDFS Entailment Extension

- serialization- version - relation family

Parameters

• Serialization

• WKT

• GML

• Relation Family

• Simple Features

• RCC-8

• Egenhofer

Page 83: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 36

GeoSPARQL Core

Defines two top level classes that can be used to organize geospatial data.

Page 84: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 37

GeoSPARQL Geometry Extension

Provides vocabulary for asserting and querying data about the geometric attributes of a feature.

Page 85: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 38

Example Ontology: Greek Administrative Geography

Page 86: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 39

Greek Administrative Geography

Page 87: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 40

Greek Administrative Geography

Page 88: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 41

Example Data

gag:Olympia

rdf:type gag:MunicipalCommunity;

gag:name "Ancient Olympia";

gag:population "184"^^xsd:int;

geo:hasGeometry ex:polygon1.

ex:polygon1

rdf:type geo:Geometry;

geo:asWKT "http://www.opengis.net/def/crs/OGC/1.3/CRS84

POLYGON((21.5 18.5,23.5 18.5,

23.5 21,21.5 21,21.5 18.5))"

^^sf:wktLiteral.

Datatype from Geometry extension

Geometry literal

Property from Geometry extension

Property from Geometry extension

Class from Geometry extension

Page 89: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 42

Non-Topological Query Functions of the Geometry Extension

The following non-topological query functions are also offered:

geof:distance

geof:buffer

geof:convexHull

geof:intersection

geof:union

geof:difference

geof:symDifference

geof:envelope

geof:boundary

Page 90: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 43

GeoSPARQL Topology Vocabulary Extension

The extension is parameterized by the family of topological relations supported.

Topological relations for simple features

The Egenhofer relations e.g., geo:ehMeet

The RCC-8 relations e.g., geo:rcc8ec

Page 91: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

gag:Olympia

rdf:type gag:MunicipalCommunity;

gag:name "Ancient Olympia".

gag:OlympiaMUnit

rdf:type gag:MunicipalityUnit;

gag:name "Municipality Unit of

Ancient Olympia".

gag:OlympiaMunicipality

rdf:type gag:Municipality;

gag:name "Municipality of

Ancient Olympia".

gag:Olympia geo:sfWithin gag:OlympiaMUnit .

gag:OlympiaMUnit geo:sfWithin gag:OlympiaMunicipality.

44

Greek Administrative Geography

Simple Features topological

relation

Page 92: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 45

GeoSPARQL: An example

SELECT ?m

WHERE {

?m rdf:type gag:MunicipalityUnit.

?m geo:sfContains gag:Olympia.

}

Find the municipality unit that contains the community of Ancient Olympia

Simple Featurestopological relation

Answer: ?m = gag:OlympiaMUnit

Page 93: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 46

GeoSPARQL: An example

SELECT ?m

WHERE {

?m rdf:type gag:Municipality.

?m geo:sfContains gag:Olympia.

}

Find the municipality that contains the community of Ancient Olympia

Answer?

Page 94: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 47

Example (cont’d)

The answer to the previous query is

?m = gag:OlympiaMunicipality

GeoSPARQL does not tell you how to compute this answer which needs reasoning about the transitivity of relation geo:sfContains.

Options: • Use rules• Use constraint-based techniques

Page 95: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 48

The Geometry Topology Extension

• Offers vocabulary for querying topological properties of geometry literals.

• Simple Features• geof:relate

• geof:sfEquals

• geof:sfDisjoint

• geof:sfIntersects

• geof:sfTouches

• geof:sfCrosses

• geof:sfWithin

• geof:sfContains

• geof:sfOverlaps

• Egenhofer (e.g., geof:ehDisjoint)• RCC-8 (e.g., geof:rcc8dc)

Page 96: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 49

Example Query

SELECT ?name

WHERE {

?comm rdf:type gag:LocalCommunity;

gag:name ?name;

geo:hasGeometry ?commGeo .

?ba rdf:type noa:BurntArea;

geo:hasGeometry ?baGeo .

FILTER(geof:sfOverlaps(?commGeo,?baGeo))

}Geometry Topology Extension Function

Return the names of local communities that have been affected by fires

Geometry Extension Property

Geometry Extension Property

Page 97: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 50

GeoSPARQL Query Rewrite Extension

Provides a collection of RIF rules that use topological extension functions to establish the existence of topological predicates.

Example: given the RIF rule named geor:sfWithin, the serializations of the geometries of gag:Athens and gag:Greece named AthensWKT and GreeceWKT and the fact that

geof:sfWithin(AthensWKT, GreeceWKT)

returns true from the computation of the two geometries, we can derive the triple

gag:Athens geo:sfWithin gag:Greece

One possible implementation is to re-write a given SPARQL query.

Page 98: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 51

RIF Rule

Forall ?f1 ?f2 ?g1 ?g2 ?g1Serial ?g2Serial

(?f1[geo:sfWithin->?f2] :-

Or(

And (?f1[geo:hasDefaultGeometry->?g1]

?f2[geo:hasDefaultGeometry->?g2]

?g1[ogc:asGeomLiteral->?g1Serial]

?g2[ogc:asGeomLiteral->?g2Serial]

External(geof:sfWithin (?g1Serial,?g2Serial)))

And (?f1[geo:hasDefaultGeometry->?g1]

?g1[ogc:asGeomLiteral->?g1Serial]

?f2[ogc:asGeomLiteral->?g2Serial]

External(geof:sfWithin (?g1Serial,?g2Serial)))

And (?f2[geo:hasDefaultGeometry->?g2]

?f1[ogc:asGeomLiteral->?g1Serial]

?g2[ogc:asGeomLiteral->?g2Serial]

External(geof:sfWithin (?g1Serial,?g2Serial)))

And (?f1[ogc:asGeomLiteral->?g1Serial]

?f2[ogc:asGeomLiteral->?g2Serial]

External(geof:sfWithin (?g1Serial,?g2Serial)))

))

Feature-

Feature

Feature-

Geometry

Geometry-

Feature

Geometry-

Geometry

Page 99: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 52

Example

SELECT ?feature

WHERE {

?feature geo:sfWithin

geonames:OlympiaMunicipality.

}

Find all features that are inside the municipality of Ancient Olympia

Page 100: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 53

Rewritten Query

SELECT ?feature

WHERE { {?feature geo:sfWithin geonames:Olympia }

UNION

{ ?feature geo:hasDefaultGeometry ?featureGeom .

?featureGeom geo:asWKT ?featureSerial .

geonames:Olympia geo:hasDefaultGeometry ?olGeom .

?olGeom geo:asWKT ?olSerial .

FILTER (geof:sfWithin (?featureSerial, ?olSerial)) }

UNION { ?feature geo:hasDefaultGeometry ?featureGeom .

?featureGeom geo:asWKT ?featureSerial .

geonames:Olympia geo:asWKT ?olSerial .

FILTER (geof:sfWithin (?featureSerial, ?olSerial)) }

UNION { ?feature geo:asWKT ?featureSerial .

geonames:Olympia geo:hasDefaultGeometry ?olGeom .

?olGeom geo:asWKT ?olSerial .

FILTER (geof:sfWithin (?featureSerial, ?olSerial)) }

UNION {

?feature geo:asWKT ?featureSerial .

geonames:Olympia geo:asWKT ?olSerial .

FILTER (geof:sfWithin (?featureSerial, ?olSerial)) }

Page 101: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

Specifies the RDFS entailments that follow from the class and property hierarchies defined in the other components e.g., the Geometry Extension.

Systems should use an implementation of RDFS entailment to allow the derivation of new triples from those already in a graph.

54

GeoSPARQL RDFS Entailment Extension

Page 102: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial 55

Example

Given the triples

ex:f1 geo:hasGeometry ex:g1 .

geo:hasGeometry rdfs:domain geo:Feature.

we can infer the following triples:

ex:f1 rdf:type geo:Feature .

ex:f1 rdf:type geo:SpatialObject .

Page 103: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

Readings

56

• Material from the Strabon web site (http://strabon.di.uoa.gr ).

• The following tutorial paper which introduces to the topic of linked geospatial data:M. Koubarakis, M. Karpathiotakis, K. Kyzirakos, C. Nikolaou and M. Sioutis. Data Models and Query Languages for Linked Geospatial Data. Reasoning Web Summer School 2012.http://strabon.di.uoa.gr/files/survey.pdf

• The following paper which introduces stSPARQL and Strabon:K. Kyzirakos, M. Karpathiotakis and M. Koubarakis. Strabon: A Semantic Geospatial DBMS. 11th International Semantic Web Conference (ISWC 2012). November 11-15, 2012. Boston, USA.http://iswc2012.semanticweb.org/sites/default/files/76490289.pdf

• The following paper which introduces the temporal features of stSPARQL and Strabon:

K. Bereta, P. Smeros and M. Koubarakis. Representing and Querying the Valid Time of Triples for Linked Geospatial Data. In the 10th Extended Semantic Web Conference (ESWC 2013). Montpellier, France. May 26-30, 2013.http://www.strabon.di.uoa.gr/files/eswc2013.pdf

• The GeoSPARQL standard found at http://www.opengeospatial.org/standards/geosparql

Page 104: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

ESWC 2015 Tutorial

Readings (cont’d)

57

• The following paper which introduces the RDFi framework:Charalampos Nikolaou and Manolis Koubarakis. Incomplete Information in RDF. In the 7th International Conference on Web Reasoning and Rule Systems (RR 2013). Mannheim, Germany. July 27-29, 2013.http://cgi.di.uoa.gr/~koubarak/publications/rr2013.pdf

• The following paper which introduces the benchmark Geographica:G. Garbis, K. Kyzirakos and M. Koubarakis. Geographica: A Benchmark for Geospatial RDF Stores. In the 12th International Semantic Web Conference (ISWC 2013). Sydney, Australia. October 21-25, 2013.http://cgi.di.uoa.gr/~koubarak/publications/Geographica.pdf

Page 105: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Publishing geospatial information

as RDF graphsKostis Kyzirakos, Dimitrianos Savva

Page 106: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Outline

Mapping relational data to RDF graphs

Mapping non-relational data to RDF graphs

Geospatial Extensions for mapping geospatial data to RDF graphs

Implemented Systems

Demonstration

2

Page 107: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Mapping relational data to RDF graphs

Sitecode Sitename ReleaseDate …

DE0916391 NTP S-H W 2011-01-27

DE1003301 DOGGERB

ANK

2011-01-27

ProtectedArea

?

Natura 2000 is an ecological network

designated under the Birds Directive and

the Habitats Directive which form the

cornerstone of the nature conservation

policy of the European Union.

http://ec.europa.eu/environment/nature/natura2000/index_en.htm

http://www.eea.europa.eu/data-and-maps/data/natura-6

Page 108: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Direct Mapping

W3C Recommendation from 2012http://www.w3.org/TR/rdb-direct-mapping/

Relational tables are mapped to classes defined by an RDF vocabulary.

Attributes of each table are mapped to RDF properties that represent the relation between subject and object resources.

Identifiers, class names, properties and instancesare generated automatically following the labels of the input data.

4

Page 109: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Direct Mapping - Example

Sitecode Sitename ReleaseDate …

DE0916391 NTP S-H W 2011-01-27

DE1003301 DOGGERB

ANK

2011-01-27

ProtectedArea ProtectedArea

xsd:string xsd:date

ReleaseDateSitename

@base <http://foo.example/DB/> .

@prefix rdf: <http://www.w3.org/1999/02-22-rdf-syntax-ns#> .

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<ProtectedArea/Sitecode=DE0916391> rdf:type <ProtectedArea> .

<ProtectedArea/Sitecode=DE0916391> <ProtectedArea#Sitename> "NTP S-H W" .

<ProtectedArea/Sitecode=DE0916391> <ProtectedArea#ReleaseDate>

"2011-01-27"^^xsd:date .

<ProtectedArea/Sitecode=DE1003301> rdf:type <ProtectedArea> .

<ProtectedArea/Sitecode=DE1003301> <ProtectedArea#Sitename> "DOGGERBANK" .

<ProtectedArea/Sitecode=DE1003301> <ProtectedArea#ReleaseDate>

"2011-01-27"^^xsd:date .

Page 110: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

The language R2RML

R2RML is a language for expressing customized mappings from relational databases to RDF graphs

R2RML is a W3C Recommendation from 2012http://www.w3.org/TR/r2rml/

R2RML mappings provide the user with the ability to express the desired transformation of existing relational data into the RDF data model, following a structure and a target vocabulary that is chosen by the user.

6

Page 111: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

The language R2RML (cont’d)

LogicalTable

PredicateObjectMap

GraphMap

TriplesMap SubjectMap

ObjectMap

PredicateMap

RefObjectMap Join

TermMap

Constant

Column

Template

Child

Parent

Page 112: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

The language R2RML (cont’d)

A logical table can be a relational table that is explicitly stored in the

databasean SQL viewan SQL select query

A triples map is a rule that defines how each tuple of the logical table will be mapped to a set of RDF triples. It consists ofa subject map zero or more predicate-object maps.

8

Page 113: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

The language R2RML (cont’d)

A subject map is a rule that defines how to generate the URI that will be the subject of each generated RDF triple.

A predicate-object map consists of predicate maps and object maps.

A predicate map defines the RDF property to be used to relate the subject and the object of the generated triple.

An object map defines how to generate the object of the triple which originates from the current row of the logical table.

9

Page 114: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

The language R2RML (cont’d)

Subject, predicate, object and graph maps are term maps. A term map is a function that generates an RDF term from a logical table. Three types of term maps are defined:constant-valued term mapscolumn-valued term maps template-valued term maps

10

Page 115: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

The language R2RML (cont’d)

A referencing object map allows using the subjects of another triples map as the objects generated by a predicate-object map. Optionally, it has one or more join condition

properties.

11

PredicateObjectMap

RefObjectMap

TriplesMap

JoinConditioncolumn name

column name

source: http://www.w3.org/TR/r2rml/#dfn-predicate-map

rr:child

rr:parent

rr:join

Condition*

rr:parent

TriplesMaprr:object

Map

Page 116: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

The language R2RML – Example

Sitecode Sitename ReleaseDate …

DE0916391 NTP S-H W 2011-01-27

DE1003301 DOGGERB

ANK

2011-01-27

ProtectedArea ProtectedArea

xsd:string

Sitename

@base <http://foo.example/DB/> .

<NaturaMapping>

rr:subjectMap [

rr:template "ProtectedArea/SiteCode={SiteCode}";

rr:class <ProtectedArea> ];

rr:predicateObjectMap [

rr:predicate ProtectedArea:SiteName;

rr:objectMap [ rr:column "SiteName"; ]; ] .

<ProtectedArea/Sitecode=DE0916391> rdf:type <ProtectedArea> .

<ProtectedArea/Sitecode=DE0916391> <ProtectedArea#Sitename> "NTP S-H W" .

<ProtectedArea/Sitecode=DE1003301> rdf:type <ProtectedArea> .

<ProtectedArea/Sitecode=DE1003301> <ProtectedArea#Sitename> "DOGGERBANK" .

Page 117: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

<ogr:FeatureCollection>

<gml:featureMember>

<ogr:waterways fid="waterways.128">

<ogr:osm_id>8108139</ogr:osm_id>

<ogr:name>Lech</ogr:name>

<ogr:type>river</ogr:type>

<ogr:geometryProperty>

<gml:LineString>

<gml:coordinates>

10.9034096,47.7996669

10.9037025,47.8003338 …

</gml:coordinates>

</gml:LineString>

</ogr:geometryProperty>

</ogr:waterways>

</gml:featureMember>

</ogr:FeatureCollection>

Mapping non-relational data to RDF graphs

?

OpenStreetMap is a collaborative project

for publishing free maps of the world. OSM

maintains a community-driven global

editable map that gathers map data in a

crowdsourcing fashion.

http://www.openstreetmap.org/

Page 118: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

RDF Mapping Language (RML)

RML is a recently proposed mapping language that defines how to map heterogeneous sources into RDF.http://semweb.mmlab.be/rml/spec.html

RML is defined as a superset of the W3C-standard R2RML

R2RML RML

Logical Table rr:logicalTable Logical Source rml:logicalSource

Table Name rr:tableName URI rml:source

column rr:column reference rml:reference

SQL Reference Formulation rml:referenceFormulation

per row iteration defined iterator rml:iterator

source: http://semweb.mmlab.be/rml/RML_R2RML.html

Page 119: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

RML Overview

LogicalSource

PredicateObjectMap

GraphMap

TriplesMap SubjectMap

ObjectMap

PredicateMap

RefObjectMap JoinChild

Parent

TermMap

Constant

Reference

Template

Source

Iterator

Reference Formulation

Page 120: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

RML extensions

A logical source refers to the input dataset that will be converted to an RDF graph.

Each logical source has a source property pointing to input data a logical iterator that defines the iteration pattern over

the input data source an optional reference formulation property that defines

the query language that may be used (e.g., SQL2008, XPath, JSONPath)

An RML reference is a term map that refers to a column name (SQL, CSV), an XML element or attribute, or an JSON object.

Page 121: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

<ogr:FeatureCollection>

<gml:featureMember>

<ogr:waterways fid="waterways.128">

<ogr:osm_id>8108139</ogr:osm_id>

<ogr:name>Lech</ogr:name>

<ogr:type>river</ogr:type>

<ogr:geometryProperty>

<gml:LineString>

<gml:coordinates>

10.9034096,47.7996669

10.9037025,47.8003338 …

</gml:coordinates>

</gml:LineString>

</ogr:geometryProperty>

</ogr:waterways>

</gml:featureMember>

</ogr:FeatureCollection>

RML Example <#waterways>

rml:logicalSource [

rml:source "/home/leo/osm.gml";

rml:referenceFormulation ql:XPath;

rml:iterator "/ogr:FeatureCollection

/gml:featureMember

/ogr:waterways";

];

rr:subjectMap [

rr:template

"http://www.example.com/id/{@fid}";

rr:class onto:waterways;

];

rr:predicateObjectMap [

rr:predicate onto:hasOgr-Name;

rr:objectMap [

rr:datatype xsd:string;

rml:reference "ogr:name";

]; ] .

ex_id:waterways.128 rdf:type onto:waterways ;

onto:hasOgr-Name "Lech" ;

onto:hasFid "waterways.128"^^xsd:ID ;

onto:hasOgr-Osm_id "8108139" ;

onto:hasOgr-Type "river" .

Page 122: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Mapping geospatial data to RDF graphs

Geospatial data are available in formats suchas:

• ESRI shape files

• KML documents

• GeoJSON documents

• XML documents

Geospatial data may also be stored in spatially-enabled relational databases.

Page 123: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Extending R2ML with transformation-valued term maps

LogicalTable

PredicateObjectMap

GraphMap

TriplesMap SubjectMap

ObjectMap

PredicateMap

RefObjectMap Join

TermMap

Constant

Column

Template

Child

Parent

Function

ArgumentMap

ArgumentMap

Function

Page 124: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Extending RML with transformation-valued term maps

LogicalSource

PredicateObjectMap

GraphMap

TriplesMap SubjectMap

ObjectMap

PredicateMap

RefObjectMap Join

TermMap

Source

Iterator

Reference Formulation

Constant

Column

Template

Child

Parent

Function

ArgumentMap

ArgumentMap

Function

Page 125: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Transformation-valued term maps

A transformation-valued term maps is a term map that generates an RDF term by applying a SPARQL extension function on one or more term maps.

A transformation-valued term map has exactly one rrx:function property that defines a

SPARQL extension function that performs the desired transformation

one rrx:argumentMap property that has as range an rdf:List of term maps that define the arguments to be passed to the transformation function

Page 126: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Transformation-valued term maps (cont’d)

Page 127: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Extending join conditions

PredicateObjectMap

RefObjectMap

TriplesMap

JoinCondition

column name

column name

rr:child

rrx:

function

rr:join

Condition*

rr:parent

TriplesMaprr:object

Map

rdf:List

IRIrefOr

Function

rr:parent

rrx:

argument

Map

Page 128: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Example

Sitecode Sitename Geometry …

DE0916391 NTP S-H W POLYGON((…))

DE1003301 DOGGERB

ANK

POLYGON((…))

ProtectedArea

ProtectedArea

xsd:string

geo:hasGeometry

<NaturaGeometryMapping>

rr:subjectMap [

rr:template "ProtectedArea/Geometry/SiteCode={SiteCode}";

rr:class geo:Geometry ];

rr:predicateObjectMap [

rr:predicate geo:dimension;

rr:objectMap [

rrx:function strdf:dimension;

rrx:argumentMap ( [rr:column "`Geom`"] ); ]; ] .

<ProtectedArea/Geometry/Sitecode=DE0916391>

rdf:type <ProtectedArea> ;

geo:dimension "2"^xsd:integer .

geo:Geometry

geo:

dimensiongeo:asWKT

geo:wktLiteral

Page 129: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Example

Sitecode Sitename Geom …

DE0916391 NTP S-H W POLYGON((…))

DE1003301 DOGGERB

ANK

POLYGON((…))

ProtectedArea

<NaturaGeometryMapping>

rr:subjectMap [

rr:template

"ProtectedArea/Geometry/SiteCode={SiteCode}";

rr:class geo:Geometry ];

rr:predicateObjectMap [

rr:predicate geo:sfIntersects;

rr:objectMap [

rr:parentTriplesMap <#waterwaysGeom> ;

rr:joinCondition [

rrx:function geof:intersection;

rrx:argumentMap (

[rr:column "`Geom`"] ;

[rml:reference "ogr:geometryProperty“;

rr:parentTriplesMap <#waterwaysGeom>]

); ] ; ]; ] .

natura:DE0916391 geo:sfIntersects osm-id:waterways.128 .

<ogr:FeatureCollection>

<gml:featureMember>

<ogr:waterways fid="waterways.128">

<ogr:osm_id>8108139</ogr:osm_id>

<ogr:geometryProperty>

<gml:LineString>

<gml:coordinates>

10.9034096,47.7996669 …

</gml:coordinates>

</gml:LineString>

</ogr:geometryProperty>

</ogr:waterways>

</gml:featureMember>

</ogr:FeatureCollection>

OSM Waterways

<#waterwaysGeom>

rml:logicalSource [

rml:source "/home/leo/osm.gml";

rml:referenceFormulation ql:XPath;

rml:iterator "/ogr:FeatureCollection

/gml:featureMember

/ogr:waterways";

];

rr:subjectMap [

rr:template

"http://www.osm.org/id/{@fid}";

rr:class onto:waterways;

].

Page 130: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Implemented Systems

Direct Mapping processors: SquirellRDF

R2RML processors: D2RQ Platform OpenLink Virtuoso Ultrawrap Morph Ontop Oracle

RML processor Processor by iMinds Lab, Ghent University

Other Mapping Language: Triplify

Geospatial capabilities

So far: Geometry2RDF Sparqlify TripleGeo GeoTriples

26

Page 131: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Custom MappingLanguage

DirectMapping

R2RML RMLSPARQLquery

evaluation

AutomaticMapping

Generation

Geospatialsupport

OpenLinkVirtuoso

✔ ✖ ✔*✖

✔ ✖ ✖

RDF-RDB2RDF ✖ ✔ ✔ ✖ ✖ ✖ ✖

D2RQ Platform ✔ ✖ ✔ ✖ ✔ ✔ ✖

Db2triples ✖ ✔ ✔ ✖ ✖ ✖ ✖

Morph ✔ ✖ ✔ ✖ ✔ ? ✖

Sparqlify ✔ ✖ ✖* ✖ ✔ ✖ ✔

Ontop ✔ ✖ ✔ ✖ ✔ ✖ ✖*

Ultrawrap ✔* ✔ ✔ ✖ ✔ ✖ ✖

Oracle ✖ ✔ ✔ ✖ ✔ ✔ ✖

Geometry2RDF ✖ ✔ ✖ ✖ ✖ ✖ ✔*

TriplesGeo ✖ ✔ ✖ ✖ ✖ ✖ ✔*

iMinds lab RMLprocessor

✖ ✖ ✔ ✔ ✖ ✖ ✖

GeoTriples ✖ ✖ ✔ (✔) ✖* ✔ ✔

Page 132: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Comparison of Geo2RDF tools

DirectMapping

R2RML RMLAutomaticMapping

Generation

GeoSPARQLcompliance

RDBMSESRI

Shapefile

GMLGeo

JSON

Geometry2RDF ✔ ✖ ✖ ✖ ✖ ✔ ✖ ✖ ✖

TriplesGeo ✔ ✖ ✖ ✖ (✔) ✔ ✔ ✖ ✖

GeoTriples ✖ ✔ ✔ ✔ ✔ ✔ ✔ ✔ ✔

Page 133: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

GeoTriples

Open Source software

Released under Mozilla Public Licence v2.0

Available at: https://github.com/LinkedEOData/GeoTriples

Extends the D2RQ Platform

Extends the iMinds lab RML processor

Provides both a graphical user interface and a command line interface

29

Page 134: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Architecture of GeoTriples

30

Earth Obsevation

Acquisitions

Page 135: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Automatic generation of R2RML mappings (cont’d)

Generate two triples maps for each table that has a geometry column. Thematic triples map for the non-geometric information Spatial triples map for the geometric information

The spatial triples map contains multiple transformation functions over the input geometries in order to generate a GeoSPARQL compliant dataset.

31

NaturaGeometryNaturaArea

geo:

hasGeometry

(rr:joinCondition)

Page 136: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Automatic generation of RML mappings for GML documents

Each geometric object is mapped to a geo:Geometryinstance

For each geometric object we generate a set of predicate object maps that use the appropriate transformation functions for producing a GeoSPARQL compliant dataset

Each simple element is mapped to a predicate object map

Each non simple element is mapped to a triples map

Appropriate mappings are generated for linking nestedelements

32

Mapping

GeneratorXSDRML

mapping

Page 137: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Demonstration

33

Page 138: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Discovering Spatial and Temporal Links

among RDF Graphs

Publishing and Interlinking Linked Geospatial Data In Conjunction with the 12th Extended Semantic Web Conference

Portoroz, Slovenia, 1st June 2015

Presenter: Panayiotis Smeros

Page 139: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 2

Outline

• Introduction to Entity Resolution and Link

Discovery

– Examples, Definitions, Common Problems

• Spatial Entity Resolution

• Spatial and Temporal Link Discovery

– Background and Developed Methods

– Extensions to the Silk Framework

– Hands-on

Page 140: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 3

Entities in Real-World

source

source

Most of our knowledge about the world is based on entities

and their relations:

Page 141: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 4

Entities in Data-World

Portoroz Portorož بورتوروز Порторож Πορτορόζ Portorose Портороз Порторожу Portorožu Порторож

Portorož (Italian: Portorose, literally "Port of Roses"), is an Adriatic - Mediterranean coastal settlement in the Municipality of Piran in southwestern Slovenia. Its modern development began in the late 19th century with appearance of first health resorts.

http://www.geonames.org/3192682/portoroz.html http://en.wikipedia.org/wiki/Portoroz http://www.portoroz.si/en/ …

source

Many names, descriptions or IDs (URIs) are used for the

same real-world entity:

Page 142: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 5

Content Providers

News about Portoroz Reviews of hotels in Portoroz

Pictures about Portoroz

Videos for Portoroz

Wiki pages about Portoroz

Social networks in Portoroz

Many applications provide valuable information about each of

these entities:

Page 143: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 6

Content Providers

News about Portoroz Reviews of hotels in Portoroz

Pictures about Portoroz

Videos for Portoroz

Wiki pages about Portoroz

Social networks in Portoroz

Many applications provide valuable information about each of

these entities:

Solution?

Page 144: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 7

Entity Resolution

Problem of understanding that two (or more) entities in data-world

are references of the same real-world entity. [Christen, TKDE’11]

Page 145: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 8

Entity Resolution (Example)

DBpedia

Entity

name = PORTOROZ

population = 2,849

GeoNames

Entity

name = Portorose

population = 2,851

Problem of understanding that two (or more) entities in data-world

are references of the same real-world entity. [Christen, TKDE’11]

Page 146: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 9

Entity Resolution (Example)

DBpedia

Entity

name = PORTOROZ

population = 2,849

GeoNames

Entity

name = Portorose

population = 2,851

sameAs

Problem of understanding that two (or more) entities in data-world

are references of the same real-world entity. [Christen, TKDE’11]

Page 147: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 10

Spatial Entity Resolution (Example)

DBpedia

Entity

name = PORTOROZ

population = 2,849

GeoNames

Entity

name = Portorose

population = 2,851

sameAs

location = 45.51663, 13.57996 location = 45.51661, 13.57998

Problem of understanding that two (or more) entities in data-world

are references of the same real-world entity. [Christen, TKDE’11]

Page 148: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Entity Resolution (Definition)

Let 𝑆 and 𝑇 be two sets of entities. We define a distance

(similarity) function 𝑑𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦 and a distance (similarity)

threshold 𝜃𝑑𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦 as follows:

𝑑𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦: 𝑆 × T → [0,1] , 𝜃𝑑𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦∈ 0,1

We define the set of discovered similarity links 𝐷𝐿𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦 as

follows:

𝐷𝐿𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦 = s, sameAs, t 𝑠 ∈ 𝑆 𝑡 ∈ 𝑇 𝑑𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦 𝑠, 𝑡 < 𝜃𝑑𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦}

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 11

Page 149: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Link Discovery

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 12

Source Source

Link Discovery is the fourth and the most important Linked Data

Principle.

Establish semantic relations between entities in order to enrich the

information that is known about them. [Bizer et al., IJSWIS’06]

Page 150: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Link Discovery (Definition)

Let 𝑆 and 𝑇 be two sets of entities and 𝑅 the set of relations

that can be discovered between entities. For a relation 𝑟 ∈ 𝑅,

w.l.o.g., we define a distance function 𝑑𝑟 and a distance

threshold 𝜃𝑑𝑟 as follows:

𝑑𝑟: S × T → [0,1] , 𝜃𝑑𝑟∈ 0,1

We define the set of discovered links for relation 𝑟 (𝐷𝐿𝑟) as

follows:

𝐷𝐿𝑟 = s, r, t 𝑠 ∈ 𝑆 𝑡 ∈ 𝑇 𝑑𝑟 𝑠, 𝑡 < 𝜃𝑑𝑟}

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 13

Page 151: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 14

Link Discovery (Example)

Natura (2000) - Fields Fields - OSM Water Bodies

Page 152: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

contains

intersects

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 15

Natura (2000) - Fields

Link Discovery (Example)

Fields - OSM Water Bodies

Page 153: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 16

Main Problem: Heterogeneity

• Different Data Providers create Heterogeneous

Datasets

– Example: Literal Heterogeneity (case, language, etc).

• We focus on:

– Heterogeneity in the Representation of Geospatial

Information in RDF

– Heterogeneity in the Representation of Temporal

Information in RDF

name = PORTOROZ name = Portorose

Page 154: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 17

Heterogeneity in the Representation of

Geospatial Information in RDF

_:1 rdf:type geo:Geometry .

_:1 geo:hasGeometry

"<http://www.opengis.net/def/crs/EPSG/0/4326>

POINT(10 20)"^^geo:wktLiteral .

_:1 rdf:type strdf:Geometry .

_:1 strdf:hasGeometry

"<gml:Point crsName="EPSG:2100"><gml:coordinates>10,20

</gml:coordinates></gml:Point>"^^strdf:GML .

_:1 rdf:type wgs84Geo:Point .

_:1 wgs84Geo:lat “10“^^xsd:double .

_:1 wgs84Geo:long “20“^^xsd:double .

Page 155: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 18

Heterogeneity in the Representation of

Geospatial Information in RDF

_:1 rdf:type geo:Geometry .

_:1 geo:hasGeometry

"<http://www.opengis.net/def/crs/EPSG/0/4326>

POINT(10 20)"^^geo:wktLiteral .

_:1 rdf:type strdf:Geometry .

_:1 strdf:hasGeometry

"<gml:Point crsName="EPSG:2100"><gml:coordinates>10,20

</gml:coordinates></gml:Point>"^^strdf:GML .

• Different Vocabularies

Page 156: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 19

Heterogeneity in the Representation of

Geospatial Information in RDF

_:1 rdf:type geo:Geometry .

_:1 geo:hasGeometry

"<http://www.opengis.net/def/crs/EPSG/0/4326>

POINT(10 20)"^^geo:wktLiteral .

_:1 rdf:type strdf:Geometry .

_:1 strdf:hasGeometry

"<gml:Point crsName="EPSG:2100"><gml:coordinates>10,20

</gml:coordinates></gml:Point>"^^strdf:GML .

• Different Vocabularies

• Different Serializations of Geometries

Page 157: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 20

Heterogeneity in the Representation of

Geospatial Information in RDF

_:1 rdf:type geo:Geometry .

_:1 geo:hasGeometry

"<http://www.opengis.net/def/crs/EPSG/0/4326>

POINT(10 20)"^^geo:wktLiteral .

_:1 rdf:type strdf:Geometry .

_:1 strdf:hasGeometry

"<gml:Point crsName="EPSG:2100"><gml:coordinates>10,20

</gml:coordinates></gml:Point>"^^strdf:GML .

• Different Vocabularies

• Different Serializations of Geometries

• Geometries expressed in Different Coordinate

Reference Systems (CRS)

Page 158: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 21

Heterogeneity in the Representation of

Geospatial Information in RDF

source

Page 159: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 22

Heterogeneity in the Representation of

Geospatial Information in RDF

• Different Sampling Values

• Different Granularity

• Different Rounding Effects

source

Page 160: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 23

Heterogeneity in the Representation of

Temporal Information in RDF

_:1 ex:hasBirthday "1989-09-

24T11:05:00+01:00"xsd:dateTime .

_:1 ex:hasAffiliation ex:UoA

"[2007-10-15T00:00:00+03:00,

2013-10-15T00:00:00+04:00)"^^strdf:Period .

Page 161: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 24

Heterogeneity in the Representation of

Temporal Information in RDF

_:1 ex:hasBirthday "1989-09-

24T11:05:00+01:00"xsd:dateTime .

_:1 ex:hasAffiliation ex:UoA

"[2007-10-15T00:00:00+03:00,

2013-10-15T00:00:00+04:00)"^^strdf:Period .

• Different Vocabularies

Page 162: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 25

Heterogeneity in the Representation of

Temporal Information in RDF

_:1 ex:hasBirthday "1989-09-

24T11:05:00+01:00"xsd:dateTime .

_:1 ex:hasAffiliation ex:UoA

"[2007-10-15T00:00:00+03:00,

2013-10-15T00:00:00+04:00)"^^strdf:Period .

• Different Vocabularies

• Different Time Zones

Page 163: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 26

Heterogeneity in the Representation of

Temporal Information in RDF

_:1 ex:hasBirthday "1989-09-

24T11:05:00+01:00"xsd:dateTime .

_:1 ex:hasAffiliation ex:UoA

"[2007-10-15T00:00:00+03:00,

2013-10-15T00:00:00+04:00)"^^strdf:Period .

• Different Vocabularies

• Different Time Zones

• Time Instants and Periods

Page 164: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Outline

• Introduction to Entity Resolution and Link

Discovery

– Examples, Definitions, Common Problems

• Spatial Entity Resolution

• Spatial and Temporal Link Discovery

– Background and Developed Methods

– Extensions to the Silk Framework

– Hands-on

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 27

Page 165: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 28

Spatial Entity Resolution (Example

Revisited)

DBpedia

Entity

name = PORTOROZ

population = 2,849

GeoNames

Entity

name = Portorose

population = 2,851

sameAs

location = 45.51663, 13.57996 location = 45.51661, 13.57998

Problem of understanding that two (or more) entities in data-world

are references of the same real-world entity. [Christen, TKDE’11]

Page 166: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 29

Spatial Entity Resolution (1/4)

• Location Name Similarity

– Edit, Jaccard distance

• Location Similarity

– Euclidean distance

• Location Type Similarity

– (e.g. type “river” is similar to type “stream”)

Combines the above similarities to compute the

overall similarity between entities

Page 167: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 30

Spatial Entity Resolution (2/4)

• Similarity measure: Hausdorff Distance – Intuitively Hausdorff Distance is defined as the

largest distance between the closest points of two geometric shapes

• Handling Geospatial Heterogeneity – Converts geometries to a common

vocabulary (NeoGeo)

– Assumes WGS-84 CRS

• Optimization – Simplifies Geometries with Ramer-Douglas-Peucker algorithm

Page 168: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 31

Spatial Entity Resolution (3/4)

• Heuristic Combination of:

– URI Similarity

– Label Similarity

• Considering the language of the labels

– Location Similarity

• Assuming the W3C Geo vocabulary

– Geometric Similarity

• Minimum Distance between two Geometries

Page 169: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 32

Spatial Entity Resolution (4/4)

• Non-Spatial Criteria

– Implemented within the LIMES framework

• Geometric Similarity

– Hausdorff Distance

– Optimizations

• Bounding Circle: Avoids useless comparisons

μ(s, t) = δ(ζ(s), ζ(t)) − r (s) − r (t) > θ ⇒ δ(s, t) > θ

• Space tiling: Reduces the quadratic number of comparisons

Page 170: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 33

Spatial Entity Resolution

• [Sehgal et al. GIS’06] – Spatial and non-Spatial Criteria

– Only Location Similarity

• [Salas et al., TerraCognita’11] – Only Spatial Criteria

– Complex Geometric Similarity Methods

• [Vilches-Blázquez et al., AGILE’12] – Spatial and non-Spatial Criteria

– Simple Geometric Similarity Methods

• [Ngonga Ngomo, ISWC’13] – Spatial and non-Spatial Criteria

– Complex Geometric Similarity Methods

– Reduced number of comparisons

Page 171: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Outline

• Introduction to Entity Resolution and Link

Discovery

– Examples, Definitions, Common Problems

• Spatial Entity Resolution

• Spatial and Temporal Link Discovery

– Background and Developed Methods

– Extensions to the Silk Framework

– Hands-on

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 34

Page 172: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Link Discovery (reminder)

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 35

Source Source

Link Discovery is the fourth and the most important Linked Data

Principle.

Establish semantic relations between entities in order to enrich the

information that is known about them. [Bizer et al., IJSWIS’06]

Page 173: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Background on Spatial Relations (1/2)

• Dimensionally Extended 9-Intersection Model [Clementini et al., SSD'93]

– Captures topological relations in ℝ2, by considering the

dimension (dim) of the intersections involving the

interior (I), the boundary (B) and the exterior (E) of the

two geometries.

– Examples: Intersects, Equals, Touches, Disjoint,

Contains, Crosses, Covers, CoveredBy and Within

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 36

Page 174: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Background on Spatial Relations (2/2)

• Region Connection Calculus [Randell et al. KR’92]

– RCC-8: a well-known subset of RCC, which is based on eight topological relations

– DC stands for DisConnected, EC for Externally Connected, TPP for Tangential Proper Part, NTPP, for Non Tangential Proper Part, and TPPi and NTPPi are the inverse relations of TPP and NTPP

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 37

Page 175: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 38

Background on Temporal Relations

• Allen’s Interval Calculus [Allen, Commun. ACM’83]

– thirteen jointly exclusive and pairwise disjoint qualitative

relations

Page 176: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Spatial and Temporal Relations

• We consider the previous Spatial (𝑅𝑠) and Temporal (𝑅𝑡)

relations as Boolean relations (𝑅𝐵) i.e., either they hold or

they do not:

𝑅𝑠, 𝑅𝑡 ⊂ 𝑅𝐵

• 𝑅𝐵 constitutes a special subset of 𝑅. The distance function

𝑑𝑟 and the distance threshold 𝜃𝑑𝑟 for a relation 𝑟 ∈ 𝑅𝐵 are

defined as follows:

𝑑𝑟(s,t) = 0 𝑖𝑓 𝑟 ℎ𝑜𝑙𝑑𝑠1 𝑒𝑙𝑠𝑒𝑤ℎ𝑒𝑟𝑒

, 𝜃𝑑𝑟= 1

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 39

Page 177: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 40

Spatial and Temporal Transformations

(1/2)

• CRS Transformation. The geometries of a dataset can be expressed in a Coordinate Reference System that is more precise for the geographic area that they describe (e.g., the GGRS87 for Greece). This transformation converts the CRS of a geometry to the World Geodetic System (WGS 84)

• Vocabulary Transformation. This transformation converts geometry literals from GeoSPARQL, stRDF or W3C GEO to a common vocabulary (GeoSPARQL)

• Serialization Transformation. This transformation converts the geometries of a dataset to a common serialization (WKT)

• Time-Zone Transformation. This transformation converts the time zone of a given time interval to Coordinated Universal Time (UTC)

• Period Transformation. This transformation converts a time instant to a period with the same starting and ending point

Page 178: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 41

Spatial and Temporal Transformations

(2/2)

• Simplification Transformation. Some datasets have very complex geometries, which makes the computation of spatial relations inefficient. This transformation simplifies a geometry according to a given distance tolerance, ensuring that the result is a valid geometry having the same dimension and number of components as the input

• Envelope Transformation. This transformation computes the envelope (i.e., the minimum bounding rectangle) of a geometry and it is useful in cases that we want to compute approximate spatial relations between two datasets

• Area Transformation. In some cases it is enough to compare just the areas of two geometries to infer whether they are the same or not. This transformation computes the area of a given geometry in square metres

• Points-To-Centroid Transformation. In crowdsourcing datasets like OpenStreetMap, multiple users can define the position of the same placemark. As a better approximation of the real position of this placemark we can compute the centroid of these positions. This transformation computes the centroid of a cluster of points

Page 179: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 42

Techniques for Checking the Relations

• Cartesian Product Technique (Naive) – Performs exhaustive checks between the pairs of the entities

of datasets

– Complete

– Complexity: O(|S||T|) checks

• Blocking Technique [Isele et al., WebDB’11, Papadakis et al, TKDE’13]

– Divides the entities into blocks

– Decreases the number of checks

– Complete

– Complexity: O(|S||T|) checks (worst case), O(|L|) checks (best case)

* |S|, |T|: number of entities in datasets S and T; |L|: number of links between datasets S and T

Page 180: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 43

Blocking Technique for Spatial Relations

• Divide the surface of the earth

into curved rectangles (blocks)

• Adjust the area of the blocks

with a blocking factor (bf)

(blockArea: 1

𝑏𝑓2

𝑜2

)

• If the MBB of a geometry spatially intersects with a block, then insert it in this block

• Check for a spatial relation only within each block (independently)

• Construct the set of discovered links (𝐷𝐿𝑟) by aggregating the respective links that have been discovered within each block

Page 181: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 44

Blocking Technique for Temporal

Relations

• Divide the time into

intervals (blocks)

• Adjust the length of the

blocks with a blocking factor (bf) (blockLength:

1

𝑏𝑓 𝑡𝑖𝑚𝑒 𝑢𝑛𝑖𝑡𝑠)

• If a time period or instant temporally intersects with a block, then insert it in this block

• Check for a temporal relation only within each block (independently)

• Construct the set of discovered links (𝐷𝐿𝑟) by aggregating the respective links that have been discovered within each block

Page 182: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 45

Blocking Technique

• Fully parallelizable with respect to the blocks

• Proven sound and complete

• 100% accurate links

• 100% precision, recall, F-measure

Page 183: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Extensions to the Silk Framework:

Spatial and Temporal Relations

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 46

Silk

Page 184: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Silk

Extensions to the Silk Framework:

Spatial and Temporal Transformations

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 47

Page 185: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Extensions to the Silk Framework

• Spatial and Temporal Extensions for Silk implemented as

Plugins

• Transparent to all the applications of Silk

– Single Machine

– MapReduce

– Workbench

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 48

Silk

Page 186: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

• Download: https://github.com/silk-framework/silk

• Workbench application pre-installed in the VM

• Discover the following links:

All the datasets will be first converted to RDF with GeoTriples!

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 49

Hands-on Silk

Source Dataset Relation Target Dataset

Field Boundaries Contains Raster Cells

OSM Water

Bodies

Intersects Natura (2000)

Natura (2000) Within Federal States of

Germany

Page 187: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 50

References (1/3)

• [Bizer et al., IJSWIS’06]

Bizer, C., Heath, T., Berners-Lee, T.: Linked Data - The Story So Far. International

Journal on Semantic Web and Information Systems 5(3), 1–22 (2009)

• [Christen, TKDE’11]

P. Christen, " A survey of indexing techniques for scalable record linkage and

deduplication.” in IEEE TKDE 2011.

• [Auer, RW’13]

Auer, S., Lehmann, J., Ngomo, A.C.N., Zaveri, A.: Introduction to Linked Data and Its

Lifecycle on the Web. In: Rudolph, S., Gottlob, G., Horrocks, I., van Harmelen, F. (eds.)

Reasoning Web. Lecture Notes in Computer Science, vol. 8067, pp. 1–90. Springer

(2013)

• [Salas et al., TerraCognita’11]

Salas, J., Harth, A.: Finding spatial equivalences accross multiple RDF datasets. In:

Proceedings of the Terra Cognita Workshop on Foundations, Technologies and

Applications of the Geospatial Web. pp. 114–126. Citeseer (2011)

• [Sehgal et al. GIS’06]

Sehgal, V., Getoor, L., Viechnicki, P.D.: Entity resolution in geospatial data integration. In:

Proceedings of the 14th annual ACM international symposium on Advances in

geographic information systems. pp. 83–90. ACM (2006)

Page 188: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 51

References (2/3)

• [Vilches-Blázquez et al., AGILE’12]

Vilches-Blázquez, L.M., Saquicela, V., Corcho, O.: Interlinking geospatial information in

the web of data. In: Bridging the Geographic Information Sciences, pp. 119–139.

Springer (2012)

• [Ngonga Ngomo, ISWC’13]

Ngonga Ngomo, A.C.: Orchid - reduction-ratio-optimal computation of geo-spatial

distances for link discovery. In: Proceedings of ISWC 2013 (2013)

• [Clementini et al., SSD'93]

Clementini, E., Di Felice, P., van Oosterom, P.: A small set of formal topological

relationships suitable for end-user interaction. In: Abel, D., Chin Ooi, B. (eds.) Advances

in Spatial Databases, Lecture Notes in Computer Science, vol. 692, pp. 277–295.

Springer Berlin Heidelberg (1993), http://dx.doi.org/10.1007/3-540-56869-7_16

• [Randell et al. KR’92]

Randell, D.A., Cui, Z., Cohn, A.G.: A spatial logic based on regions and connection. In:

KR. pp. 165–176 (1992)

• [Allen, Commun. ACM’83]

Allen, J.F.: Maintaining knowledge about temporal intervals. Commun. ACM 26(11), 832–

843 (Nov 1983)

Page 189: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 52

References (3/3)

• [Isele et al., WebDB’11]

Isele, R., Jentzsch, A., Bizer, C.: Efficient multidimensional blocking for link discovery

without losing recall. In: WebDB. Citeseer (2011)

• [Papadakis et al, TKDE’13]

Papadakis, G., Ioannou, E., Palpanas, T., Niederée, C., Nejdl, W.: A blocking framework

for entity resolution in highly heterogeneous information spaces. Knowledge and Data

Engineering, IEEE Transactions on 25(12), 2665–2682 (2013)

Page 190: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

01/06/2015 Discovering Spatial and Temporal Links among RDF Graphs 53

Thanks for your attention! Questions?

Page 191: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Transforming Natura2000 Shapefile into RDF

Kostis Kyzirakos and Dimitrianos Savva

Page 192: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Natura2000 (South Germany)

Page 193: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

GeoTriples GUI

• From terminal execute: geotriples-gui

1. Connect to Natura2000 Shapefile

2. Adjust class/predicate names to ontology

3. Generate mapping

4. Dump RDF

Page 194: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Connect to Shapefile

Page 195: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

GeoTriples Layout

Mapping Builder (Left)1. Adjust triples maps2. Change DataTypes3. Change Predicates

Bottom Toolbar1. Generate Mapping2. Select your preferred

geo-vocabulary3. Define CRS4. Select output format5. Dump RDF

Mapping Editor (Right)1. Change the mapping

by hand

Page 196: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Adjust to Ontology

Page 197: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Generate Mapping

Page 198: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Dump RDF

/home/leo/datasets/naturatriples.n3

Page 199: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

RDF graph

Page 200: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Store RDF graph to Strabon

# endpoint store

http://localhost:8080/strabonendpoint

N-Triples -t

/home/leo/datasets/naturatriples.n3

Page 201: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Transforming OpenStreetMaps GML document into an RDF graph (1/4)

# cd ~/DEMO_ESWC15

# ./osmmapping.sh

--

geotriples-cmd generate_mapping

-o OSM/automatic-mapping.rml.ttl

-b http://data.linkedeodata.eu/waterways

-r waterways

-rp /ogr:FeatureCollection/gml:featureMember

-ns "gml|http://www.opengis.net/gml,

ogr|http://ogr.maptools.org/"

-null -onto OSM/automatic-ontology.txt

-x OSM/osm_waterways.xsd OSM/osm_waterways.gml

Page 202: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Transforming OpenStreetMaps GML document into an RDF graph (2/4)

# cp OSM/automatic-mapping.rml.ttl

OSM/altered-mapping.rml.ttl

# gedit OSM/altered-mapping.rml.ttl

Page 203: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Transforming OpenStreetMaps GML document into an RDF graph (3/4)

1. Change the class definition for the triples map <#ogr:waterwaysogr:geometryProperty>

1. Replace the class onto:LineStringPropertyTypewith ogc:Geometry

2. Change the predicate that will link the thematic data with the geometric data.

1. Find the triples map <#waterways>

2. Replace the text onto:has_geometryPropertywith ogc:hasGeometry

Page 204: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Transforming OpenStreetMaps GML document into an RDF graph (4/4)

# ./osmdump.sh

--

geotriples-cmd dump_rdf -rml

-o OSM/osmtriples.n3

-ns osm-namespaces.ns

OSM/altered-mapping.ttl

--

# endpoint store

http://localhost:8080/strabonendpoint N-

Triples -t

/home/leo/DEMO_ESWC15/OSM/osmtriples.n3

Page 205: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Store TalkingFields datasets to Strabon

# endpoint store

http://localhost:8080/strabonendpoint

N-Triples -t /home/leo/datasets/fb.n3

# endpoint store

http://localhost:8080/strabonendpoint

N-Triples -t /home/leo/datasets/rc.n3

Page 206: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

• Download: https://github.com/silk-framework/silk

• Workbench application pre-installed in the VM

• Discover the following links:

All the datasets will be first converted to RDF with GeoTriples!

Hands-on Silk

Source Dataset Relation Target Dataset

Field Boundaries Contains Raster Cells

OSM Water

Bodies

Intersects Natura (2000)

Natura (2000) Within Federal States of

Germany

Page 207: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Start the Silk Workbench

Page 208: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Open Workspace

Page 209: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Import the project that you will find in the Desktop of the VM

Page 210: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Open the Linkage Rule

Page 211: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Modify the Linkage Rule

Page 212: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Start the Link Generation

Page 213: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

Examing Generated Links

$ less

/home/leo/Desktop/FieldBounda

riesRasterCellsLinks.nt

Page 214: ESWC2015 - Tutorial on Publishing and Interlinking Linked Geospatial Data

• Download: https://github.com/silk-framework/silk

• Workbench application pre-installed in the VM

• Discover the following links:

All the datasets will be first converted to RDF with GeoTriples!

Hands-on Silk

Source Dataset Relation Target Dataset

Field Boundaries Contains Raster Cells

OSM Water

Bodies

Intersects Natura (2000)

Natura (2000) Within Federal States of

Germany