Towards a Linked Data Publishing Methodology. Krems, 18 May 2016. Eduard Klein / E-Government Institute, BUAS – Bern University of Applied Sciences / Faculty of Business. Context: EC-funded "Fusepool" projects (2012-2016).


Page 1: Towards a Linked Data Publishing Methodology

Berner Fachhochschule | Haute école spécialisée bernoise | Bern University of Applied Sciences

Towards a Linked Data Publishing Methodology

Krems, 18 May 2016

Eduard Klein / E-Government Institute

BUAS – Bern University of Applied Sciences / Faculty of Business

Context: EC-funded "Fusepool" projects (2012-2016)

Page 2: Towards a Linked Data Publishing Methodology


Eduard Klein / Profile

▶ Lecturer @ BUAS – Bern Univ. of Applied Sciences

▶ Teaching areas: Software Engineering, (Semantic) Web Technology, Linked (Open) Data

▶ Research projects since 2009, EC funded (FP7)

  ▶ Ambient Assisted Living (AAL call): Third Age Online (TAO), 2009-2011: digital inclusion of senior citizens (web accessibility)

  ▶ Fusepool SME, 2012-2014: integrating and semantically enriching heterogeneous data for product development and protection of intellectual property

  ▶ Fusepool P3, 2014-2016: efficient data publishing through a highly automated data life cycle and tooling support in tourism use cases (Tuscany, Trentino)

▶ The Fusepool projects are Linked Open Data (LOD) projects

Page 3: Towards a Linked Data Publishing Methodology

Outline

▶ Linked Data (LD) approach in general (at least in many projects…)

▶ Realization within a large-scale research project ("Fusepool")

▶ Focus on Methodology

▶ Analysis and Externalization

▶ Practical Template for use in LD projects

▶ Experiences

Page 4: Towards a Linked Data Publishing Methodology

Linked Data (LD) Approach in General

[Figure labels: Events, POIs, GLAM Data; "Legacy Data: …"; "Linked Data: enriched and integrated"; source: lod-cloud.net]

- Tools & Techniques

- LD life cycle

- Ease of use?

- Sustainability?

Page 5: Towards a Linked Data Publishing Methodology

Architecture: Linking Data in "Fusepool"

▶ Sustainability:

  ▶ Compliant with W3C's Linked Data Platform (LDP)

  ▶ Loosely coupled components (RESTful API)

[Figure: architecture diagram. Component labels: Client Applications / Fusepool P3 Dashboard; LDP Transforming Proxy; Transforming Container API (Extension of LDP 1.0); Transformer API; Pipeline Transformer; Single Transformers; User Interaction Request API; User Interaction Request Registry; Transformer Registry; Transformer Factory Registry; LDP 1.0 Server; SPARQL Endpoint; RDF Triple Store; Custom Services. Connector labels: LDP 1.0, SPARQL, REST. Key: Clients, Transformers, LDP Transforming Proxy, Backends.]
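The architecture names a "Pipeline Transformer" alongside "Single Transformers". As a rough illustration of that idea only, the following minimal sketch chains independent transformation steps so that the output of one becomes the input of the next. It does not use the Fusepool Transformer API: the `Transformer` type, `make_pipeline`, and both example steps are hypothetical names invented here, and the real components communicate over HTTP rather than through in-process function calls.

```python
from typing import Callable, Iterable

# Assumption for illustration: a "transformer" is a plain text-to-text function.
Transformer = Callable[[str], str]


def make_pipeline(steps: Iterable[Transformer]) -> Transformer:
    """Chain single transformers; each step receives the previous step's output."""
    ordered = list(steps)

    def pipeline(document: str) -> str:
        for step in ordered:
            document = step(document)
        return document

    return pipeline


# Two hypothetical single transformers.
def strip_whitespace(document: str) -> str:
    """Trim every line and drop lines that are empty after trimming."""
    return "\n".join(line.strip() for line in document.splitlines() if line.strip())


def lowercase_keys(document: str) -> str:
    """Lowercase the part of each line that precedes the first ':'."""
    out = []
    for line in document.splitlines():
        key, sep, value = line.partition(":")
        out.append(key.lower() + sep + value)
    return "\n".join(out)


clean = make_pipeline([strip_whitespace, lowercase_keys])
```

Because each step only depends on its input text, steps can be reordered or replaced without touching the others, which is the loose coupling the slide refers to.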

Page 6: Towards a Linked Data Publishing Methodology

Methodology

▶ Focus on re-use of the Linked Data publishing process:

  ▶ Suitable starting point for LD use case planning

  ▶ Completeness (of planning), necessary project skills, duration of completed projects

▶ Documentation of essential tasks helps to answer:

  ▶ "How long will it take to develop a use case with this platform?"

  ▶ "Necessary technical skills?"

▶ Shortening the learning curve

▶ Better estimation of future projects based on documented experiences

Page 7: Towards a Linked Data Publishing Methodology

Linked Data Publishing Methodology (LIDAPUME)


Page 8: Towards a Linked Data Publishing Methodology

Linked Data Publishing Methodology (cont.)

1) Typical key stakeholders: end users and data owners

2) Interviews and (field) research

3) Starting point for conceptual and functional test models

4) Identifying (available and missing) data sources

5) Identification of appropriate taxonomies, vocabularies, ontologies

6) Specify mapping of non-RDF data to RDF data (e.g. through XSLT)

7) Definition of transformation steps (Fusepool: configuration front-end)
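Step 6 (mapping non-RDF data to RDF) can be made concrete with a small sketch. The slide names XSLT as one option; the example below instead uses plain Python on CSV input, purely for illustration. The `http://example.org/event/` namespace, the column names (`id`, `name`, `start`), and the choice of schema.org terms are assumptions, not the mapping used in Fusepool. Output is N-Triples, one triple per line.

```python
import csv
import io
from typing import List

BASE = "http://example.org/event/"  # hypothetical URI namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA = "http://schema.org/"

# The mapping specification: CSV column name -> predicate URI (assumed columns).
COLUMN_TO_PREDICATE = {
    "name": SCHEMA + "name",
    "start": SCHEMA + "startDate",
}


def escape_literal(value: str) -> str:
    """Escape characters that may not appear raw inside an N-Triples literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def csv_to_ntriples(text: str) -> List[str]:
    """Turn each CSV row into a typed resource plus one triple per mapped column."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        # Assumes the id column contains only characters that are safe in an IRI.
        subject = f"<{BASE}{row['id']}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{SCHEMA}Event> .")
        for column, predicate in COLUMN_TO_PREDICATE.items():
            value = row.get(column)
            if value:  # skip missing or empty cells
                triples.append(f'{subject} <{predicate}> "{escape_literal(value)}" .')
    return triples
```

Keeping the column-to-predicate table separate from the conversion loop mirrors the split between steps 5 and 6: choosing vocabularies first, then specifying how source fields map onto them.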

Page 9: Towards a Linked Data Publishing Methodology

A template for documentation of essential activities

▶ Shape of template heavily discussed, e.g. "too general?" "too unspecific?"

Page 10: Towards a Linked Data Publishing Methodology

Validation of the framework ("Swiss Archive" Use Case)

▶ Archival Data from the Federal Archives of 4 Swiss cantons

▶ SPARQL endpoint

(1D = 1 effort-day)
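The use case exposes its data through a SPARQL endpoint. As a sketch of how a client might address such an endpoint, the following builds a "query via GET" request as defined by the SPARQL 1.1 Protocol, using only the Python standard library. The endpoint URL is a placeholder (the real address is not given on the slide), and the generic query is likewise only illustrative. The request is constructed but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL

# A generic query that lists ten arbitrary triples.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_query_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL 1.1 Protocol 'query via GET' request without sending it."""
    url = endpoint + "?" + urlencode({"query": query.strip()})
    # Ask for the standard JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_query_request(ENDPOINT, QUERY)

# To run it against a live endpoint, something like:
#   import json
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       results = json.load(response)
```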

Page 11: Towards a Linked Data Publishing Methodology

Validation of the framework ("Library Keyword Clustering")

▶ FU Berlin library content

▶ GND and DBpedia loaded and pre-processed

Page 12: Towards a Linked Data Publishing Methodology

Validation of the framework ("Event Explorer")

▶ Events of touristic regions

▶ Interlinked with POIs, historical characters etc.

▶ Hackathon outcome: LD-based web application

Page 13: Towards a Linked Data Publishing Methodology

Experiences with Methodology & Template

▶ Several adaptations of the 7-step model through the project phases

  ▶ Number of phases

  ▶ Type and granularity of documented information (detail of description)

  ▶ Formalism of notation: not too formal, semi-structured

  ▶ Detail of description: too much detail would not be of value for the "average" user

▶ Columns "Activities", "Skills", "Effort" helped a lot for planning of future Linked Data Use Cases
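To show how the "Activities", "Skills" and "Effort" columns could support planning, here is a minimal sketch that stores template rows and derives a total effort and a skill list. The `TemplateRow` structure, the extra `phase` field, and all row values are invented placeholders for illustration; they are not figures from the Fusepool use cases, and the actual template layout may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TemplateRow:
    phase: str          # methodology step the row belongs to (assumed field)
    activities: str     # "Activities" column
    skills: str         # "Skills" column, comma-separated
    effort_days: float  # "Effort" column, 1D = 1 effort-day


def total_effort(rows: List[TemplateRow]) -> float:
    """Sum the effort-days over all documented rows."""
    return sum(row.effort_days for row in rows)


def skills_needed(rows: List[TemplateRow]) -> List[str]:
    """Collect the distinct skills in first-seen order."""
    seen: List[str] = []
    for row in rows:
        for skill in (s.strip() for s in row.skills.split(",")):
            if skill and skill not in seen:
                seen.append(skill)
    return seen


# Invented placeholder values, NOT data from the documented use cases.
rows = [
    TemplateRow("Data sources", "Identify available and missing data sources",
                "Domain knowledge", 2.0),
    TemplateRow("Mapping", "Specify mapping of non-RDF data to RDF",
                "RDF, XSLT", 3.0),
    TemplateRow("Transformation", "Define transformation steps",
                "RDF, Platform configuration", 1.5),
]
```

With rows documented this way for a completed use case, `total_effort(rows)` and `skills_needed(rows)` give a first estimate of duration and required skills for a similar future use case.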

Page 14: Towards a Linked Data Publishing Methodology

Outlook / Further Research

▶ Evaluation and Validation of Publishing Framework LIDAPUME and Template

  ▶ In our ongoing and future projects

  ▶ (hopefully) by other projects

▶ Possible evolution of methodology based on evaluation and validation feedback

▶ Comparability of approaches based on documentation with the same template