Towards a Linked Data Publishing Methodology

  • Berner Fachhochschule | Haute école spécialisée bernoise | Bern University of Applied Sciences

    Towards a Linked Data Publishing Methodology

    Krems, 18 May 2016

    Eduard Klein / E-Government Institute

    BUAS Bern University of Applied Sciences / Faculty of Business

    Context: EC-funded "Fusepool" projects (2012-2016)

  • Lecturer @ BUAS Bern Univ. of Applied Sciences

    Teaching areas: Software Engineering, (Semantic) Web Technology, Linked (Open) Data

    Research projects since 2009, EC-funded (FP7)

    Ambient Assisted Living (AAL call): Third Age Online (TAO), 2009-2011: digital inclusion of senior citizens (web accessibility)

    Fusepool SME, 2012-2014: integrating and semantically enriching heterogeneous data for product development and protection of intellectual property

    Fusepool P3, 2014-2016: efficient data publishing through a highly automated data life cycle and tooling support in tourism use cases (Tuscany, Trentino)

    The Fusepool projects are Linked Open Data (LOD) projects

    Eduard Klein / Profile

  • Linked Data (LD) approach in general (at least in many projects)

    Realization within a large-scale research project ("Fusepool")

    Focus on Methodology

    Analysis and Externalization

    Practical Template for use in LD projects

    Experiences

    Outline


  • Linked Data (LD) Approach in General


    Legacy Data: Events, POIs, GLAM data

    Linked Data: enriched and integrated

    lod-cloud.net

    - Tools & Techniques

    - LD life cycle

    - Ease of use?

    - Sustainability?

    http://lod-cloud.net/

  • Sustainability:

    Compliant with W3C's Linked Data Platform (LDP)

    Loosely coupled components (RESTful API)

    Architecture: Linking Data in "Fusepool"


    [Figure: Fusepool P3 architecture diagram. Labelled components: Client Applications / Fusepool P3 Dashboard; LDP Transforming Proxy with Transforming Container API (extension of LDP 1.0); Transformer API, Pipeline Transformer, Single Transformers; LDP 1.0 Server, SPARQL Endpoint, RDF Triple Store, Custom Services; User Interaction Request Registry and User Interaction Request API; Transformer Registry; Transformer Factory Registry. Interfaces: LDP 1.0, SPARQL, REST. Key: Clients, Transformers, LDP Transforming Proxy, Backends.]
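
To illustrate what "compliant with LDP" and "RESTful API" mean for a client, here is a minimal sketch of a generic LDP 1.0 request that creates a resource in a container. The container URL, the slug and the Turtle payload are assumptions made for illustration; they are not actual Fusepool endpoints or data, and the sketch does not cover the Fusepool-specific Transforming Container extension.

```python
from urllib.request import Request

# Hypothetical container URL (assumption); not an actual Fusepool endpoint.
CONTAINER_URL = "http://example.org/ldp/events/"

# Illustrative Turtle payload; "<>" refers to the resource being created.
TURTLE_BODY = (
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
    '<> dcterms:title "Sample event record" .\n'
)


def build_ldp_post(container_url: str, turtle: str, slug: str) -> Request:
    """Build (but do not send) an LDP 1.0 request that creates a resource."""
    return Request(
        container_url,
        data=turtle.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "text/turtle",
            # Slug is only a naming hint; the server decides the final URI.
            "Slug": slug,
            # Requests the interaction model of the resource to be created.
            "Link": '<http://www.w3.org/ns/ldp#Resource>; rel="type"',
        },
    )


request = build_ldp_post(CONTAINER_URL, TURTLE_BODY, "sample-event")
print(request.get_method(), request.full_url)
for name, value in request.header_items():
    print(f"{name}: {value}")
```

The request is only constructed here; passing it to `urllib.request.urlopen` would send it to a real server. Because every component speaks plain HTTP in this style, clients and backends can be exchanged independently, which is the point of the loose coupling named on the slide.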

  • Focus on Re-Use of Linked Data Publishing Process:

    Suitable starting point for LD Use Case planning

    Completeness (of planning), necessary project skills, duration of completed projects

    Documentation of essential tasks helps to answer:

    "How long will it take to develop a use case with this platform?"

    "Necessary technical skills?"

    Shortening the learning curve

    Better estimation of future projects based on documented experiences

    Methodology


  • Linked Data Publishing Methodology (LIDAPUME)


  • 1) Typical key stakeholders: end users and data owners

    2) Interviews and (field) research

    3) Starting point for conceptual and functional test models

    4) Identifying (available and missing) data sources

    5) Identification of appropriate taxonomies, vocabularies, ontologies

    6) Specify mapping of non-RDF data to RDF data (e.g. through XSLT)

    7) Definition of transformation steps (Fusepool: configuration front-end)

    Linked Data Publishing Methodology (cont.)
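
Step 6 (mapping non-RDF data to RDF) can be sketched as follows. The slide names XSLT as one option; this sketch uses plain Python instead, purely for illustration. The base URI, the CSV columns and the sample rows are assumptions and are not taken from the Fusepool data sets. The schema.org terms stand in for whatever vocabulary step 5 would select.

```python
import csv
import io
from urllib.parse import quote

# Assumed base URI for minted resources; a real project would choose its own.
BASE = "http://example.org/event/"
SCHEMA = "http://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Illustrative legacy data (placeholder rows, not project data).
LEGACY_CSV = """id,name,start
e1,Sample Concert,2016-05-18
e2,Sample Market,2016-06-01
"""


def literal(value: str) -> str:
    """Serialize a plain string literal with N-Triples escaping."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def row_to_triples(row: dict) -> list:
    """Map one CSV row to N-Triples lines using schema.org terms."""
    subject = f"<{BASE}{quote(row['id'], safe='')}>"
    return [
        f"{subject} <{RDF_TYPE}> <{SCHEMA}Event> .",
        f"{subject} <{SCHEMA}name> {literal(row['name'])} .",
        f"{subject} <{SCHEMA}startDate> {literal(row['start'])} .",
    ]


def csv_to_ntriples(text: str) -> list:
    """Apply the row mapping to every record of a CSV document."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        triples.extend(row_to_triples(row))
    return triples


ntriples = csv_to_ntriples(LEGACY_CSV)
print("\n".join(ntriples))
```

Keeping the per-row mapping in one small function mirrors the idea of documenting the mapping as its own step: the mapping can be reviewed and changed without touching how the data is read or where it is published.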


  • A template for documentation of essential activities


    Shape of template heavily discussed, e.g. "too general?", "too unspecific?"

  • Archival Data from the Federal Archives of 4 Swiss cantons

    SPARQL endpoint

    Validation of the framework ("Swiss Archive" Use Case)


    (1D=1 effort-day)
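
As a rough illustration of how a client could query data exposed through a SPARQL endpoint, the following sketch builds a SPARQL 1.1 Protocol GET request. The endpoint URL is hypothetical (the slides do not give the real address), and the query is a generic one that makes no assumption about the archive's vocabulary.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL (assumption); the real address is not in the slides.
ENDPOINT = "http://example.org/sparql"

# Generic query: list up to ten labelled resources.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s ?label WHERE { ?s rdfs:label ?label } LIMIT 10
"""


def build_sparql_get(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL Protocol query via HTTP GET."""
    url = f"{endpoint}?{urlencode({'query': query.strip()})}"
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


sparql_request = build_sparql_get(ENDPOINT, QUERY)
print(sparql_request.full_url)
```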

  • FU Berlin library content

    GND and DBpedia loaded and pre-processed

    Validation of the framework ("Library Keyword Clustering")


  • Events in tourist regions

    Interlinked with POIs, historical characters etc.

    Hackathon outcome: LD-based web application

    Validation of the framework ("Event Explorer")


  • Several adaptations of the 7-step model across project phases

    Number of phases

    Type and granularity of documented information (detail of description)

    Formalism of notation: not too formal, semi-structured

    Detail of description: too much detail would not be of value for the "average" user

    Columns "Activities", "Skills", "Effort" helped a lot in planning future Linked Data Use Cases

    Experiences with Methodology & Template
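
One possible way to make the "Activities", "Skills" and "Effort" columns usable for planning is to keep the template rows as structured records and aggregate them. The sketch below is an assumption about how that could look, not the project's actual template format; all rows and effort figures are invented placeholders, with effort counted in effort-days (1D = 1 effort-day).

```python
from dataclasses import dataclass


@dataclass
class TemplateRow:
    step: int            # methodology step number (1 to 7)
    activity: str        # "Activities" column
    skills: list         # "Skills" column
    effort_days: float   # "Effort" column, in effort-days


# Placeholder rows: activities and numbers are invented for illustration only.
rows = [
    TemplateRow(4, "Identify data sources", ["domain knowledge"], 2.0),
    TemplateRow(6, "Specify mapping to RDF", ["RDF", "XSLT"], 3.5),
    TemplateRow(7, "Define transformation steps", ["platform configuration"], 1.5),
]


def total_effort(template_rows: list) -> float:
    """Sum the documented effort over all rows."""
    return sum(row.effort_days for row in template_rows)


def required_skills(template_rows: list) -> list:
    """Collect the distinct skills named in any row, sorted for stable output."""
    return sorted({skill for row in template_rows for skill in row.skills})


print("Total effort (days):", total_effort(rows))
print("Skills needed:", ", ".join(required_skills(rows)))
```

Aggregates like these address the two planning questions raised earlier in the deck (how long a use case takes and which technical skills it needs), and they stay comparable across projects as long as every project fills in the same columns.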


  • Evaluation and Validation of Publishing Framework LIDAPUME and Template

    In our ongoing and future projects

    (hopefully) by other projects

    Possible evolution of methodology based on evaluation and validation feedback

    Comparability of approaches based on documentation with the same template

    Outlook / Further Research
