grlc Makes GitHub Taste Like Linked Data APIs

Albert Meroño-Peñuela, Rinke Hoekstra. Services and Applications over Linked APIs and Data (SALAD), ESWC, 29-05-2016


Page 1: grlc Makes GitHub Taste Like Linked Data APIs

It starts with an idea

GRLC MAKES GITHUB TASTE LIKE LINKED DATA APIS

Chefs: Albert Meroño-Peñuela, Rinke Hoekstra

Services and Applications over Linked APIs and Data (SALAD), ESWC, 29-05-2016

Page 2: grlc Makes GitHub Taste Like Linked Data APIs

Vrije Universiteit Amsterdam

INSTITUTIONAL SLIDE

VU University Amsterdam – Computer Science (Knowledge Representation & Reasoning group)

International Institute of Social History (IISG), Amsterdam

CLARIAH – National Infrastructure for Digital Humanities
  > DataLegend: Structured Data Hub

Previously incubated by CEDAR – Dutch historical censuses as 5-star LOD

Page 3: grlc Makes GitHub Taste Like Linked Data APIs

DISCLAIMER

Frustration-driven research

Page 4: grlc Makes GitHub Taste Like Linked Data APIs

1. LD-CONSUMING APPLICATIONS

Page 5: grlc Makes GitHub Taste Like Linked Data APIs

THE CEDAR STORY

Publishing Dutch historical censuses as 5-star LD
  > Intensive use of RDF Data Cube
  > Harmonization rules
  > Provenance

First historical census data as Linked Data (1795-1971)

8 million observations (sex, marital status, occupation position, housing type, residence status)

External links
  > Geographical: 2.7M
  > Occupations: 350K
  > Belief: 250K

High value for social historians

Page 6: grlc Makes GitHub Taste Like Linked Data APIs

CENSUS DATA QUERYING INTERFACES

Historians can’t really write SPARQL

Variety of access interfaces needed

Page 7: grlc Makes GitHub Taste Like Linked Data APIs

SCALING VARIETY

CLARIAH-WP4: Structured data hub for social historians

IPUMS, NAPP, CEDAR, etc.
  > Macro-, micro-, meso-data
  > Civil registries, occupation, religion, country-level economic indicators
  > National (Netherlands) and international

Mostly CSV tables turned into RDF Data Cube and CSVW

More than 1B triples already

Higher variety of humanities scholars (higher variety of data access requirements)

Page 8: grlc Makes GitHub Taste Like Linked Data APIs


Page 9: grlc Makes GitHub Taste Like Linked Data APIs

FRUSTRATION 1

This is SPARQL mess!!!1one

Page 10: grlc Makes GitHub Taste Like Linked Data APIs


Page 11: grlc Makes GitHub Taste Like Linked Data APIs

GITHUB AS A HUB OF SPARQL QUERIES

One .rq file per SPARQL query

Good support for query curation processes
  > Versioning
  > Branching
  > Clone-pull-push

Web-friendly features!
  > One URI per query
  > Uniquely identifiable
  > De-referenceable (raw.githubusercontent.com)
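
Because every query file in a public GitHub repository has its own raw URL, a client can dereference it with a plain HTTP GET. The following is a minimal sketch, assuming the queries live on a branch named `master`. The function names and any owner, repository, or file names passed to them are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen

RAW_HOST = "https://raw.githubusercontent.com"


def raw_query_url(owner: str, repo: str, path: str, branch: str = "master") -> str:
    """Build the raw URL of a query file (one URI per query)."""
    parts = [owner, repo, branch] + path.split("/")
    # Percent-encode each path segment separately so "/" separators survive.
    return RAW_HOST + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_query(owner: str, repo: str, path: str, branch: str = "master") -> str:
    """Dereference the URL and return the SPARQL text (needs network access)."""
    with urlopen(raw_query_url(owner, repo, path, branch)) as response:
        return response.read().decode("utf-8")
```

Keeping the URL builder separate from the network call makes it easy to check which URI a given query file will get before anything is fetched.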

Page 12: grlc Makes GitHub Taste Like Linked Data APIs

LESSON 1

Query centralization helps maintain distributed applications

Page 13: grlc Makes GitHub Taste Like Linked Data APIs

2. THE NEED FOR APIS

Page 14: grlc Makes GitHub Taste Like Linked Data APIs

MEANWHILE IN THE SEMANTIC WEB…

Linked Data APIs emerge

RESTful entry point to Linked Data hubs for Web applications

OpenPHACTS

…but the Linked Data API (e.g. Swagger spec, code itself) still needs to be coded and maintained

Page 15: grlc Makes GitHub Taste Like Linked Data APIs

BASIL

Love story – thanks KMi!

Automatically builds Swagger specs and API code

Takes SPARQL queries as input (1 API operation = 1 SPARQL query)
  > API call functionality limited to SPARQL expressivity

Makes SPARQL queries uniquely referenceable by using their equivalent LDA operation
  > Stores SPARQL internally
  > But we already have uniquely referenceable SPARQL…

Page 16: grlc Makes GitHub Taste Like Linked Data APIs

FRUSTRATION 2

Copy-pasting 200 queries!!!

& Organization problem

Page 17: grlc Makes GitHub Taste Like Linked Data APIs

Cousin of BASIL in a SALAD

Same basic principle: 1 SPARQL query = 1 API operation

Automatically builds Swagger spec and UI from SPARQL

But: External query management

Organization of SPARQL queries in the GitHub repo matches organization of the API

Thin layer – nothing stored server-side

Maps
  > GitHub API
  > Swagger spec
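
The "1 SPARQL query = 1 API operation" principle can be illustrated by turning a repository file listing into a skeleton Swagger 2.0 document. This is a sketch under stated assumptions, not grlc's actual code. It assumes the listing has the shape returned by GitHub's repository contents API (entries with `name` and `type` fields), and the function name `build_swagger_spec` and the summary wording are hypothetical.

```python
from typing import Dict, List


def build_swagger_spec(owner: str, repo: str, files: List[Dict[str, str]]) -> dict:
    """Map every .rq file in a repository listing to one GET operation."""
    paths = {}
    for entry in files:
        name = entry["name"]
        if entry.get("type") != "file" or not name.endswith(".rq"):
            continue  # skip directories and non-query files
        operation = name[: -len(".rq")]  # file name without extension
        paths["/" + operation] = {
            "get": {
                "summary": "Runs the SPARQL query stored in " + name,
                "responses": {"200": {"description": "Query results"}},
            }
        }
    return {
        "swagger": "2.0",
        "info": {"title": owner + "/" + repo, "version": "1.0"},
        "basePath": "/" + owner + "/" + repo,
        "paths": paths,
    }
```

Nothing is stored by this function: the specification is recomputed from the repository listing each time, which mirrors the "thin layer" idea on the slide.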

Page 18: grlc Makes GitHub Taste Like Linked Data APIs


MAPPING GITHUB AND SWAGGER

Page 19: grlc Makes GitHub Taste Like Linked Data APIs


SPARQL DECORATOR SYNTAX
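
As an illustration of what a decorated query might look like, the sketch below assumes that decorators are SPARQL comment lines starting with `#+` that carry `key: value` pairs, which is how grlc's documentation describes them. The specific keys, the endpoint URL, and the query are made up for the example, and this toy parser handles only single-line values.

```python
from typing import Dict, Tuple

# Hypothetical decorated query; keys and endpoint are illustrative only.
EXAMPLE_QUERY = """\
#+ summary: Lists ten arbitrary triples
#+ endpoint: http://example.org/sparql
#+ method: GET

SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10
"""


def split_decorators(text: str) -> Tuple[Dict[str, str], str]:
    """Separate '#+ key: value' comment lines from the SPARQL body."""
    decorators: Dict[str, str] = {}
    body = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#+"):
            # Split on the first colon only, so URLs in values stay intact.
            key, sep, value = stripped[2:].partition(":")
            if sep:
                decorators[key.strip()] = value.strip()
        else:
            body.append(line)
    return decorators, "\n".join(body).strip()
```

Because the decorators are ordinary comments, the file remains a valid SPARQL query that any endpoint can execute unchanged.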

Page 20: grlc Makes GitHub Taste Like Linked Data APIs

THE GRLC SERVICE

Assuming your repo is at https://github.com/:owner/:repo and your grlc instance at :host,
  > http://:host/:owner/:repo/spec returns the JSON Swagger spec
  > http://:host/:owner/:repo/api-docs returns the Swagger UI
  > http://:host/:owner/:repo/:operation?p_1=v_1...p_n=v_n calls operation with specified parameter values
  > Uses BASIL’s SPARQL variable name convention for query parameters

Sends requests to
  > https://api.github.com/repos/:owner/:repo to look for SPARQL queries and their decorators
  > https://raw.githubusercontent.com/:owner/:repo/master/file.rq to dereference queries, get the SPARQL, and parse it
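
The URL patterns above can be sketched as small helper functions. The helper names are hypothetical. The `parameter_names` function additionally assumes, following BASIL's documented convention, that SPARQL variables beginning with `?_` (or `?__` for optional ones) mark API parameters; it returns the raw names, including any type suffix, and does not attempt to interpret them.

```python
import re
from typing import List
from urllib.parse import urlencode


def spec_url(host: str, owner: str, repo: str) -> str:
    """URL that returns the JSON Swagger spec."""
    return f"http://{host}/{owner}/{repo}/spec"


def api_docs_url(host: str, owner: str, repo: str) -> str:
    """URL that returns the Swagger UI."""
    return f"http://{host}/{owner}/{repo}/api-docs"


def operation_url(host: str, owner: str, repo: str, operation: str, **params: str) -> str:
    """URL that calls one operation with the given parameter values."""
    base = f"http://{host}/{owner}/{repo}/{operation}"
    return base + ("?" + urlencode(params) if params else "")


def parameter_names(sparql: str) -> List[str]:
    """Assumed convention: variables written ?_name or ?__name are parameters."""
    return sorted(set(re.findall(r"\?_{1,2}(\w+)", sparql)))
```

A client only needs the host, the repository coordinates, and the operation name; the SPARQL text itself never appears in the client code.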

Page 21: grlc Makes GitHub Taste Like Linked Data APIs


SPICED-UP SWAGGER UI

Page 22: grlc Makes GitHub Taste Like Linked Data APIs

EVALUATION – USE CASES

CEDAR: Access to census data for historians
  > Hides SPARQL
  > Allows them to fill query parameters through forms
  > Co-existence of SPARQL and non-SPARQL clients

CLARIAH - Born Under a Bad Sign: Do prenatal and early-life conditions have an impact on socioeconomic and health outcomes later in life? (uses 1891 Canada and Sweden Linked Census Data)
  > Reduction of coupling between SPARQL libs and R
  > Shorter R code – input stream as CSV
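
The second use case consumes query results as a CSV stream from R. The same idea is sketched here in Python for illustration: the client needs only an HTTP call and a CSV parser, with no SPARQL library. This assumes the grlc instance returns CSV when the request asks for `text/csv`; the function names are hypothetical.

```python
import csv
import io
from typing import Dict, List
from urllib.request import Request, urlopen


def parse_csv_rows(text: str) -> List[Dict[str, str]]:
    """Turn CSV text into a list of rows keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def call_operation_as_csv(url: str) -> List[Dict[str, str]]:
    """Call an API operation and read its results as CSV (needs network access)."""
    request = Request(url, headers={"Accept": "text/csv"})  # assumed content negotiation
    with urlopen(request) as response:
        return parse_csv_rows(response.read().decode("utf-8"))
```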

Page 23: grlc Makes GitHub Taste Like Linked Data APIs

CONCLUSIONS

The spectrum of Linked Data clients: SPARQL-intensive applications vs RESTful API applications

grlc treats decoupling SPARQL from all client applications (including LDAs) as a powerful practice

Separates query curation workflows from everything else

Allows at the same time
  > Web-friendly SPARQL queries
  > Web-friendly RESTful APIs

Helps you to easily organise your LDA – just organise your SPARQL repository and you’re set

Try it out!
  > http://grlc.clariah-sdh.eculture.labs.vu.nl
  > https://github.com/CLARIAH/grlc

Page 24: grlc Makes GitHub Taste Like Linked Data APIs

THANK YOU!

@ALBERTMERONYO

DATALEGEND.NET

CLARIAH.NL