Publishing research information as Linked Data: Proposal of Recommendations

EuroCRIS meeting, February 2012

Miguel-Ángel Sicilia

ROADMAP
- Introduction & Motivation
- Stakeholders
- Example Architecture
- Basic Principles of the LD Exposure
- CERIF Ontology
- Recipes for the CERIF LD Exposure
- CERIF Model Extension
- Key Use Case
- Demo
- Bootstrapping
- Issues and challenges
- Conclusions

Introduction & motivation

A point of departure?

"CERIF and Linked Data are similar, complementary approaches. However, there are significant differences in the way they encode relationships. EXRI-UK reviewed these approaches against higher education needs and recommended that CERIF should be the basis for the exchange of research information in the UK. CERIF is currently better able to encode the rich information required to communicate research information, and has the organisational backing of EuroCRIS, ensuring it is well-managed and sustainable."

Source: EXRI-UK final report

An example of using linked data in RIS

[Figure: XML data interchange. Two RIS databases (CERIF) exchange XML: one side generates and sends the data, the other receives and parses it.]

Limitations (from a linked data perspective)

[Figure: an aggregator (harvester or query client) calls the Web APIs of sources A, B, C and D. Adapted from: Christian Bizer, The Web of Linked Data (26/07/2009).]

Shortcomings
- APIs provide proprietary interfaces (even though CERIF XML standardizes the interchange format).
- Aggregators are based on a fixed set of data sources (not necessarily, but they require some registry of providers).
- You cannot set hyperlinks, either between the descriptions of RIS entities (projects, people, organizations, publications) or from them to other data or terminologies.

The linked data approach

[Figure: datasets A, B, C, D and DBpedia publish RDF and are connected by RDF links; rel links relate entities and cls links connect them to a terminology server. Adapted from: Christian Bizer, The Web of Linked Data (26/07/2009).]

- Use RDF to provide CERIF metadata based on the XML mapping.
- Add links using different kinds of relations, rel (a mapping of CERIF link entities?).
- Connect to terminologies using some classification, cls (an extension of keywords in CERIF?).
- Link to other LOD datasets instead of repeating information.

Browsing & integrating

[Figure: research information (RI) nodes in datasets A, B, C, D and E and term nodes, connected by typed links. Adapted from: Christian Bizer, The Web of Linked Data (26/07/2009).]

- Data integrator: combines information for a given cfPers, cfProj or cfOrgUnit.
- Browser
- Data integrator: combines information of several cfPers, cfProj or cfOrgUnit, e.g. for analyzing country or call outcomes.

Relevant recommendations

CERIF components

Stakeholders

A proposal
- Higher Education Institutions (HEI) or R&D institutions
- Funding bodies (FB)
- Research Authorities (RA)
- Researchers
- Research Information Enterprises (RIE)
- General public
- Enterprises

What are their critical use cases and their killer apps?

Example Architecture

Strategies for publishing linked data

ALTERNATIVES FOR THE EXPOSURE OF LINKED DATA
- Providing an endpoint for enquiries
- Serving static RDF files
- Serving RDF embedded in HTML files
- Serving LD from RDF triple stores
- Serving LD by wrapping Web APIs
- Serving LD from relational databases

FACTORS AFFECTING THE DECISION
- How much data do you want to serve?
- How is your data currently stored?
- How often does your data change?

RIS architecture

[Figure: a web browser at http://cris.myorganization.org shows the papers of a project as HTML pages, served by the RIS Application Server from the RIS Database (CERIF).]

RIS-LD architecture

[Figure: the same browser view, with a D2R Server added next to the RIS Application Server on top of the RIS Database (CERIF). The D2R Server returns an RDF description of the project:]

    <...>
        a cerif:Project ;
        rdfs:label "Multilingual Federation of Learning Repositories"@en-uk ;
        cerif:acronym "Organic.Edunet" ;
        cerif:endDate "2010-09-30"^^xsd:date ;
        cerif:internalIdentifier "ff808181300cf99e01300d1a355f0003" ;
        cerif:isLinkedByOrganisationUnit <...>

URI scheme published by D2R
- Identifier for a given resource
- Description of a given resource in RDF (N3)
- Description of a given resource in HTML
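The three kinds of URI listed above can be illustrated with a small sketch. It assumes the /resource/, /data/ and /page/ path prefixes that D2R Server uses by default and a hypothetical host, cris.example.org; the media-type check is deliberately simplified and is not D2R's actual negotiation logic.

```python
# Sketch of a D2R-style URI layout: one identifier, two representations.
BASE = "http://cris.example.org"  # hypothetical host, not taken from the slides

RDF_MEDIA_TYPES = ("application/rdf+xml", "text/turtle", "text/n3", "text/rdf+n3")


def resource_uri(path: str) -> str:
    """Identifier of the resource itself (the thing being described)."""
    return f"{BASE}/resource/{path}"


def representation_url(path: str, accept: str) -> str:
    """Document URL a client would be redirected to, given its Accept header.

    Simplification: q-values and wildcards are ignored; any RDF media type
    in the header selects the RDF description, otherwise the HTML page.
    """
    wants_rdf = any(media in accept for media in RDF_MEDIA_TYPES)
    prefix = "data" if wants_rdf else "page"
    return f"{BASE}/{prefix}/{path}"


if __name__ == "__main__":
    path = "project/Organic.Edunet"  # illustrative path
    print(resource_uri(path))
    print(representation_url(path, "text/turtle"))
    print(representation_url(path, "text/html"))
```

A Linked Data client and a web browser can therefore share the same identifier and still receive the description that suits each of them.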

Opening our CERIF datasets

[Figure: a third-party web application at http://mashup.org consumes the RDF description of the Organic.Edunet project (the same cerif:Project description shown for the RIS-LD architecture), published by RIS-LD.]

Benefits of our architecture
- Exposure of Linked Data without altering the current research information system (non-intrusive)
- Linked Data interface: RDF descriptions of individual resources stored in the DB, served over HTTP
- SPARQL endpoint (the SQL of Linked Data)
- Traditional HTML interface: web pages describing resources
- Simple way of interchanging data on the Web
- New third-party applications can be created using open linked data from RIS systems
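As a rough illustration of the SPARQL endpoint mentioned above, the following sketch builds a query for projects and the GET URL described by the SPARQL protocol (the query travels in the query parameter). The endpoint address and the cerif: namespace URI are placeholders, since the slides do not give them; rdfs:label and cerif:acronym are the properties used in the project description shown earlier.

```python
from urllib.parse import urlencode

ENDPOINT = "http://cris.example.org/sparql"   # hypothetical endpoint address
CERIF_NS = "http://example.org/cerif#"        # placeholder namespace URI


def projects_query(limit: int = 10) -> str:
    """SPARQL SELECT listing projects with their label and acronym."""
    return (
        f"PREFIX cerif: <{CERIF_NS}>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?project ?label ?acronym WHERE {\n"
        "  ?project a cerif:Project ;\n"
        "           rdfs:label ?label ;\n"
        "           cerif:acronym ?acronym .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def request_url(query: str) -> str:
    """URL for a SPARQL protocol GET request carrying the query."""
    return f"{ENDPOINT}?{urlencode({'query': query})}"


if __name__ == "__main__":
    print(projects_query(5))
    print(request_url(projects_query(5)))
```

Any HTTP client can then fetch the URL; no network call is made in the sketch itself.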

Basic Principles of the LD Exposure

General Principles of the LOD approach
- Use URIs as names for things.
- Use HTTP URIs so that people can look up those names.
- When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
- Include links to other URIs, so that they can discover more things.

Reuse of well-known terms
- We need an ontology for the CERIF model elements.
- "Do not reinvent the wheel."
- Data can be consumed by applications that may be tuned to well-known vocabularies.
- Foster interoperability between different datasets.

Self-described and consistent terms
- Logical entities are translated into RDF classes and their attributes into RDF properties.
- CF prefixes are not necessary for ontology terms; URI namespaces are used instead.
- Properties and classes are self-described:
  - rdfs:label (title case capitalized version of the property/class)
  - rdfs:comment (a plain text description of the property/class)

URI Design
- Essential to enable interoperability and understanding
- Create human-readable and memorable URIs
- Avoid using artificial primary keys
- Discover URIs using similarity heuristics
- Follow a similar schema/pattern for URIs

Example: an identifier for the EU project Virtual Open Access ... hosted at University of Athens
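A minimal sketch of one way to mint human-readable URIs from names or acronyms rather than from database keys. The base address, the path layout and the slug rules (ASCII folding, lower-casing, hyphens) are assumptions made for illustration, not a scheme prescribed by the slides.

```python
import re
import unicodedata

BASE = "http://cris.example.org/resource"  # hypothetical base address


def slug(text: str) -> str:
    """Fold to ASCII, lower-case, and replace other characters with hyphens."""
    folded = unicodedata.normalize("NFKD", text)
    ascii_text = folded.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^A-Za-z0-9]+", "-", ascii_text).strip("-").lower()


def mint_uri(entity: str, name: str) -> str:
    """Build <base>/<entity type>/<readable name> for a resource."""
    return f"{BASE}/{slug(entity)}/{slug(name)}"


if __name__ == "__main__":
    print(mint_uri("project", "VOA3R"))
    print(mint_uri("project", "Organic.Edunet"))
    print(mint_uri("organisation", "Universidad de Alcalá"))
```

Because every institution would apply the same pattern, a client that knows a project acronym can guess the URI in another dataset, which is the kind of similarity heuristic the slide refers to.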

Where do RI datasets live?
- Higher education or R&D institutions maintain repositories centred on cfPers and cfOrgUnit (internal), sometimes cfProj, with an emphasis on cfResPubl and cfResPat.
- Funding bodies are centred on cfProj, cfOrgUnit (mostly legal bodies, not internal units), cfFundProg and related entities.
- Bibliographic and citation databases focus on cfResPubl and cfResPat, and in general provide poor support for cfPers and cfOrgUnit.

Distributed datasets
- Research information is distributed.
- Frequently, there is duplicated information in different RIS systems:
  - ID for VOA3R Project in the University of Athens dataset*
  - ID for VOA3R Project in the University of Alcalá dataset*
- Problem: the same concept can be identified by different URIs in Linked Data.
- Using the owl:sameAs predicate

* Assuming that there is a corporate RIS available at http://cris.....

CERIF Ontology
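For the duplicated VOA3R identifiers discussed under Distributed datasets, the sketch below shows how a consumer might treat owl:sameAs links: URIs connected by owl:sameAs are grouped with a union-find structure, and the statements about all aliases are merged. Both project URIs and the sample statements are hypothetical; only the owl:sameAs predicate URI is standard.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_clusters(triples):
    """Group URIs that are connected, directly or transitively, by owl:sameAs."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for subject, predicate, obj in triples:
        if predicate == OWL_SAME_AS:
            parent[find(subject)] = find(obj)

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())


def merged_description(triples, uri):
    """All (predicate, object) pairs stated about uri or any of its aliases."""
    aliases = next((c for c in same_as_clusters(triples) if uri in c), {uri})
    return {(p, o) for s, p, o in triples if s in aliases and p != OWL_SAME_AS}


if __name__ == "__main__":
    athens = "http://cris.athens.example/resource/project/voa3r"  # hypothetical
    alcala = "http://cris.alcala.example/resource/project/voa3r"  # hypothetical
    data = [
        (athens, "rdfs:label", "Virtual Open Access"),
        (alcala, "cerif:acronym", "VOA3R"),
        (athens, OWL_SAME_AS, alcala),
    ]
    print(merged_description(data, alcala))
```

Neither institution has to change its own URIs; the link alone lets an integrator combine what each dataset says about the project.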

[Figure: the CERIF Ontology alongside the CERIF Semantic Vocabulary and other vocabularies]

EuroCRIS website for publishing ontologies

Current version at ...

CERIF ontology on the Web

Visual Representation of the ontologies

Current version at terms

Recipes for the CERIF LD Exposure: MULTIPLE LANGUAGE FEATURES

CERIF Multiple Language Features
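The multilingual recipe in this section pairs CERIF multilingual attributes such as cfName and cfTitle with one or more RDF predicates. A sketch of how a single translated value could become language-tagged triples is given here; the subject URI, the function names and the "en" language code are illustrative assumptions, and the output uses prefixed names without declaring the prefixes.

```python
# Predicates per multilingual attribute, following the mapping in this section.
PREDICATES = {
    "cfName": ["rdfs:label", "foaf:name", "cerif:name"],
    "cfTitle": ["rdfs:label", "dc:title", "cerif:title"],
    "cfDescr": ["dc:description"],
    "cfKeyw": ["cerif:keyword", "dc:subject"],
    "cfAbstract": ["dcterms:abstract", "cerif:abstract"],
}


def literal(text: str, lang: str) -> str:
    """Turtle literal with a language tag; backslashes and quotes are escaped."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def triples_for(subject: str, attribute: str, text: str, lang: str) -> list:
    """One Turtle statement per predicate mapped to the given attribute."""
    return [
        f"<{subject}> {predicate} {literal(text, lang)} ."
        for predicate in PREDICATES[attribute]
    ]


if __name__ == "__main__":
    subject = "http://cris.example.org/resource/project/organic-edunet"  # hypothetical
    title = "Multilingual Federation of Learning Repositories"
    for line in triples_for(subject, "cfTitle", title, "en"):
        print(line)
```

Emitting the same value under several predicates lets applications tuned to Dublin Core, FOAF or plain RDFS all find the title.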

Predicate                              Object
-------------------------------------  --------------------------------
rdfs:label, foaf:name, cerif:name      *.cfName
rdfs:label, dc:title, cerif:title      *.cfTitle
dc:description                         *.cfDescr
cerif:keyword, dc:subject              *.cfKeyw
dcterms:abstract, cerif:abstract       *.cfAbstract
cerif:researchActivities               cfOrgUnit.cfResAct
cerif:researchInterests                cfPers.cfResInt
dcterms:alternative                    cfResPublSubtitle.cfSubtitle
foaf:name                              cfResPublNameAbbrev.cfNameAbbrev
bibo:annotates                         cfResPublBiblNote.cfBiblNote

Recipes for the CERIF LD Exposure: SEMANTIC FEATURES

CERIF Semantics document

From a PDF document with the CERIF semantics

CERIF Semantic vocabulary

Current version at ...

An RDF vocabulary with the roles and classification terms

CERIF using external vocabularies

The predicates cerif:classification and cerif:role make it possible to use external vocabularies to enrich our data.

[Figure: the CERIF Ontology and the CERIF Semantic Vocabulary are connected to other vocabularies (1 ... N) through cerif:classification and cerif:role.]

Recipes for the CERIF LD Exposure: ADDITIONAL FEATURES

CERIF additional features
- The current CERIF model contains Dublin Core and Formalised Dublin Core entities and attributes.
- We will use external vocabularies through the cerif:role and cerif:classification properties, avoiding the need to store and publish entities related to any terminology.

Recipes for the CERIF LD Exposure: BASE ENTITIES

CERIF base entity PROJECT

Project Acronym (cfProj.cfAcro) will be part of the resource identifier (ID).

Predicate                  Object
-------------------------  ------------------
rdf:type                   cerif:Project
cerif:internalIdentifier   cfProj.cfProjId
cerif:startDate            cfProj.cfStartDate
cerif:endDate              cfProj.cfEndDa