Linked Data for European Cultural Heritage: the Europeana approach
Valentine Charles | ISSN2016
Anonymous, Arrival of a Portuguese ship, 1660 - 1625, Rijksmuseum, Netherlands, Public Domain
Europeana?
Enriching Cultural Heritage Data with DBpedia
CC BY-SA
Europeana Collections homepage, Europeana | CC BY-SA
Europeana Essentials | CC BY-SA
Europeana aggregation infrastructure, Europeana | CC BY-SA
Europeana and its many data challenges
We aggregate very heterogeneous metadata
• More than 50M objects
• 3,500 galleries, libraries, archives and museums
• 50 languages
• From all EU countries
• Level of quality varies greatly
Europeana and the challenge of identifiers
A manifold issue:
• Europeana needs to manage identifiers within its data aggregation workflow to
  • Make sure identifiers are persistent
  • Limit the number of duplicates
• Europeana collects a huge number of references to places, agents, concepts, time
• But most of them are represented as simple text strings
Linked Open Data
Europeana Linked Open Data video on Vimeo, Europeana | CC BY-SA
Building a framework for semantic cultural heritage data
A data model to represent richer (linked) data
Europeana Data Model (EDM)
• Re-uses several existing Semantic Web-based models: Dublin Core, OAI-ORE, SKOS, CIDOC-CRM…
• More granular metadata
• Links, e.g. between objects and context entities (persons, places)
Create a “semantic layer” on top of cultural heritage objects
• Include multilingual “value vocabularies”
• From Europeana’s providers or from third-party data sources
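The idea of a semantic layer can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Europeana's implementation: the class names, the `display_values` helper, the object identifier and the "Unknown copyist" string are hypothetical. The Mozart URI and its labels are taken from the DBpedia example shown later in this deck.

```python
from dataclasses import dataclass, field


@dataclass
class ContextualEntity:
    """A person, place, concept or time span described in a value vocabulary."""
    uri: str
    pref_labels: dict = field(default_factory=dict)  # language code -> label


@dataclass
class HeritageObject:
    """An object record whose field values are plain strings or entity URIs."""
    identifier: str
    fields: dict = field(default_factory=dict)  # field name -> list of values


def display_values(obj, field_name, entities, language):
    """Return field values, replacing known entity URIs by a label in `language`."""
    resolved = []
    for value in obj.fields.get(field_name, []):
        entity = entities.get(value)
        if entity is None:
            resolved.append(value)  # plain text string, nothing to resolve
        else:
            # Fall back to the URI when no label exists in the requested language.
            resolved.append(entity.pref_labels.get(language, value))
    return resolved


mozart = ContextualEntity(
    uri="http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart",
    pref_labels={"sr": "Волфганг Амадеус Моцарт", "hi": "वोल्फ़गांक आमडेयुस मोत्सार्ट"},
)
entities = {mozart.uri: mozart}

# Hypothetical object: one creator given as a link, one as an uncontrolled string.
record = HeritageObject(
    identifier="example-object-1",
    fields={"dc:creator": [mozart.uri, "Unknown copyist"]},
)

print(display_values(record, "dc:creator", entities, "sr"))
```

The linked value can be shown in any language the vocabulary covers, while the plain string stays as it was entered.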
STRATEGIES FOR GETTING LINKED DATA
Agence de presse Meurisse, Concours de cycles nautiques sur le lac d’Enghien : Berregent piloté par Austerling, 1914, National Library of France, France, Public Domain
Encourage data providers to contribute their own vocabularies
• Benefit from data links made at data providers’ level
• Ingestion of vocabularies is possible if the vocabularies use the data structures EDM expects
  • For instance SKOS for Concepts
  • For other vocabularies, Europeana does custom mappings
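A custom mapping of this kind could look roughly like the sketch below. It is a hedged illustration only: the provider field names (`term`, `synonyms`, `parent`), the `example.org` URIs and the `to_skos_concept` helper are hypothetical. Only the SKOS property names (`skos:prefLabel`, `skos:altLabel`, `skos:broader`) come from the SKOS vocabulary itself.

```python
# Hypothetical mapping from a provider's own field names to SKOS properties.
CUSTOM_MAPPING = {
    "term": "skos:prefLabel",
    "synonyms": "skos:altLabel",
    "parent": "skos:broader",
}


def to_skos_concept(record, mapping=CUSTOM_MAPPING):
    """Convert one provider vocabulary record into a SKOS-style concept dict."""
    concept = {"@id": record["id"], "@type": "skos:Concept"}
    for source_field, skos_property in mapping.items():
        value = record.get(source_field)
        if value in (None, "", []):
            continue  # skip fields the provider left empty
        # Store every property as a list so single and repeated values look alike.
        concept[skos_property] = value if isinstance(value, list) else [value]
    return concept


provider_record = {
    "id": "http://example.org/vocab/42",  # hypothetical URI
    "term": "Engraving",
    "synonyms": ["Engravings"],
    "parent": "http://example.org/vocab/7",
    "internal_note": "not mapped",  # fields without a mapping are dropped
}

print(to_skos_concept(provider_record))
```

Keeping the mapping in a plain table makes it possible to support a new provider vocabulary by editing data rather than code.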
Semantic enrichment, a solution for better quality data?
Automatic and manual enrichment are increasingly used in digital libraries to:
• normalise data
• “standardize data” by linking it to authority resources
• improve multilingual coverage in datasets
• contextualise resources
• contribute to building a web of data ('knowledge graph') that third parties can use to improve their users' experience
The main components of semantic enrichment
• Source: objects whose metadata is being enriched
• Target: set of resources used to enrich the source metadata; targets can be of different types, from simple uncontrolled strings to resources published as LOD
• Rules: specify how the enrichment between the source and target should be executed
Automatic enrichment process in Europeana
• Analysis
  • selection of metadata fields in descriptions
  • selection of potential rules to match
• Linking
  • matching the values of the metadata fields to values of the contextual resources
  • adding contextual links
• Augmentation of search index
  • selection of values from the contextual resource
  • values go into the search index
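The three phases (analysis, linking, augmentation of the search index) can be sketched as a small pipeline. This is a minimal illustration under loud assumptions, not the Europeana enrichment code: the rule set, the tiny target vocabularies, the function names and the sample record are all hypothetical, and matching is reduced to an exact, case-insensitive string comparison.

```python
# Hypothetical rule set: which metadata fields are matched against which targets.
RULES = {"dc:creator": "agents", "dcterms:spatial": "places"}

MOZART = "http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart"
SALZBURG = "http://dbpedia.org/resource/Salzburg"

# Hypothetical target vocabularies: lower-cased label -> entity URI.
TARGETS = {
    "agents": {"mozart, wolfgang amadeus": MOZART},
    "places": {"salzburg": SALZBURG},
}

# Values copied from the contextual resources into the search index.
ENTITY_LABELS = {
    MOZART: ["Mozart, Wolfgang Amadeus", "Волфганг Амадеус Моцарт"],
    SALZBURG: ["Salzburg"],
}


def analyse(record):
    """Analysis: select the fields that have a rule, paired with their target."""
    return [(f, v, RULES[f]) for f, values in record.items() if f in RULES for v in values]


def link(candidates):
    """Linking: match field values to target labels and keep the resulting URIs."""
    links = []
    for field_name, value, target in candidates:
        uri = TARGETS[target].get(value.strip().lower())
        if uri is not None:
            links.append((field_name, uri))
    return links


def augment(record, links):
    """Augmentation: add labels of linked entities to the terms to be indexed."""
    terms = [v for values in record.values() for v in values]
    for _, uri in links:
        for label in ENTITY_LABELS.get(uri, []):
            if label not in terms:
                terms.append(label)
    return terms


record = {
    "dc:creator": ["Mozart, Wolfgang Amadeus"],
    "dcterms:spatial": ["Salzburg"],
    "dc:title": ["Autograph score"],  # no rule, so never analysed
}
links = link(analyse(record))
print(links)
print(augment(record, links))
```

After augmentation the record can be found through labels it never contained itself, here the Cyrillic form of the creator's name.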
Vocabularies we currently enrich metadata with
Entity class | Target vocabulary | Size    | Metadata fields subject to enrichment
Places       | GeoNames          | 140,097 | dcterms:spatial, dc:coverage
Concepts     | DBpedia           | 5,284   | dc:subject, dc:type
             | GEMET             | 280     |
Agents       | DBpedia           | 161,209 | dc:creator, dc:contributor
Time         | Semium Time       | 2,566   | dc:coverage, dcterms:temporal, dc:date, edm:year
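Read as configuration, the table above tells an enrichment process which vocabularies to try for a given field. The sketch below is one possible encoding, not Europeana's actual configuration: the `ENRICHMENT_TARGETS` structure and the `targets_for_field` helper are hypothetical, and the GEMET row is assumed to share the fields of the Concepts row, which the table does not state explicitly.

```python
# Rows copied from the table; the field list for GEMET is an assumption
# (taken to be the same as for the other Concepts vocabulary).
ENRICHMENT_TARGETS = [
    ("Places", "GeoNames", ["dcterms:spatial", "dc:coverage"]),
    ("Concepts", "DBpedia", ["dc:subject", "dc:type"]),
    ("Concepts", "GEMET", ["dc:subject", "dc:type"]),
    ("Agents", "DBpedia", ["dc:creator", "dc:contributor"]),
    ("Time", "Semium Time", ["dc:coverage", "dcterms:temporal", "dc:date", "edm:year"]),
]


def targets_for_field(field_name):
    """Return the (entity class, vocabulary) pairs a field may be enriched with."""
    return [(cls, vocab) for cls, vocab, fields in ENRICHMENT_TARGETS if field_name in fields]


print(targets_for_field("dc:coverage"))
# [('Places', 'GeoNames'), ('Time', 'Semium Time')]
```

The lookup makes one property of the table visible: dc:coverage is matched against both a place vocabulary and a time vocabulary, so the same string can be a candidate for two kinds of entity.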
Some challenges are harder than others!
Hendrik Goltzius, Le dragon dévorant les compagnons de Cadmus, 1588, Bibliothèque municipale de Lyon, France, Public Domain
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
Poisonous India or the Importance of a Semantic and Multilingual Enrichment Strategy. Marlies Olensky, Juliane Stiller, Evelyn Dröge, MTSR 2012. http://link.springer.com/chapter/10.1007%2F978-3-642-35233-1_25
Challenges of multilingual automatic enrichment
Available solutions: Define recommendations for semantic enrichment, including
• the selection of the target datasets
• the selection of the tool and method
• the monitoring and evaluation of the enrichment and its output
• http://bit.ly/Evaluation-Enrichment
Building an ecosystem of networked references
Available solutions: Encouraging alignments between vocabularies
see http://cultuurlink.beeldengeluid.nl
Available solutions: An “Entity collection” for Europeana
• As a cornerstone for our strategy we are building an "Entity Collection"
• A service that acts as a centralized point of reference and access to data about contextual entities
• Caching and curating data from the wider Linked Open Data cloud
• A sort of Europeana "knowledge graph"
The Entity Collection: Use Cases
Europeana Collections Portal
• Findability: users can look for entities
• Understandability: Entity Pages group and present all assertions about an entity
• Exploration: Navigation along relationships becomes possible
Crowdsourcing
• Objects can be annotated with references to entities
• A controlled vocabulary for client applications
Enrichment of Provider’s Data
• A controlled vocabulary to help identify named references to entities
Republication for Re-use
• Entities can be republished as an open source to the community
The Entity Collection: DBpedia resource for “Mozart” in our data
<http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart> a <http://www.europeana.eu/schemas/edm/Agent> ;
    skos:prefLabel "वोल्फ़गांक आमडेयुस मोत्सार्ट"@hi,
        "Волфганг Амадеус Моцарт"@sr,
        "Волфганг Амадеус Моцарт"@mk,
        "Волфганг Амадеус Моцарт"@bg ;
    skos:altLabel "Mozart, Johann Chrysostom Wolfgang Amadeus"@en,
        "Mozart, Wolfgang Amadeus"@en ;
    dc:identifier "32197206" ;
    rdaGr2:placeOfBirth <http://dbpedia.org/resource/Austria>,
        <http://dbpedia.org/resource/Salzburg> ;
    rdaGr2:placeOfDeath <http://dbpedia.org/resource/Vienna>,
        <http://dbpedia.org/resource/Austria> ;
    edm:end "1791-12-05" ;
    rdaGr2:dateOfDeath "1791-12-05" ;
    rdaGr2:dateOfBirth "1756-01-27" ;
    rdaGr2:biographicalInformation … ;
    owl:sameAs <http://en.dbpedia.org/resource/Wolfgang_Amadeus_Mozart>,
        <http://zitgist.com/music/artist/b972f589-fb0e-474e-b64a-803b0364fa75>,
        <http://wikidata.org/entity/Q254>,
        <http://www4.wiwiss.fu-berlin.de/gutendata/resource/people/Mozart_Wolfgang_Amadeus_1756-1791>,
        <http://sw.cyc.com/concept/Mx4rvlD9e5wpEbGdrcN5Y29ycA>,
        <http://yago-knowledge.org/resource/Wolfgang_Amadeus_Mozart>,
        <http://wikidata.dbpedia.org/resource/Q254>,
        <http://rdf.freebase.com/ns/m.082db> … .
• Preferred labels for 48 languages
• Coreference links to 6 other datasets (e.g. Freebase, Wikidata)
• Inter-linking information
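One thing the coreference links make possible is recognising an entity whichever dataset's identifier a provider happens to use. The sketch below shows the idea with a plain lookup table. It is an illustration only: the function names are hypothetical, just three of the owl:sameAs links from the record above are used, and treating the DBpedia URI as the canonical identifier is an assumption made for the example.

```python
# Assumed canonical identifier for this example.
CANONICAL = "http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart"

# A few of the owl:sameAs links from the Mozart record.
SAME_AS = {
    CANONICAL: [
        "http://wikidata.org/entity/Q254",
        "http://rdf.freebase.com/ns/m.082db",
        "http://yago-knowledge.org/resource/Wolfgang_Amadeus_Mozart",
    ],
}


def build_coreference_index(same_as):
    """Map every known URI (canonical or equivalent) to its canonical URI."""
    index = {}
    for canonical, equivalents in same_as.items():
        index[canonical] = canonical
        for uri in equivalents:
            index[uri] = canonical
    return index


def resolve(uri, index):
    """Return the canonical URI for `uri`, or None if the entity is unknown."""
    return index.get(uri)


index = build_coreference_index(SAME_AS)
print(resolve("http://wikidata.org/entity/Q254", index))
```

A provider record that cites the Wikidata or Freebase identifier then ends up attached to the same entity as one that cites DBpedia directly.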
Next steps
• Generate Europeana URIs for Entities
• Make entity services and data available via an API, and further integrate existing components
• Integrate vocabularies that can further enrich the Collection
• Integrate alignments
  • particularly, links from local/domain vocabularies to pivot vocabularies
28 April 2016
With input slides from Antoine Isaac & Hugo Manguinhas, Europeana R&D team