DM2E Project meeting Bergen: WP2: Contextualization, Dominique Ritze (University of Mannheim)

co-funded by the European Union

Contextualisation

Dominique Ritze

Motivation

18.06.2014

Who is George Grote?

Which resources share the same subjects?

Presenter notes: GND (Gemeinsame Normdatei, the Integrated Authority File published by the German National Library): persons, subject headings, corporate bodies

Example

Work: Der zerbrochene Krug

Presenter notes: GND (Gemeinsame Normdatei, the Integrated Authority File published by the German National Library): persons, subject headings, corporate bodies

Example

Author: Ludwig Wittgenstein

Presenter notes: VIAF (Virtual International Authority File) links the national authority files

Example

Owner: Prinz Eugen von Savoyen

Presenter notes: DBpedia/Wikipedia

Example

Subject: Administration

Example

Place: Berlin

Overview

• Silk as Contextualisation Tool
• System Integration
• Contextualisation Progress and Results
• Challenges
• Applicability and Reusability
• Future Plans

Contextualisation with Silk

• Silk: Link Discovery Framework (UMA)

• Definition of linkage rules to create links between Linked Data resources

• http://context.dm2e.eu

Presenter notes: Structured information

Integration of Silk

• Silk is integrated in OmNom as Web Service

use generated configuration

show links

Access to Contextualisation Results

• Contextualization results (Linksets) are kept separate from ingested data

• Linksets are further described and versioned

• Additional linkset properties (tbd):
  – Automatically created
  – Manually created
  – Recall-oriented (exploratory, but with wrong links)
  – Precision-oriented (incomplete, but high quality)
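
One way to picture a described and versioned linkset is as a small record kept apart from the ingested data. The following Python sketch is only an illustration under that assumption: the class names, field names, and sample linkset names are hypothetical, and the slide itself marks the property list as still to be decided.

```python
from dataclasses import dataclass
from enum import Enum


class Creation(Enum):
    AUTOMATIC = "automatically created"
    MANUAL = "manually created"


class Orientation(Enum):
    RECALL = "recall-oriented"        # exploratory, may contain wrong links
    PRECISION = "precision-oriented"  # incomplete, but high quality


@dataclass(frozen=True)
class Linkset:
    """Hypothetical description of a linkset, stored separately from the data."""
    name: str
    version: int
    creation: Creation
    orientation: Orientation
    links: tuple = ()  # (source URI, target URI) pairs


def select(linksets, orientation):
    """Return only the linksets with the requested orientation."""
    return [ls for ls in linksets if ls.orientation is orientation]


# Made-up sample linksets for illustration.
linksets = [
    Linkset("places-geonames", 1, Creation.AUTOMATIC, Orientation.RECALL),
    Linkset("agents-gnd", 2, Creation.MANUAL, Orientation.PRECISION),
]
precise = select(linksets, Orientation.PRECISION)
```

A consumer that only tolerates high-quality links could then restrict itself to the precision-oriented linksets, while an exploratory interface might use all of them.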

Used Linked Data Resources

[Diagram] Linked Data resources: GeoNames, GND, LCSH, DBpedia, Freebase, DDC, LinkedGeoData

[Diagram] Linked entity types: Places, Subjects, Agents

Example Process

• Manual creation of linkage rules, e.g. compare skos:prefLabel with rdfs:label using Levenshtein distance, link if distance < 2

• Let Silk run to find the links
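
The example rule can be mimicked outside Silk as a rough illustration. The following is a minimal Python sketch, not Silk's own configuration language or API: it compares labels pairwise with a hand-written Levenshtein distance and links two resources if the distance is below 2. The `ex:` identifiers and labels are made-up sample data, and lowercasing the labels before comparison is an added assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the classic two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def find_links(source_labels, target_labels, max_distance=2):
    """Return (source, target) pairs whose labels differ by fewer than max_distance edits."""
    links = []
    for source_uri, source_label in source_labels.items():
        for target_uri, target_label in target_labels.items():
            if levenshtein(source_label.lower(), target_label.lower()) < max_distance:
                links.append((source_uri, target_uri))
    return links


# Hypothetical skos:prefLabel values (source) and rdfs:label values (target).
source = {"ex:place/1": "Berlin", "ex:place/2": "Bergen"}
target = {"ex:geo/a": "Berlin", "ex:geo/b": "Bergn", "ex:geo/c": "Bern"}
links = find_links(source, target)
```

The pairwise loop is quadratic in the number of labels, which is fine for a toy example but would need blocking or indexing on real datasets.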

Results

• Contextualised all datasets that are currently ingested -> no qualitative analysis so far

• Increased the number of existing links by 20% (performance requirement)

• Different amounts of links were found
  – Dingler (UBER): 134 unique links
  – Deutsches Textarchiv (BBAW): 9946 unique links

• Potential to find more links

Links in Pubby

Presenter notes: GND (Gemeinsame Normdatei, the Integrated Authority File published by the German National Library): persons, subject headings, corporate bodies

Links to DBPedia

Links to GeoNames

Links in Pubby

Presenter notes: GND (Gemeinsame Normdatei, the Integrated Authority File published by the German National Library): persons, subject headings, corporate bodies

Challenges

• In most cases, only a preferred label is available
  – Nancy, France vs. Nancy, Kentucky

• Very specific rules for different spellings/abbreviations required
  – Frankfurt am Main vs. Frankfurt a.M. vs. Frankfurt a/M

• Unstructured data is not captured
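
The spelling problem can be illustrated with a small normalisation step applied before labels are compared. This is a minimal Python sketch under stated assumptions: the rule table is hypothetical and covers only the "am Main" abbreviation from the slide, and it is not how Silk itself expresses transformations.

```python
import re

# Hypothetical, hand-written rule table; a real one would be dataset-specific.
ABBREVIATION_RULES = [
    # "a.M.", "a. M.", "a/M" -> "am Main"
    (re.compile(r"\ba\s*[./]\s*m\b\.?", re.IGNORECASE), "am Main"),
]


def normalise_place(label: str) -> str:
    """Expand known abbreviations, collapse whitespace, and lowercase."""
    for pattern, replacement in ABBREVIATION_RULES:
        label = pattern.sub(replacement, label)
    return re.sub(r"\s+", " ", label).strip().lower()


variants = ["Frankfurt am Main", "Frankfurt a.M.", "Frankfurt a/M"]
normalised = {normalise_place(v) for v in variants}
```

After normalisation the three spellings collapse to a single string, so an exact or near-exact comparison can match them. A rule like this does not help with the Nancy case, where identical labels denote different places and information beyond the label is needed.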

Unstructured Data

• Place: Wren Library, Trinity College Cambridge

• Agent: Georg Tanner, Maximilian II.

Results unstructured data

• Codices provenance

• WAB description

Presenter notes: Most information is not included in other data fields

Applicability and Reusability

• Created linkage rules can be reused, but an adaptation might be necessary

• Knowledge about the Silk framework and the similarity functions is required

• Access to the datasets is required (as dump or in a triple store)

• Quality of the links is not ensured

Future Work

• Evaluation of the detected links
  – Iterative process to improve the links

• Can we use existing information, e.g. already known connections, to strengthen/weaken links?

• Questions that can be answered based on the links?
  – Where have the resources been published?
  – MarineLives: map of the ship routes
