LOD2 Webinar . 29.11.2011 . Page 1 http://lod2.eu Creating Knowledge out of Interlinked Data

LOD2 Webinar Series: Classification and Quality Analysis with DL-Learner and ORE

DESCRIPTION

In this webinar, Lorenz Bühmann presents the ontology repair and enrichment tool ORE and DL-Learner, a machine learning tool that solves supervised learning tasks and supports knowledge engineers in constructing knowledge. These two neighboring tools in the LOD2 Stack cover classification and the subsequent quality analysis of Linked Data.

LOD2 is a large-scale integrating project co-funded by the European Commission within the FP7 Information and Communication Technologies Work Programme. This 4-year project comprises leading Linked Open Data technology researchers, companies, and service providers. Coming from across 12 countries, the partners are coordinated by the Agile Knowledge Engineering and Semantic Web Research Group at the University of Leipzig, Germany.

LOD2 will integrate and syndicate Linked Data with existing large-scale applications. The project shows the benefits in the scenarios of Media and Publishing, Corporate Data intranets and eGovernment.

Once per month, the LOD2 webinar series offers a free webinar about tools and services along the Linked Open Data life cycle. Stay with us and learn more about acquisition, editing, composing, connected applications, and finally publishing Linked Open Data.

About me: Lorenz Bühmann

• Member of the Agile Knowledge Engineering and Semantic Web research group (AKSW) since 2009
• PhD student at the University of Leipzig since 2011
• Research interests:
– Machine Learning in Semantic Web
– Ontology Debugging

Agile Knowledge Engineering and Semantic Web Research Group

• Founded in 2006
• 30+ researchers
• 3 subgroups
• Goals
– Contributing to the advancement of science in Semantic Web, Knowledge Engineering, Software Engineering
– Cost-efficient, high-impact R&D, which proves usefulness at an early stage
– Bridging the gap between research results and applications
• Committed to the Open Source, Open Access, and Open Knowledge movements

Agile Knowledge Engineering and Semantic Web Research Group

• EU funded projects
– Linked Open Data 2 (LOD2)
– GeoKnow
– BIG
– BioASQ
– LinkingLOD
– Open Data Portal (ODP)
• Past EU funded projects
– LOD Around the Clock (LATC)
– Semantic Content Management Systems for Enterprise Knowledge Management and News Mining (SCMS)
– OntoWiki - Semantic Collaboration for Knowledge Management, E-Learning and E-Tourism
– ...

Further project descriptions can be found at http://aksw.org/Projects

Project Page: http://aksw.org/Projects/ORE

Source Code: https://github.com/AKSW/ORE

Demo Page: http://ore.aksw.org/

LOD2 Webinar . 28.11.2013 . Page 8 http://lod2.eu

Agenda

Introduction and Motivation

Linked Data life cycle and role of ORE within LOD2

Knowledge Base Enrichment

Knowledge Base Repair

Demo

Future Work

Introduction

• the quantity and size of RDF knowledge bases have significantly increased
• many of those knowledge bases lack sophisticated schemata and instance data adhering to those schemata
• for content extracted from legacy sources, crowdsourced content, but also manually curated content, it is challenging to ensure a co-evolution of schemata and data, in particular for large knowledge bases
• ontology modeling can be difficult and can introduce unintended errors

Linked Data Life Cycle

[Figure: the Linked Data life cycle with the stages Extraction; Storage / Querying; Manual revision / authoring; Interlinking / Fusing; Classification / Enrichment; Quality Analysis; Evolution / Repair; Search / Browsing / Exploration]

Knowledge Base Schema Enrichment

Goal: Learn schema axioms in knowledge bases based on the instance data.

Apply methods of Inductive Logic Programming (ILP):

Positive examples + Negative examples + Background knowledge → Hypothesis

ILP in the Semantic Web: the hypothesis is an OWL axiom

Knowledge Base Schema Enrichment

“How to describe a class MusicalArtist?”

Positive Examples:
Eric Clapton a Person; instrument Guitar; birthDate 1945-03-30.
Elton John a Person; instrument Piano; birthDate 1947-03-25.
Whitney Houston a Actor; instrument Vocals; birthDate 1963-08-09.

Background Knowledge:
Actor SubClassOf Person.

Learned Axiom:
MusicalArtist EquivalentTo Person and birthDate some xsd:date and instrument some Instrument

From
Tina Turner a Person; instrument Vocals; birthDate 1939-11-26.
we can derive
Tina Turner a MusicalArtist.
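
The MusicalArtist example above can be sketched in a few lines of Python. This is only an illustration of how a candidate class expression is checked against examples, not DL-Learner's actual algorithm: the dictionary layout, the `SUBCLASS_OF` table, and the helpers `is_a` and `satisfies` are assumptions made for this sketch, and the candidate expression is hard-coded rather than searched for.

```python
# Toy data from the slide; the dict layout is an assumption of this sketch.
SUBCLASS_OF = {"Actor": "Person"}  # background knowledge: Actor SubClassOf Person

INDIVIDUALS = {
    "Eric Clapton": {"type": "Person", "instrument": "Guitar", "birthDate": "1945-03-30"},
    "Elton John": {"type": "Person", "instrument": "Piano", "birthDate": "1947-03-25"},
    "Whitney Houston": {"type": "Actor", "instrument": "Vocals", "birthDate": "1963-08-09"},
    "Tina Turner": {"type": "Person", "instrument": "Vocals", "birthDate": "1939-11-26"},
}


def is_a(cls, target):
    """Follow SubClassOf links upwards to test whether cls is (a subclass of) target."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def satisfies(facts):
    """Candidate: Person and (birthDate some xsd:date) and (instrument some Instrument).

    Simplification: only the presence of the properties is checked,
    not the datatype or the class of the values.
    """
    return (
        is_a(facts["type"], "Person")
        and "birthDate" in facts
        and "instrument" in facts
    )


positives = ["Eric Clapton", "Elton John", "Whitney Houston"]
covered = [name for name in positives if satisfies(INDIVIDUALS[name])]
coverage = len(covered) / len(positives)

# Individuals outside the positive examples that the candidate definition classifies.
derived = [n for n, f in INDIVIDUALS.items() if n not in positives and satisfies(f)]
print(coverage, derived)
```

Whitney Houston is only covered because the background knowledge links Actor to Person, which is why the sketch walks the subclass table instead of comparing type names directly.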

Knowledge Base Schema Enrichment

“How to describe a property birthDate?”

Positive Examples:
Eric Clapton birthDate 1945-03-30.
Elton John birthDate 1947-03-25.
Whitney Houston birthDate 1963-08-09.
Paracelsus birthDate 1493-11-11, 1493-12-17.

Learned Axiom: Functional(birthDate), supported by 3 out of 4 subjects (75%)
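
The 75% figure can be reproduced with a short calculation. The sketch below assumes that the score of a functionality axiom is the fraction of subjects with exactly one distinct value for the property; the function name `functional_score` and the list-of-pairs layout are illustrative choices, not part of ORE or DL-Learner.

```python
from collections import defaultdict

# (subject, birthDate value) pairs from the slide.
triples = [
    ("Eric Clapton", "1945-03-30"),
    ("Elton John", "1947-03-25"),
    ("Whitney Houston", "1963-08-09"),
    ("Paracelsus", "1493-11-11"),
    ("Paracelsus", "1493-12-17"),
]


def functional_score(pairs):
    """Count subjects with exactly one distinct value, and the total number of subjects."""
    values = defaultdict(set)
    for subject, value in pairs:
        values[subject].add(value)
    single = sum(1 for v in values.values() if len(v) == 1)
    return single, len(values)


supporting, total = functional_score(triples)
print(f"Functional(birthDate): {supporting} out of {total} = {supporting / total:.0%}")
```

Paracelsus is the only subject with two birth dates, so he is the single counterexample that keeps the score below 100%.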

Knowledge Base Schema Enrichment

Why schema enrichment?

1. The axioms serve as documentation for the purpose and correct usage of schema elements.
2. They improve the application of debugging techniques.
3. Additional implicit information can be inferred.

Knowledge Base Schema Enrichment in ORE

Knowledge Base Repair

Goal: Support the detection and repair of modelling errors and problems.

ORE in its current version supports the detection and repair of

• logical errors

• naming problems by integrating the PatOMat framework

• “missing data”

Knowledge Base Repair of Logical Errors

OWL is based on Description Logics.

Common errors:
• unsatisfiable classes → cannot have any individuals
• inconsistent ontology → everything can be derived

Both cases are quite easy for a reasoner to detect and for a tool to display when working on OWL ontologies.

“Why is the class unsatisfiable?”
“Why is the knowledge base inconsistent?”
→ Explanations
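
To make the idea of an explanation concrete, the sketch below computes one minimal set of axioms that makes a class unsatisfiable. Everything in it is an assumption for illustration: the toy ontology (Student, Robot, RoboStudent, and so on) is hypothetical, the "reasoner" only understands SubClassOf and disjointness, and the deletion-based shrinking is one common black-box technique rather than a description of how ORE computes explanations.

```python
def superclasses(cls, axioms):
    """Reflexive-transitive closure of SubClassOf for a class."""
    seen, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for kind, a, b in axioms:
            if kind == "sub" and a == current and b not in seen:
                seen.add(b)
                todo.append(b)
    return seen


def is_unsatisfiable(cls, axioms):
    """Toy check: a class is unsatisfiable if two of its superclasses are declared disjoint."""
    supers = superclasses(cls, axioms)
    return any(
        kind == "disjoint" and a in supers and b in supers
        for kind, a, b in axioms
    )


def justification(cls, axioms):
    """Drop axioms one by one, keeping only those needed for the unsatisfiability."""
    kept = list(axioms)
    for axiom in list(axioms):
        trial = [ax for ax in kept if ax != axiom]
        if is_unsatisfiable(cls, trial):
            kept = trial  # the error persists without this axiom, so it is not needed
    return kept


# Hypothetical toy ontology, not taken from the webinar.
axioms = [
    ("sub", "Student", "Person"),
    ("sub", "Robot", "Machine"),
    ("sub", "RoboStudent", "Student"),
    ("sub", "RoboStudent", "Robot"),
    ("disjoint", "Person", "Machine"),
    ("sub", "Person", "Agent"),
]

explanation = justification("RoboStudent", axioms)
for axiom in explanation:
    print(axiom)
```

Removing any single axiom from the returned set makes the class satisfiable again, which is what turns the set into a useful answer to "Why is the class unsatisfiable?" and into a list of repair candidates.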

Knowledge Base Repair of Logical Errors in ORE

Knowledge Base Repair of Naming Problems

Naming in ontologies matters:

• analysis of the naming of entities across ontological structures can reveal
– naming issues
– underlying conceptualization issues
• natural language structure influences the structure of formal knowledge bases and vice versa
• content expressed in formal representation languages, such as those of the Semantic Web, should be accessible not only to logical reasoning machines but also to humans and NLP procedures, and thus resemble natural language as much as possible

Knowledge Base Repair of Naming Problems

Before: Paper with subclasses Accepted and Rejected

After: Paper with subclasses AcceptedPaper and RejectedPaper
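
The Paper example can be mimicked with a simple naming heuristic: a subclass name should end with the head noun of its superclass. The sketch below is a hypothetical illustration of that single pattern; the helpers `tokens` and `suggest_rename` are invented for this example and are not the PatOMat API.

```python
import re


def tokens(name):
    """Split a CamelCase identifier into its words."""
    return re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", name)


def suggest_rename(subclass, superclass):
    """Suggest a new name if the subclass does not end with the superclass's head noun."""
    head = tokens(superclass)[-1]
    if tokens(subclass)[-1] == head:
        return None  # the name already follows the pattern
    return subclass + head


# Subclass -> superclass pairs; the third entry shows a name that needs no change.
hierarchy = {"Accepted": "Paper", "Rejected": "Paper", "AcceptedPaper": "Paper"}
suggestions = {sub: suggest_rename(sub, sup) for sub, sup in hierarchy.items()}
print(suggestions)
```

A heuristic like this only proposes candidates; whether Accepted really means "accepted paper" remains a decision for the knowledge engineer.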

Knowledge Base Repair of Naming Problems in ORE

Knowledge Base Repair by Validation of Constraints

Constraints = OWL Axioms

Example:
Person SubClassOf birthDate some xsd:date
“Every person has a birth date”

OWL comes with the Open World Assumption (OWA), thus missing information is not an error.

We instead assume a Closed World Assumption, so that OWL axioms can be reused as constraints.

Basic Idea: Rewrite the OWL axiom into a SPARQL query

Knowledge Base Repair by Validation of Constraints

Person SubClassOf birthDate some xsd:date

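PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
# The default prefix ":" must be bound to the namespace of the knowledge base.
SELECT ?s WHERE {
  ?s a :Person .
  FILTER NOT EXISTS {
    ?s :birthDate ?o .
    FILTER(DATATYPE(?o) = xsd:date)
  }
}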
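
The rewriting step can be sketched as a small template function. This is a minimal illustration for the single axiom shape `C SubClassOf p some D` with a datatype filler, not ORE's implementation: the function names, the `instances` layout, and the individuals `alice` and `bob` are hypothetical. The second function applies the same closed-world reading directly to toy data, so the effect of the query can be seen without a SPARQL endpoint.

```python
def some_values_from_constraint(cls, prop, datatype):
    """Build a SPARQL query listing instances of cls that lack a prop value of the given datatype."""
    return (
        f"SELECT ?s WHERE {{ ?s a {cls} . "
        f"FILTER NOT EXISTS {{ ?s {prop} ?o . FILTER(DATATYPE(?o) = {datatype}) }} }}"
    )


def violations(instances, cls, prop):
    """Closed-world check on toy data: members of cls without any value for prop."""
    return [
        name
        for name, facts in instances.items()
        if cls in facts["types"] and not facts.get(prop)
    ]


query = some_values_from_constraint(":Person", ":birthDate", "xsd:date")
print(query)

# Hypothetical instance data: bob is a Person without a recorded birth date.
instances = {
    "alice": {"types": {"Person"}, "birthDate": ["1990-01-01"]},
    "bob": {"types": {"Person"}},
}
print(violations(instances, "Person", "birthDate"))
```

Under the Open World Assumption, bob would not be an error, since his birth date might simply be unknown. Only the closed-world reading turns the missing value into a reportable violation.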

Knowledge Base Repair by Validation of Constraints in ORE

Demonstration http://ore.aksw.org

Future Work

• Enrichment
– batch mode
• Logical repair
– detection of unsatisfiable classes on SPARQL knowledge bases
• Constraint validation
– add repair option
– SPIN support
– (better) explanation of why resources violate a constraint, plus support for adding missing information

Further Information

License: Apache License Version 2.0

Project page: http://aksw.org/Projects/ORE
Source code: https://github.com/AKSW/ORE
DL-Learner: http://dl-learner.org/
PatOMat: http://patomat.vse.cz/

Contact Details

Lorenz Bühmann
Work address: Augustusplatz 10, Room P618, 04109 Leipzig
Email: [email protected]
Skype ID: lorenz.buehmann.uni

Credits

Jingle: R.E.M., Martin Kaltenböck, Florian Kondert
Coordination: Thomas Thurner, Martin Kaltenböck
Moderation: Martin Kaltenböck
Presented by: Lorenz Bühmann

We hope you enjoyed staying with us. If you need more detailed information, visit us at www.lod2.eu and let us know how we can improve to meet your expectations!

Don’t forget to register for our next webinars:
20.12.2011 - Virtuoso (OpenLink Software)
24.01.2012 - OntoWiki (University of Leipzig, Germany)

Have a great day and don’t forget ...
