Presentation at the Ontology Matching Workshop held at ISWC'2010
Mapping the Central LOD Ontologies to PROTON Upper-Level Ontology
Mariana Damova, Atanas Kiryakov, Kiril Simov, Svetoslav Petrov
ISWC’2010
Outline
• Introduction
• Problem and Conceptual Solution
• Approaches to Matching Ontologies
• Mapping Methods
• Proton Extension
• Statistics
• Experimentation
• Future Work
Linking Open Data (LOD)
FactForge (http://factforge.net)
• a reason-able view of the web of data
• the biggest and most heterogeneous body of factual knowledge on which inference is performed
• 8 datasets from the LOD cloud (DBPedia, Freebase, UMBEL, CIA World Factbook, MusicBrainz, Wordnet, Lingvoj, Geonames)
• a total of 1.4 billion loaded statements
• 2.2 billion stored statements (indexed)
• 9.8 billion distinct retrievable statements
Access to FactForge
Forest Interface
- keyword search (for molecules in the RDF graph)
- auto-suggestion of URI
- SPARQL queries using facts from different datasets
  (formulating SPARQL queries requires in-depth knowledge of the datasets and the schemata in FactForge)
- exploration
Current State
SELECT * WHERE {
  ?Person dbp-ont:birthPlace ?BirthPlace ;
          rdf:type opencyc:Entertainer .
  ?BirthPlace geo-ont:parentFeature dbpedia:Germany .
}
Target State
SELECT * WHERE {
  ?Person prot:birthPlace ?BirthPlace ;
          rdf:type prot:Entertainer .
  ?BirthPlace prot:subRegionOf dbpedia:Germany .
}
Benefit
• easier and simpler access to the wealth of data (one needs to know a single ontology instead of learning the vocabularies of multiple datasets)
• a higher degree of interoperability, because only one schema is offered
• better integration of the datasets in FactForge (the schemata of the different datasets are mapped through a single ontology)
• information from many datasets retrieved via a unified vocabulary (a constraint which uses a single predicate from the ontology can match data from different datasets)
Applications
• semantic search and annotation using the entities from FactForge
• semantic browsing and navigation
• querying FactForge in natural language
• many others?
Solution
[Diagram: PROTON connected to DBPedia, Freebase and Geonames]
• PROTON - a modular, lightweight, upper-level ontology defining about 300 classes and 100 properties
• DBPedia - the RDFized version of Wikipedia, with a data-driven ontology of 592 classes and 720 properties in total, and about 1,478,000 instances
• Freebase - a social database with no ontology; the class hierarchy is hidden in structured predicate names; 19,632 predicates
• Geonames - a geographical database that covers 6 million geographical features, with an ontology of 5 classes and 26 properties, and about 645 location denotators
Approaches to Matching Ontologies
• syntactic vs. semantic mapping
• harmonizing semantics (interlingua ontology)
• bidirectional vs. unidirectional
• automated vs. manual
• OAEI - ontology matching competitions
  – benchmark with best F-measure result of 80% (2009)
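For readers unfamiliar with the metric quoted for the OAEI benchmark, the sketch below shows how an F-measure is usually computed for an alignment: the harmonic mean of precision and recall of the found correspondences against a reference alignment. The entity names are hypothetical placeholders, not OAEI data, and the function name `f_measure` is an assumption made for illustration.

```python
def f_measure(found, reference):
    """Return (precision, recall, F1) of a found alignment against a reference one.

    Both arguments are iterables of (source_entity, target_entity) pairs.
    """
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical correspondences, for illustration only.
reference = {("src:A", "tgt:A"), ("src:B", "tgt:B"), ("src:C", "tgt:C"),
             ("src:D", "tgt:D"), ("src:E", "tgt:E")}
found = {("src:A", "tgt:A"), ("src:B", "tgt:B"), ("src:C", "tgt:C"),
         ("src:D", "tgt:D"), ("src:E", "tgt:X")}  # one wrong correspondence

precision, recall, f1 = f_measure(found, reference)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

With four of five correspondences correct on both sides, precision, recall and F-measure all come out at about 0.8.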
Ontological Heterogeneity
• Semantic matching approaches cannot cope with ontological heterogeneity
• The classes and the properties may be described in different unrelated ontologies
• The algorithms cannot discover hidden relationships that hold between unrelated entities.
FactForge presents exactly such a reality of heterogeneous facts. This makes their automated processing inconvenient.
Our Approach
Unidirectional, semantic, manual alignment between PROTON and the schema ontologies of the selected datasets of FactForge
Mapping Methods
• A series of iterations of enrichments at conceptual and at data levels
(a) Subsumption relations for classes and properties
    [Diagram: FactForge entity, subClassOf, PROTON entity]
(b) Extending PROTON with classes and properties
(c) Using OWL class and property construction capabilities to represent classes and properties from FactForge and map them to PROTON classes
    [Diagram: Class restricted from FactForge predicates, subClassOf, PROTON entity]
(d) Extending FactForge with instances to account for the conceptual representation of the matching
    [Diagram: FactForge Instance 1, FactForge Instance 2 (new), FactForge Instance 3 (new)]
Mapping rules
(a) dbp:Place
        rdfs:subClassOf proton:Location .
(b) dbp-prop:location
        rdfs:subPropertyOf proton:locatedIn .
(c) pfb:Location
        rdf:type owl:Restriction ;
        owl:onProperty <http://rdf.freebase.com/ns/type.object.type> ;
        owl:hasValue <http://rdf.freebase.com/ns/location.location> ;
        rdfs:subClassOf proton:Location .
(d) p <rdf:type> <dbp-ont:PrimeMinister>
    ----------------------------------------------------
    p <proton:hasPosition> j
    j <proton:hasTitle> <proton:PrimeMinister>
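Rule (d) differs from the other three because it creates a new instance j rather than only relating existing entities. A minimal sketch of that step, assuming triples are held as plain Python tuples, could look as follows. The function name `apply_rule_d`, the `ex:` identifiers and the sample fact are hypothetical and are not taken from FactForge.

```python
def apply_rule_d(triples, source_class="dbp-ont:PrimeMinister",
                 title="proton:PrimeMinister"):
    """Return the triples rule (d) would add for every instance of source_class."""
    new_triples = set()
    # Sort the matching subjects so the generated identifiers are deterministic.
    people = sorted(s for (s, p, o) in triples
                    if p == "rdf:type" and o == source_class)
    for i, person in enumerate(people, start=1):
        position = f"ex:position{i}"  # hypothetical identifier for the new instance j
        new_triples.add((person, "proton:hasPosition", position))
        new_triples.add((position, "proton:hasTitle", title))
    return new_triples


# Hypothetical input fact, for illustration only.
facts = {("ex:person1", "rdf:type", "dbp-ont:PrimeMinister")}
added = apply_rule_d(facts)
for triple in sorted(added):
    print(triple)
```

Each matching person yields two new statements: one linking the person to a freshly created position instance, and one giving that position its title.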
Conceptual mismatches
dbp:Architect
    rdfs:subClassOf
        [ rdf:type owl:Restriction ;
          owl:onProperty proton:hasProfession ;
          owl:hasValue proton:Architect
        ] .
Expression matching
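The restriction above can be read as an inference: every instance of dbp:Architect is entailed to have proton:hasProfession proton:Architect. The sketch below imitates that single entailment over tuples. It is an illustration under simplifying assumptions, not how an OWL reasoner such as BigOWLIM is implemented, and the names `RESTRICTIONS`, `infer_has_value` and `ex:person1` are hypothetical.

```python
# Class -> (property, value) pairs taken from hasValue restrictions
# that the class is declared a subclass of.
RESTRICTIONS = {
    "dbp:Architect": ("proton:hasProfession", "proton:Architect"),
}


def infer_has_value(triples, restrictions):
    """Return the property assertions entailed by class membership."""
    inferred = set()
    for s, p, o in triples:
        if p == "rdf:type" and o in restrictions:
            prop, value = restrictions[o]
            inferred.add((s, prop, value))
    return inferred


# Hypothetical input fact, for illustration only.
facts = {("ex:person1", "rdf:type", "dbp:Architect")}
inferred = infer_has_value(facts, RESTRICTIONS)
print(inferred)
```

This is what lets a class in one dataset be matched to a property-value expression in PROTON rather than to another class.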
Multiple Matching
Several dataset predicates map to a single PROTON predicate:
# DBPedia ontology predicate
dbp:place
    rdfs:subPropertyOf proton:locatedIn .
# DBPedia predicate
dbp-prop:location
    rdfs:subPropertyOf proton:locatedIn .
# Freebase predicate
<http://rdf.freebase.com/ns/time.event.locations>
    rdfs:subPropertyOf proton:locatedIn .
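The effect of mapping several dataset predicates to one PROTON predicate can be sketched as follows: after subPropertyOf inference, a constraint on proton:locatedIn alone matches data that was originally stated with three different predicates. The code is a simplified, single-level illustration. The `ex:` entities are hypothetical, and `fb:time.event.locations` abbreviates the full Freebase IRI shown above.

```python
SUB_PROPERTY_OF = {
    "dbp:place": "proton:locatedIn",
    "dbp-prop:location": "proton:locatedIn",
    "fb:time.event.locations": "proton:locatedIn",
}


def with_super_properties(triples, sub_property_of):
    """Add (s, super_property, o) for every triple whose predicate is mapped."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in sub_property_of:
            inferred.add((s, sub_property_of[p], o))
    return inferred


def match(triples, predicate):
    """Return sorted (subject, object) pairs stated with the given predicate."""
    return sorted((s, o) for s, p, o in triples if p == predicate)


# Hypothetical facts from three vocabularies, for illustration only.
facts = {
    ("ex:museum", "dbp:place", "ex:city1"),
    ("ex:company", "dbp-prop:location", "ex:city2"),
    ("ex:concert", "fb:time.event.locations", "ex:city3"),
}
closure = with_super_properties(facts, SUB_PROPERTY_OF)
print(match(closure, "proton:locatedIn"))
```

Without the mappings, the same question would need one triple pattern per source predicate.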
PROTON Extension
• Preserve the OntoClean approach in PROTON design
• Obtain coverage of the rich data in FactForge
• Keep an optimal degree of granularity of the concept hierarchy
PROTON was split into modules:
- 19 modules reflecting the conceptual divisions which surfaced during the analysis of the data, e.g. proton event, proton social abstraction, proton location, proton biological substance, etc.
Statistics
• PROTON Extension
– 141 new Classes
– 3 new Datatype properties
– 12 new Object properties
• Mapping PROTON to FactForge
– 536 subClassOf relations
– 36 subPropertyOf relations
Experiment
Data Loading
BigOWLIM – the most scalable OWL Engine
http://www.ontotext.com/owlim/
                              FactForge Standard    FactForge with mappings
                              (June 2010)           (June 2010)
NumberOfStatements            1,782,541,506         2,630,453,334
NumberOfExplicitStatements    1,143,317,531         1,942,349,578
NumberOfEntities              354,635,159           404,798,593
FactForge Extension Statistics (October 2010)
Configuration                                                    NumberOfStatements   NumberOfExplicitStatements   NumberOfEntities
FactForge                                                        2,237,550,385        1,357,013,227                524,120,454
FactForge with Inference Rules (IR)                              2,237,550,617        2,027,992,627                524,120,465
FactForge with IR and PROTON                                     2,237,578,643        2,027,995,363                524,121,955
FactForge with IR, PROTON and DBP mappings                       2,255,543,166        2,027,995,651                524,121,996
FactForge with IR, PROTON, DBP mappings and Freebase mappings    2,375,287,183        2,027,995,750                524,122,009
FactForge Extension Statistics (October 2010)
Difference between FactForge and each adding iteration
Compared with FactForge                                          NumberOfStatements   NumberOfExplicitStatements   NumberOfEntities
FactForge                                                        0                    0                            0
FactForge with Inference Rules (IR)                              232                  670,979,400                  11
FactForge with IR and PROTON                                     28,258               670,982,136                  1,501
FactForge with IR, PROTON and DBP mappings                       17,992,781           670,982,424                  1,542
FactForge with IR, PROTON, DBP mappings and Freebase mappings    137,736,798          670,982,523                  1,555
FactForge Extension Statistics (October 2010)
Difference between each adding iteration
Step                                                             NumberOfStatements   NumberOfExplicitStatements   NumberOfEntities
FactForge                                                        0                    0                            0
FactForge to FactForge with Inference Rules (IR)                 232                  670,979,400                  11
With IR to with IR and PROTON                                    28,026               2,736                        1,490
With IR and PROTON to with IR, PROTON and DBP mappings           17,964,523           288                          41
With DBP mappings to with DBP and Freebase mappings              119,744,017          99                           13
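The two difference tables follow directly from the October 2010 counts. As a sanity check, the arithmetic can be reproduced with the short sketch below. The counts are copied from the slides, while the names `CONFIGS`, `COUNTS`, `diff_from_baseline` and `diff_between_steps` are introduced here for illustration.

```python
CONFIGS = [
    "FactForge",
    "+ Inference Rules",
    "+ PROTON",
    "+ DBP mappings",
    "+ Freebase mappings",
]

# October 2010 counts as reported on the slides, one value per configuration.
COUNTS = {
    "NumberOfStatements":
        [2237550385, 2237550617, 2237578643, 2255543166, 2375287183],
    "NumberOfExplicitStatements":
        [1357013227, 2027992627, 2027995363, 2027995651, 2027995750],
    "NumberOfEntities":
        [524120454, 524120465, 524121955, 524121996, 524122009],
}


def diff_from_baseline(values):
    """Difference between plain FactForge and each later configuration."""
    return [v - values[0] for v in values]


def diff_between_steps(values):
    """Difference between each configuration and the one before it."""
    return [0] + [b - a for a, b in zip(values, values[1:])]


for metric, values in COUNTS.items():
    print(metric)
    for name, base, step in zip(CONFIGS, diff_from_baseline(values),
                                diff_between_steps(values)):
        print(f"  {name:22} vs FactForge: {base:>13,}  vs previous: {step:>13,}")
```

Read this way, the Freebase mappings account for by far the largest growth in the total number of statements, while the number of explicit statements changes mainly when the inference rules are added.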
Experimentation: PROTON query
US non-profit organizations founded after 1950
PREFIX p-ext: <http://proton.semanticweb.org/protonue#>
PREFIX ptop: <http://proton.semanticweb.org/protont#>
PREFIX dbpedia: <http://dbpedia.org/resource/>
SELECT distinct ?s ?date ?l WHERE {
?s a p-ext:Non-ProfitOrganisation .
?s ptop:establishmentDate ?date .
?s ptop:locatedIn ?l .
?l ptop:subRegionOf dbpedia:United_States .
FILTER (?date > "1950")
}
Query: Prime Ministers born in the United Kingdom
Using the DBPedia ontology:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbp-ont: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
SELECT ?PrimeMinister WHERE {
  ?PrimeMinister rdf:type dbp-ont:PrimeMinister .
  ?PrimeMinister dbp-ont:birthPlace dbpedia:United_Kingdom .
}
Using PROTON:
PREFIX pupp: <http://proton.semanticweb.org/protonu#>
PREFIX p-ext: <http://proton.semanticweb.org/protonue#>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX ptop: <http://proton.semanticweb.org/protont#>
SELECT ?PrimeMinister WHERE {
  ?PrimeMinister ptop:hasPosition ?pos .
  ?pos pupp:hasTitle dbpedia:British_prime_minister .
  ?PrimeMinister p-ext:birthPlace dbpedia:United_Kingdom .
}
Query: Cities around the world which have Modigliani art work (dataset vocabularies)
PREFIX fb: <http://rdf.freebase.com/ns/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbp-prop: <http://dbpedia.org/property/>
PREFIX dbp-ont: <http://dbpedia.org/ontology/>
PREFIX umbel-sc: <http://umbel.org/umbel/sc/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ot: <http://www.ontotext.com/>
SELECT DISTINCT ?painting_l ?owner_l ?city_fb_con ?city_db_loc ?city_db_cit
WHERE {
  ?p fb:visual_art.artwork.artist dbpedia:Amedeo_Modigliani ;
     fb:visual_art.artwork.owners [ fb:visual_art.artwork_owner_relationship.owner ?ow ] ;
     ot:preferredLabel ?painting_l .
  ?ow ot:preferredLabel ?owner_l .
  OPTIONAL { ?ow fb:location.location.containedby [ ot:preferredLabel ?city_fb_con ] }
  OPTIONAL { ?ow dbp-prop:location ?loc .
             ?loc rdf:type umbel-sc:City ;
                  ot:preferredLabel ?city_db_loc }
  OPTIONAL { ?ow dbp-ont:city [ ot:preferredLabel ?city_db_cit ] }
}
Query: Cities around the world which have Modigliani art work (PROTON)
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ot: <http://www.ontotext.com/>
PREFIX ptop: <http://proton.semanticweb.org/protont#>
PREFIX pupp: <http://proton.semanticweb.org/protonu#>
PREFIX p-ext: <http://proton.semanticweb.org/protonue#>
SELECT DISTINCT ?painting ?owner ?city
WHERE {
  ?p p-ext:author dbpedia:Amedeo_Modigliani ;
     p-ext:ownership [ ptop:isOwnedBy ?ow ] ;
     ot:preferredLabel ?painting .
  ?ow ot:preferredLabel ?owner .
  ?ow ptop:locatedIn [ rdf:type pupp:City ;
                       ot:preferredLabel ?city ] .
}
Future Work
• Two-level intermediary layer to access FactForge: mapping PROTON to UMBEL
[Diagram: query triples (?S P O; ?S P1 ?O1; ?O1 P2 O2) pass through a two-level intermediate layer (Upper Level Ontology 1, Upper Level Ontology 2) to Ontology 1, Ontology 2, Ontology 3 and Ontology 4, which describe Dataset1, Dataset2, Dataset3 and Dataset4]
http://www.ontotext.com/news.html#umb_25oct10
UMBEL - Upper Mapping and Binding Exchange Layer
A lightweight, subject concept reference structure for the Web
20 000 concepts mapped to OpenCyc
Strategic partnership Ontotext – Structured Dynamics
Future Work
• Official FactForge release with the presented
mapping
• Publish Proton and mapping as LOD
• Cover more datasets from the LOD cloud
• Experiment with the balance between the
datasets and the ontologies describing them
• Extend the property mapping
Service available at http://factforge.net
(a version with PROTON mapping is currently available at http://ldsr4.ontotext.com)
Thank you for your attention!
Questions?
Contact: