12th of October, 2006 KEG seminar 1
Combining Ontology Mapping Methods Using Bayesian Networks
Ontology Alignment Evaluation Initiative 2006 - 'Conference' Track
Ondřej Šváb, Vojtěch Svátek
Overview
• Ontology Mapping
• Combining Ontology Mapping Methods
• Using Bayesian Networks
– String distance metrics
– Mapping patterns
• OAEI
– Our track (conference domain)
– Evaluation
Ontology Mapping
Ontology mapping = discovery of semantic correspondences (equivalence, subsumption)
Classification of ontology mapping techniques
Modelling of interdependencies (1)
• Using Bayesian Networks
• String distance metrics from the SecondString library (mapping methods)
• Training data: pairs of concepts from the ontologies ekaw.owl and confOf.owl from the OntoFarm collection
– 798 pairs
• Bayesian network
– Nodes: mapping justification by each mapping method
– Classification node: "align" (true, false)
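As a rough illustration of how one pair of concept names could be turned into per-method evidence for such a network, the sketch below computes two string measures in plain Python. It does not use SecondString (a Java library); the helper names (`tokens`, `levenshtein_similarity`, `token_jaccard`, `evidence`), the tokenisation rule and the 0.5 cut-off are assumptions made for this example only.

```python
import re

def tokens(name):
    """Split a concept name such as 'Program_chair' or 'reviewerOfPaper' into lowercase tokens."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)  # break camelCase boundaries
    return [t.lower() for t in re.split(r"[\s_\-]+", spaced) if t]

def levenshtein(a, b):
    """Edit distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,         # deletion
                           cur[j - 1] + 1,      # insertion
                           prev[j - 1] + cost)) # substitution or match
        prev = cur
    return prev[-1]

def levenshtein_similarity(a, b):
    """Edit distance normalised to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

def token_jaccard(a, b):
    """Jaccard overlap of the two token sets."""
    ta, tb = set(tokens(a)), set(tokens(b))
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def evidence(a, b, threshold=0.5):
    """Boolean 'justification' per method; the threshold is an assumed value."""
    return {
        "levenshtein": levenshtein_similarity(a.lower(), b.lower()) >= threshold,
        "jaccard": token_jaccard(a, b) >= threshold,
    }

print(evidence("ProgramCommittee", "Technical_commitee"))
```

Each key of the returned dictionary would correspond to one method node of the network, with "align" as the node to be predicted.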
Modelling of interdependencies (2)
• Two tested Bayesian networks (two corresponding classifiers)
– Naive Bayesian structure
  • Probability distributions learned from data
– Learned Bayesian structure
  • Both the CPTs and the structure learned from data
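The naive structure can be sketched as follows: "align" is the only parent of every method node, so each conditional probability table (CPT) reduces to counts per class. This is a minimal illustration, not the classifier used in the experiments; the class name `NaiveBayesAlign`, the Laplace smoothing parameter `alpha`, and the four training rows are all invented for the example.

```python
from collections import defaultdict

class NaiveBayesAlign:
    """Naive Bayes over boolean evidence with 'align' as the class node."""

    def fit(self, rows, labels, alpha=1.0):
        self.features = sorted(rows[0])
        self.alpha = alpha
        self.class_count = {True: 0, False: 0}
        self.true_count = {True: defaultdict(int), False: defaultdict(int)}
        for row, label in zip(rows, labels):
            self.class_count[label] += 1
            for f in self.features:
                if row[f]:
                    self.true_count[label][f] += 1
        return self

    def _p_feature(self, f, value, label):
        # Laplace-smoothed CPT entry P(f = value | align = label)
        p_true = (self.true_count[label][f] + self.alpha) / (
            self.class_count[label] + 2 * self.alpha)
        return p_true if value else 1.0 - p_true

    def posterior(self, row):
        """Return P(align = true | evidence in row)."""
        total = sum(self.class_count.values())
        score = {}
        for label in (True, False):
            p = (self.class_count[label] + self.alpha) / (total + 2 * self.alpha)
            for f in self.features:
                p *= self._p_feature(f, row[f], label)
            score[label] = p
        return score[True] / (score[True] + score[False])

# Toy training data (made up): evidence from two methods, plus the true label.
rows = [
    {"lev": True, "jac": True},
    {"lev": True, "jac": False},
    {"lev": False, "jac": False},
    {"lev": False, "jac": False},
]
labels = [True, True, False, False]

model = NaiveBayesAlign().fit(rows, labels)
p = model.posterior({"lev": True, "jac": True})
print(f"P(align) = {p:.3f}, accepted at 0.8 threshold: {p >= 0.8}")
```

A pair is accepted as a mapping when the posterior reaches a chosen threshold, which is how the 80% and 60% thresholds on the result slides can be read.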
Evaluation of models
• Leave-one-out method (798×)
• Evaluation: precision, recall
• Precision more important than recall
– 3:2 (precision weight 0.6), 4:1 (0.8)
– C = P*a + R*b, where a, b are the weights
– Higher C, better classifier
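The weighted score can be worked through on the two precision/recall pairs reported on the result slides that follow. The sketch assumes that the weights sum to one (b = 1 - a), which matches the 3:2 and 4:1 ratios above but is not stated explicitly; the function name `combined_score` and the dictionary labels are hypothetical.

```python
def combined_score(precision, recall, precision_weight):
    """C = P*a + R*b with a = precision_weight and b = 1 - a (assumed)."""
    if not 0.0 <= precision_weight <= 1.0:
        raise ValueError("precision_weight must lie in [0, 1]")
    return precision * precision_weight + recall * (1.0 - precision_weight)

# (precision, recall) pairs as quoted on the result slides.
results = {
    "80% threshold": (0.73, 0.60),
    "60% threshold": (0.84, 0.53),
}

for a in (0.6, 0.8):  # the 3:2 and 4:1 weightings
    for name, (p, r) in results.items():
        print(f"a={a:.1f}  {name}: C={combined_score(p, r, a):.3f}")
```

Under both weightings the higher-precision result obtains the higher C, which reflects the stated preference for precision over recall.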
73% precision, 60% recall, 88% accuracy at 80% threshold
84% precision, 53% recall, 89% accuracy at 60% threshold
Align is conditionally independent of CharJaccard, Monge-Elkan, Levenshtein given TFIDF, SmithWaterman, Jaccard, Jaro, SLIM
Evaluation (C = P*a + R*b)
[Chart comparing C for: Naive Bayes, Jaccard, BN 2]
Mapping patterns (1)
• Capturing structures using mapping patterns
• Mapping pattern between ontologies
Mapping patterns (2)
Mapping pattern
Part of Bayesian Network
Conclusions & Future Work
• Combination of string-based methods is not promising
• Implementation of low-level "string-based justifications" of mapping: suffix, prefix, identical names
• Capturing context: employ methods working with structures of ontologies (graph-based), mapping patterns
• Not only equivalence relations, but also discovery of subsumption relations, using linguistic sources such as WordNet
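The low-level justifications named above (suffix, prefix, identical names) could look roughly like the sketch below. It describes planned work, so nothing here reflects an actual implementation; the function name `string_justifications` and the case-insensitive comparison are assumptions.

```python
def string_justifications(a, b):
    """Return boolean justifications for a pair of concept names.

    Comparison is case-insensitive (an assumption); 'prefix' and 'suffix'
    are reported only when the names are not identical.
    """
    x, y = a.lower(), b.lower()
    identical = x == y
    return {
        "identical": identical,
        "prefix": not identical and (x.startswith(y) or y.startswith(x)),
        "suffix": not identical and (x.endswith(y) or y.endswith(x)),
    }

for pair in [("Author", "Paper_Author"), ("program", "Program_chair"), ("Title", "title")]:
    print(pair, string_justifications(*pair))
```

A suffix hit such as Author / Paper_Author may indicate subsumption rather than equivalence, which is one reason to treat these signals as separate pieces of evidence instead of a single similarity score.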
Ontology Alignment Evaluation Initiative 2006 - 'Conference' Track
OAEI 2006 at ISWC’06
• Evaluation initiative in ontology matching
• Since 2004
• In 2006: OAEI workshop at the Ontology Matching workshop, ISWC
• Four tracks (six data sets)
– Benchmark (biblio)
– Expressive ontologies: anatomy (2 ontologies, 10k classes), jobs (jobs and job seekers, real-world case)
– Directory (web sites directory): 4 thousand elementary tests
– Food data set: SKOS thesaurus about food with other food ontologies
Conference track
• Coordinated by UEP
• Free exploration by participants within 10 ontologies
• Domain: conference organisation
• No a priori reference alignment
• Participants: 6 research groups
Ontologies in track
Participants (1)
• Combination of methods: lexicographic and contextual
• ISLab
– 1:1 matching approach
– Linguistic technique: a thesaurus of terms and weighted terminological relationships is exploited
– Contextual technique: semantic relations in an ontology
• RiMOM
– Ontology alignment defined as a directional one
– Matchers: name-based (also NLP methods), instance-based, description-based, taxonomy context-based, constraints-based
• CtxMatch
– DL formulas
– Not only equivalence, but also subsumption, disjointness, intersection
Participants (2)
• COMA++
– Extension of COMA
• Automs
– Lexical matching method, LSI, structural matching algorithm
• Falcon
– Elementary matchers: string-based, graph-based
Evaluation (1)
• Personal judgement of organisers
• Interesting individual correspondences (inverse compound names, e.g. PC_Member = Member_PC), synonyms
• Mapping errors: subsumption, inversion role, siblings, lexical confusion
• Mapping between a relation and a class, e.g. has_an_email and E-mail
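Inverse compound names such as PC_Member = Member_PC can be recognised by comparing token multisets rather than raw strings. The sketch below is only an illustration of that idea; `name_tokens` and `is_inverse_compound` are hypothetical helpers, and the tokenisation (underscores, hyphens, camelCase) is an assumption.

```python
import re

def name_tokens(name):
    """Split on underscores, hyphens, whitespace and camelCase boundaries."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)
    return [t.lower() for t in re.split(r"[\s_\-]+", spaced) if t]

def is_inverse_compound(a, b):
    """True when both names use the same tokens in a different order."""
    ta, tb = name_tokens(a), name_tokens(b)
    return len(ta) > 1 and ta != tb and sorted(ta) == sorted(tb)

print(is_inverse_compound("PC_Member", "Member_PC"))
print(is_inverse_compound("Author", "Paper_Author"))
```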
Evaluation (2)
Evaluation (3)
• Subsumption error
– Author, Paper_Author
– Conference_Trip, Conference_part
• Inversion role error
– abstract_of_paper, reviewerOfPaper
• Siblings
– ProgramCommittee, Technical_commitee
• Lexical confusion error
– program, Program_chair
• Relation-class mapping
– has_enddate, Date
– hasTitle, Title; hasSurname, Surname
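The relation-class cases listed above share a visible pattern: a property whose name is "has" plus (something ending in) the class name. A simple check for that pattern might look like the sketch below; it is a heuristic invented for illustration, with hypothetical helper names, and is not how the organisers identified these cases (that was done by personal judgement).

```python
import re

ARTICLES = {"a", "an", "the"}

def split_name(name):
    """Split on underscores, whitespace and camelCase boundaries."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)
    return [t.lower() for t in re.split(r"[\s_]+", spaced) if t]

def property_head(prop):
    """Return the part after a leading 'has' (and articles), or None if there is no 'has'."""
    toks = split_name(prop)
    if not toks or toks[0] != "has":
        return None
    toks = toks[1:]
    while toks and toks[0] in ARTICLES:
        toks = toks[1:]
    return "".join(toks)

def flat(name):
    """Lowercase and drop every non-alphanumeric character."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

def looks_like_relation_class_pair(prop, cls):
    """True when the 'has...' property name ends with the flattened class name."""
    head, target = property_head(prop), flat(cls)
    return head is not None and bool(target) and head.endswith(target)

for prop, cls in [("has_an_email", "E-mail"), ("has_enddate", "Date"),
                  ("hasTitle", "Title"), ("abstract_of_paper", "Paper")]:
    print(prop, cls, looks_like_relation_class_pair(prop, cls))
```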
Evaluation (4)
Summary
• How to evaluate this track?
– Interesting mappings
• Recall?