A Classification of Schema-based Matching Approaches

A Classification of Schema-based Matching Approaches. Pavel Shvaiko. Meaning Coordination and Negotiation Workshop, ISWC, 8th November 2004, Hiroshima, Japan. Outline: Introduction; Classification of schema-based matching approaches; Matching systems; Conclusions; Future work.

  • A Classification of Schema-based Matching Approaches
    Pavel Shvaiko
    Meaning Coordination and Negotiation Workshop, ISWC, 8th November 2004, Hiroshima, Japan

    The Italian-Israeli Forum on Computer Science, Haifa, June 17-18, 2003

    MCN workshop, ISWC, 8th November 2004, Hiroshima, Japan

    Outline
        Introduction
        Classification of schema-based matching approaches
        Matching systems
        Conclusions
        Future work

    Introduction

    Semantic Web and the Match operator

    Information sources (e.g., database schemas, taxonomies or ontologies) can be viewed as graph-like structures containing terms and their inter-relationships

    Match is one of the key operators for enabling the Semantic Web since it takes two graph-like structures and produces a mapping between the nodes of the graphs that correspond semantically to each other
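
    The signature of Match described above can be sketched as a function from two graph-like structures to a set of node pairs. This is only a minimal illustration: the `Graph` type, the `normalize` helper, the label-equality criterion and the `Electronics` node are assumptions made for the sketch, not part of the presentation.

```python
from typing import Dict, List, Set, Tuple

# Assumed representation: node label -> labels of its child nodes.
Graph = Dict[str, List[str]]


def normalize(label: str) -> str:
    """Lower-case a label and drop separators, so 'Digital_Cameras' ~ 'digital cameras'."""
    return "".join(ch for ch in label.lower() if ch.isalnum())


def match(source: Graph, target: Graph) -> Set[Tuple[str, str]]:
    """Return pairs of nodes whose normalized labels coincide.

    Label equality is a deliberately naive stand-in for "correspond
    semantically"; real matchers combine many techniques.
    """
    target_index = {normalize(node): node for node in target}
    return {
        (node, target_index[normalize(node)])
        for node in source
        if normalize(node) in target_index
    }


# Hypothetical toy inputs.
s1: Graph = {"Electronics": ["Digital_Cameras"], "Digital_Cameras": []}
s2: Graph = {"electronics": ["Photo_and_Cameras"], "Photo_and_Cameras": []}
mapping = match(s1, s2)
print(mapping)
```

    With these toy inputs only the two root nodes are paired; Digital_Cameras and Photo_and_Cameras stay unmatched, which is the kind of case that needs more than label comparison.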

    Example: Two XML schemas (figure not reproduced)

    Schema matching vs Ontology alignment

    Differences:
        Database schemas often do not provide explicit semantics for their data
        Ontologies are logical systems that themselves incorporate semantics (intuitive or formal)
            E.g., ontology definitions as a set of logical axioms
        Ontology data models are richer (the number of primitives is higher, and they are more complex) than schema data models
            E.g., OWL allows defining new classes as unions or intersections of other classes

    Commonalities:
        Ontologies can be viewed as schemas for knowledge bases
        Techniques developed for both problems are of mutual benefit

    Matching

    Classification of Schema-based Matching Approaches

    Schema matching approaches
    Combined matchers
    Taxonomy from [E. Rahm, P. Bernstein, 2001]

    Semantic view on matching

    What is missing in the taxonomy of schema matching approaches we have just seen? Two new criteria:

    Heuristic vs formal:
        Heuristic techniques try to guess relations which may hold between similar labels or graph structures
        Formal techniques have model-theoretic semantics which is used to justify their results

    Implicit vs explicit:
        Implicit techniques are syntax-driven techniques
            E.g., techniques which consider labels as strings, or analyze data types, or the soundex of schema/ontology elements
        Explicit techniques exploit the semantics of labels
            E.g., thesauruses, ontologies
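
    An implicit (syntax-driven) technique that treats labels purely as strings might look like the sketch below. It uses `difflib.SequenceMatcher` from the Python standard library as one possible similarity measure; the choice of measure and the `string_similarity` helper are assumptions for illustration, not a technique the slides attribute to any particular system.

```python
from difflib import SequenceMatcher


def string_similarity(label_a: str, label_b: str) -> float:
    """Similarity in [0, 1] computed from the characters of the labels only.

    No meaning is consulted: the labels are lower-cased and compared as
    plain strings, which is what makes the technique "implicit".
    """
    return SequenceMatcher(None, label_a.lower(), label_b.lower()).ratio()


score_related = string_similarity("FJFLM", "FujiFilm")
score_unrelated = string_similarity("FJFLM", "Nikon")
print(score_related, score_unrelated)
```

    A matcher built this way would typically accept a pair only above some threshold, which has to be chosen empirically.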

    Schema Matching Approaches

    Schema-based Matching Approaches

    Heuristic Techniques

    Element-level explicit techniques
        Precompiled dictionary (Cupid, COMA)
            E.g., syn key - "NKN:Nikon = syn"
        Lexicons (S-Match, CTXmatch)
            E.g., WordNet: Camera is a hypernym for Digital Camera, therefore Digital_Cameras is less general than Photo_and_Cameras

    Structure-level explicit techniques
        Taxonomic structure (Anchor-Prompt, NOM)
            E.g., given that Digital_Cameras is less general than Photo_and_Cameras, FJFLM and FujiFilm can be found as an appropriate match
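
    The two element-level explicit techniques can be illustrated with toy lookup tables. In this sketch the `SYNONYMS` and `HYPERNYMS` tables are hypothetical stand-ins for a precompiled dictionary and for a lexicon such as WordNet; only the NKN/Nikon pair and the camera/digital camera hypernym come from the slide, and the `equipment` entry is invented for the example.

```python
from typing import Dict, Set

# Hypothetical precompiled dictionary: abbreviation -> synonym.
SYNONYMS: Dict[str, str] = {"NKN": "Nikon"}

# Hypothetical toy lexicon: term -> its direct hypernyms (more general terms).
HYPERNYMS: Dict[str, Set[str]] = {
    "digital camera": {"camera"},
    "camera": {"equipment"},
}


def expand(label: str) -> str:
    """Replace a label by its dictionary synonym when one is registered."""
    return SYNONYMS.get(label.upper(), label)


def is_less_general(term: str, other: str) -> bool:
    """True if `other` is reachable from `term` by following hypernym links."""
    frontier = [term]
    seen: Set[str] = set()
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        for parent in HYPERNYMS.get(current, set()):
            if parent == other:
                return True
            frontier.append(parent)
    return False


print(expand("NKN"))
print(is_less_general("digital camera", "camera"))
```

    Both lookups remain heuristic in the sense of the classification: they suggest a relation between labels without justifying it against a formal semantics.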

    Formal Techniques

    Structure-level explicit techniques
        Propositional satisfiability (SAT) (S-Match, CTXmatch)
            The approach is to translate the matching problem, namely the two graphs (trees) and mapping queries, into a propositional formula and then to check it for validity
        Modal SAT (S-Match)
            The idea is to enhance propositional logic with modal logic (or ALC DL) operators. The matching problem is therefore translated into a modal logic formula, which is then checked for validity using sound and complete satisfiability search procedures.
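
    The propositional case can be sketched with a brute-force check: a mapping query is valid exactly when its negation is unsatisfiable. The code below enumerates all truth assignments instead of calling a real SAT solver, and the encoding of the node concepts (in particular reading Photo_and_Cameras as "photo or camera") is an assumption made for the example, not the actual S-Match or CTXmatch translation.

```python
from itertools import product
from typing import Callable, Dict, List

Assignment = Dict[str, bool]
Formula = Callable[[Assignment], bool]


def is_satisfiable(formula: Formula, variables: List[str]) -> bool:
    """Try every truth assignment; feasible only for a handful of variables."""
    for values in product([False, True], repeat=len(variables)):
        if formula(dict(zip(variables, values))):
            return True
    return False


def is_valid(formula: Formula, variables: List[str]) -> bool:
    """A formula is valid iff its negation is unsatisfiable."""
    return not is_satisfiable(lambda a: not formula(a), variables)


VARIABLES = ["digital_camera", "camera", "photo"]


def axioms(a: Assignment) -> bool:
    # Background knowledge (assumed): every digital camera is a camera.
    return (not a["digital_camera"]) or a["camera"]


def concept_1(a: Assignment) -> bool:
    # Assumed encoding of the node Digital_Cameras.
    return a["digital_camera"]


def concept_2(a: Assignment) -> bool:
    # Assumed encoding of the node Photo_and_Cameras.
    return a["photo"] or a["camera"]


def less_general_query(a: Assignment) -> bool:
    # (axioms and concept_1) -> concept_2
    return (not (axioms(a) and concept_1(a))) or concept_2(a)


def more_general_query(a: Assignment) -> bool:
    # (axioms and concept_2) -> concept_1
    return (not (axioms(a) and concept_2(a))) or concept_1(a)


print(is_valid(less_general_query, VARIABLES))
print(is_valid(more_general_query, VARIABLES))
```

    Under these assumptions the first query is valid and the second is not, so the check supports "Digital_Cameras is less general than Photo_and_Cameras" but not the converse.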

    Matching Systems

    Characteristics of state-of-the-art matchers

    Conclusions

    Uses of Classification

    The classification proposed provides a common conceptual basis, and hence can be used for comparing (analytically) different existing schema/ontology matching systems

    It can help in designing a new matching system, or an elementary matcher, taking advantage of state-of-the-art solutions

    Future Work

    Provide a more detailed view on the general properties of matching algorithms
    Add language-based techniques to the classification, e.g., tokenization, lemmatization, elimination
    Extend the classification by taking into account DL-based matchmaking solutions
    Extend the classification by adding newly appearing matching techniques and the systems implementing them, e.g., OLA, QOM
    Compare matching systems experimentally as well, with the help of benchmarks
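
    Two of the language-based techniques named above, tokenization and elimination, can be sketched as simple label preprocessing. The splitting rules and the `STOP_WORDS` set are assumptions chosen for the example; lemmatization is left out because it needs a morphological resource.

```python
import re
from typing import List

# Hypothetical list of tokens carrying no meaning of their own.
STOP_WORDS = {"and", "or", "of", "the", "for"}


def tokenize(label: str) -> List[str]:
    """Split a label on non-alphanumeric characters and camelCase boundaries."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    return [token.lower() for token in re.split(r"[^A-Za-z0-9]+", spaced) if token]


def eliminate(tokens: List[str]) -> List[str]:
    """Drop tokens that appear in the stop-word list."""
    return [token for token in tokens if token not in STOP_WORDS]


tokens = eliminate(tokenize("Photo_and_Cameras"))
print(tokens)
```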

    References

    Knowledge Web project: http://knowledgeweb.semanticweb.org/
    Project website at DIT - ACCORD: http://www.dit.unitn.it/~accord/
    P. Shvaiko: A classification of schema-based matching approaches. Technical Report, DIT-04-93, University of Trento, 2004.
    E. Rahm, P. Bernstein: A survey of approaches to automatic schema matching. In Very Large Databases Journal, 10(4):334-350, 2001.
    F. Giunchiglia, P. Shvaiko: Semantic matching. In The Knowledge Engineering Review Journal, 18(3):265-280, 2003.
    P. Bouquet, L. Serafini, S. Zanobini: Semantic coordination: a new approach and an application. In Proceedings of ISWC, 130-145, 2003.

    Thank you!
