
Berendt: Knowledge and the Web, 2014, http://www.cs.kuleuven.be/~berendt/teaching

1

Knowledge and the Web –

Schema, instance and ontology matching

Bettina Berendt

KU Leuven, Department of Computer Science

http://www.cs.kuleuven.be/~berendt/teaching/2014-15-1stsemester/kaw/

Last update: 22 October 2014






2

Until now ...

... we have looked into modelling

... we have seen how the languages RDF(S) and OWL allow us to combine different schemas and data

... we have seen how Linked Data on the Web uses HTTP as a connecting protocol/architecture

... we have assumed that such combinations can be done effortlessly (unique names etc.)

... we have looked at some interpretation problems associated with these procedures

Now we need to ask:

What are (further) challenges of such combinations?

What approaches have been proposed to solve them?

– from the databases & the Semantic Web / ontologies fields

– from architectural and logical points of view


3

Motivation 1: Price comparison engines search & combine heterogeneous travel-agency DBs, which search & combine heterogeneous airline DBs


4

Motivation 2a: Schemas coming from different languages

A river is a natural stream of water, usually freshwater, flowing toward an ocean, a lake, or another stream. In some cases a river flows into the ground or dries up completely before reaching another body of water. Usually larger streams are called rivers while smaller streams are called creeks, brooks, rivulets, rills, and many other terms, but there is no general rule that defines what can be called a river. Sometimes a river is said to be larger than a creek,[1] but this is not always the case.[2]

Une rivière est un cours d'eau qui s'écoule sous l'effet de la gravité et qui se jette dans une autre rivière ou dans un fleuve, contrairement au fleuve qui se jette, lui, dans la mer ou dans l'océan.

Een rivier is een min of meer natuurlijke waterstroom. We onderscheiden oceanische rivieren (in België ook wel stroom genoemd) die in een zee of oceaan uitmonden, en continentale rivieren die in een meer, een moeras of woestijn uitmonden. Een beek is de aanduiding voor een kleine rivier. Tussen beek en rivier ligt meestal een bijrivier.

(Excerpts from the English, French and Dutch Wikipedia; the English text is from http://en.wikipedia.org/wiki/River)


5

Motivation 2b: Information about “the same” thing from different sources


6

Motivation 3a: Are these the same entity?


7

Motivation 3b: „Who is that?“ – Merging identities

Mickey Mouse


8

Motivation 3c: „Who was that?“ – Re-identification


9

High-level overview: Goals and approaches in data integration

Basic goal: Combine data/knowledge from different sources

Goal / emphasis can lie on finding correspondences between

the models: schema matching, ontology matching

the instances: record linkage

Techniques can leverage similarities between

schema/ontology-level information

instance information

most of today

An established problem in DB; a focus & challenge for LOD (“owl:sameAs”)


10

Agenda

The match problem & what info to use for matching

(Semi-)automated matching: Example CUPID

(Semi-)automated matching: Example iMAP

(Automated) matching of LOD and LOD ontologies

Evaluating matching

Involving the user: Explanations; mass collaboration

If time permits, these 2 topics too (briefly)


11

The match problem (Running example 1)

Given two schemas S1 and S2, find a mapping between elements of S1 and S2 that correspond semantically to each other


12

Running example 2


13

Based on what information can the matchings/mappings be found?

(work on the two running examples)


14

The match operator

Match operator: f(S1,S2) = mapping between S1 and S2, for schemas S1, S2

Mapping: a set of mapping elements

Mapping element: elements of S1, elements of S2, mapping expression

Mapping expression: different functions and relationships
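The definitions above can be sketched as a small data structure. This is only an illustration: the class name MappingElement, the toy name-equality matcher and the element names (in particular T.LISTINGS.name) are assumptions made for the example, not part of any matching system discussed here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class MappingElement:
    """One mapping element: elements of S1, elements of S2, and a mapping expression."""
    s1_elements: FrozenSet[str]
    s2_elements: FrozenSet[str]
    expression: str


def match(s1: Set[str], s2: Set[str]) -> Set[MappingElement]:
    """Toy match operator f(S1, S2): relate elements whose local names are equal (ignoring case)."""
    mapping = set()
    for a in s1:
        for b in s2:
            if a.split(".")[-1].lower() == b.split(".")[-1].lower():
                mapping.add(MappingElement(frozenset({a}), frozenset({b}), f"{b} = {a}"))
    return mapping


# Hypothetical schema elements, written as dotted paths.
S1 = {"S.AGENTS.name", "S.HOUSES.price"}
S2 = {"T.LISTINGS.name", "T.LISTINGS.area"}
mapping = match(S1, S2)
```

A real matcher would replace the name-equality test by the linguistic, structural and instance-based evidence discussed on the following slides, and would also produce non-trivial mapping expressions.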


15

Matching expressions: examples

Scalar relations (=, ≥, ...)

S.HOUSES.location = T.LISTINGS.area

Functions

T.LISTINGS.list-price = S.HOUSES.price * (1+S.AGENTS.fee-rate)

T.LISTINGS.agent-address = concat(S.AGENTS.city,S.AGENTS.state)

ER-style relationships (is-a, part-of, ...)

Set-oriented relationships (overlaps, contains, ...)

Any other terms that are defined in the expression language used


16

Matching and mapping

1. Find the schema match („declarative“)

2. Create a procedure (e.g., a query expression) to enable automated data translation or exchange (mapping, „procedural“)

Example of result of step 2: To create T.LISTINGS from S (simplified notation):

area = SELECT location FROM HOUSES

agent-name = SELECT name FROM AGENTS

agent-address = SELECT concat(city,state) FROM AGENTS

list-price = SELECT price * (1+fee-rate)

FROM HOUSES, AGENTS

WHERE agent-id = id
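The simplified notation above could be turned into one executable query, for instance as follows. This is a sketch under assumptions: the table contents are invented, hyphens in attribute names are replaced by underscores (agent_id, fee_rate), and concat is written with SQLite's || operator and a ", " separator that the slide does not specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HOUSES (location TEXT, price REAL, agent_id INTEGER);
    CREATE TABLE AGENTS (id INTEGER, name TEXT, city TEXT, state TEXT, fee_rate REAL);
    INSERT INTO AGENTS VALUES (1, 'Ann', 'Leuven', 'VB', 0.03);
    INSERT INTO HOUSES VALUES ('Heverlee', 200000, 1);
""")

# One query that derives all four target attributes of T.LISTINGS from S.
rows = conn.execute("""
    SELECT h.location                 AS area,
           a.name                     AS agent_name,
           a.city || ', ' || a.state  AS agent_address,
           h.price * (1 + a.fee_rate) AS list_price
    FROM HOUSES AS h
    JOIN AGENTS AS a ON h.agent_id = a.id
""").fetchall()

for row in rows:
    print(row)
```

The join condition corresponds to "WHERE agent-id = id" in the simplified notation; writing the four expressions in a single SELECT keeps the values of one listing together in one row.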


17

Based on what information can the matchings/mappings be found?

Rahm & Bernstein‘s classification of schema matching approaches


18

Challenges

Semantics of the involved elements often need to be inferred

Often need to base (heuristic) solutions on cues in schema and data, which are unreliable

e.g., homonyms (area), synonyms (area, location)

Schema and data clues are often incomplete

e.g., date: date of what?

Global nature of matching: to choose one matching possibility, must typically exclude all others as worse

Matching is often subjective and/or context-dependent

e.g., does house-style match house-description or not?

Extremely laborious and error-prone process

e.g., Li & Clifton 2000: project at GTE telecommunications: 40 databases, 27K elements, no access to the original developers of the DB; estimated time for just finding and documenting the matches: 12 person years

Ontologies are often even bigger

For example Cyc: now (as of 2012) has > 500,000 concepts, ~ 5,000,000 assertions, > 26,000 relations


19

Semi-automated schema matching (1)

Rule-based solutions

Hand-crafted rules

Exploit schema information

+ relatively inexpensive

+ do not require training

+ fast (operate only on schema, not data)

+ can work very well in certain types of applications & domains

+ rules can provide a quick & concise method of capturing user knowledge about the domain

– cannot exploit data instances effectively

– cannot exploit previous matching efforts (other than by re-use)


20

Semi-automated schema matching (2)

Learning-based solutions

Rules/mappings learned from attribute specifications and statistics of data content (Rahm & Bernstein: „instance-level matching“)

Exploit schema information and data

Some approaches: external evidence

Past matches

Corpus of schemas and matches („matchings in real-estate applications will tend to be alike“)

Corpus of users (more details later in this slide set)

+ can exploit data instances effectively

+ can exploit previous matching efforts

– relatively expensive

– require training

– slower (operate on data)

– results may be opaque (e.g., neural network output): need explanation components! (more details later)


21

Agenda





Evaluating matching



22

Overview (1)

Rule-based approach

Schema types: Relational, XML

Metadata representation: Extended ER

Match granularity: Element, structure

Match cardinality: 1:1, n:1


23

Overview (2)

Schema-level match:

Name-based: name equality, synonyms, hypernyms, homonyms, abbreviations

Constraint-based: data type and domain compatibility, referential constraints

Structure matching: matching subtrees, weighted by leaves

Re-use, auxiliary information used: Thesauri, glossaries

Combination of matchers: Hybrid

Manual work / user input: User can adjust threshold weights


24

Basic representation: Schema trees

Computation overview:

1. Compute similarity coefficients between elements of these graphs

2. Deduce a mapping from these coefficients


25

Computing similarity coefficients (1): Linguistic matching

Operates on schema element names (= nodes in schema tree)

1. Normalization

Tokenization (parse names into tokens based on punctuation, case, etc.), e.g., Product_ID → {Product, ID}

Expansion (of abbreviations and acronyms)

Elimination (of prepositions, articles, etc.)

2. Categorization / clustering

Based on data types, schema hierarchy, linguistic content of names, e.g., „real-valued elements“, „money-related elements“

3. Comparison (within the categories)

Compute linguistic similarity coefficients (lsim) based on thesaurus (synonymy, hypernymy)

Output: Table of lsim coefficients (in [0,1]) between schema elements
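A minimal sketch of steps 1 and 3 (normalization and comparison) might look as follows. The abbreviation table, stop-word list, toy thesaurus, the synonym score of 0.8 and the averaging formula are all assumptions chosen for illustration; they are not the actual resources or coefficients of the system described on these slides, and the categorization step is left out.

```python
import re

# Hypothetical auxiliary resources.
ABBREVIATIONS = {"id": "identifier", "qty": "quantity", "no": "number"}
STOP_WORDS = {"of", "the", "a", "an", "in", "for", "to"}
SYNONYMS = {frozenset({"bill", "invoice"}), frozenset({"city", "town"})}


def normalize(name):
    """Tokenize on punctuation and case changes, expand abbreviations, drop stop words."""
    tokens = []
    for part in re.split(r"[^A-Za-z0-9]+", name):
        tokens += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", part)
    tokens = [ABBREVIATIONS.get(t.lower(), t.lower()) for t in tokens]
    return [t for t in tokens if t not in STOP_WORDS]


def token_sim(a, b):
    """Similarity of two tokens: 1 for equality, an assumed 0.8 for thesaurus synonyms."""
    if a == b:
        return 1.0
    if frozenset({a, b}) in SYNONYMS:
        return 0.8
    return 0.0


def lsim(name1, name2):
    """Average, over the tokens of both names, of each token's best match in the other name."""
    t1, t2 = normalize(name1), normalize(name2)
    if not t1 or not t2:
        return 0.0
    best1 = sum(max(token_sim(a, b) for b in t2) for a in t1)
    best2 = sum(max(token_sim(a, b) for a in t1) for b in t2)
    return (best1 + best2) / (len(t1) + len(t2))


print(normalize("Product_ID"))
print(lsim("BillTo_City", "InvoiceTown"))
```

Hypernymy could be handled in the same way as synonymy, with a lower score for a hypernym pair than for a synonym pair.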


26

How to identify synonyms and homonyms: Example WordNet


27

How to identify hypernyms: Example WordNet

What if you had to match “statement” and “bill”?


28

(Lately also done with Wikipedia rather than with WordNet: e.g. WikiMatch)


29

Computing similarity coefficients (2): Structure matching

Intuitions:

Leaves are similar if they are linguistically & data-type similar, and if they have similar neighbourhoods

Non-leaf elements are similar if they are linguistically similar & have similar subtrees (where leaf sets are most important)

Procedure:

1. Initialize structural similarity of leaves based on data types

Identical data types: compat. = 0.5; otherwise in [0,0.5]

2. Process the tree in post-order

3. Stronglink(leaf1, leaf2) iff their weighted sim. ≥ threshold

4. .


30

The structure matching algorithm

Output: a 1:n mapping for leaves

To generate non-leaf mappings: 2nd post-order traversal
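The procedure on the previous two slides can be illustrated by a strongly simplified sketch. Everything numeric here is an assumption (the weight W_STRUCT, the threshold TH_ACCEPT, the value 0.2 for differing data types), as is the rule used for non-leaf pairs, namely the fraction of leaves in the two subtrees that have a strong link into the other subtree. The lsim table is taken as given, and the later adjustment of leaf similarities is omitted.

```python
from dataclasses import dataclass, field
from itertools import product

W_STRUCT = 0.5    # assumed weight of structural vs. linguistic similarity
TH_ACCEPT = 0.5   # assumed threshold for a strong link


@dataclass
class Node:
    name: str                      # assumed unique within its tree
    dtype: str = ""                # data type, only used for leaves
    children: list = field(default_factory=list)


def leaves(node):
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def post_order(node):
    for child in node.children:
        yield from post_order(child)
    yield node


def tree_match(s_root, t_root, lsim):
    # Step 1: initialise structural similarity of leaf pairs from their data types.
    ssim = {(s.name, t.name): 0.5 if s.dtype == t.dtype else 0.2
            for s, t in product(leaves(s_root), leaves(t_root))}

    def wsim(s, t):
        # Weighted similarity of a leaf pair.
        return (W_STRUCT * ssim[(s.name, t.name)]
                + (1 - W_STRUCT) * lsim.get((s.name, t.name), 0.0))

    def strong_link(a, b):
        return wsim(a, b) >= TH_ACCEPT

    # Step 2: post-order traversal; pairs involving a non-leaf get the fraction
    # of leaves that are strongly linked to a leaf in the other subtree.
    for s in post_order(s_root):
        for t in post_order(t_root):
            if not s.children and not t.children:
                continue  # leaf pairs keep their data-type based value
            ls, lt = leaves(s), leaves(t)
            linked = (sum(any(strong_link(a, b) for b in lt) for a in ls)
                      + sum(any(strong_link(a, b) for a in ls) for b in lt))
            ssim[(s.name, t.name)] = linked / (len(ls) + len(lt))

    leaf_mapping = {(s.name, t.name)
                    for s, t in product(leaves(s_root), leaves(t_root))
                    if strong_link(s, t)}
    return leaf_mapping, ssim


# Hypothetical schema trees and linguistic similarities.
po = Node("PO", children=[Node("City", "string"), Node("Qty", "int")])
order = Node("Order", children=[Node("Town", "string"), Node("Quantity", "int")])
lsim = {("City", "Town"): 0.8, ("Qty", "Quantity"): 1.0, ("PO", "Order"): 0.8}
leaf_mapping, ssim = tree_match(po, order, lsim)
print(sorted(leaf_mapping), ssim[("PO", "Order")])
```

Non-leaf mappings would then be derived in a second traversal from ssim and lsim of the non-leaf pairs, as stated on the slide.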


31

Matching shared types

Solution: expand the schema into a schema tree, then proceed as before

Can help to generate context-dependent mappings

Fails if a cycle of containment and IsDerivedFrom relationships is present (e.g., recursive type definitions)


32

Agenda





Evaluating matching



33

Main ideas

A learning-based approach

Main goal: discover complex matches

In particular: functions such as

T.LISTINGS.list-price = S.HOUSES.price * (1+S.AGENTS.fee-rate)

T.LISTINGS.agent-address = concat(S.AGENTS.city,S.AGENTS.state)

Works on relational schemas

Basic idea: reformulate schema matching as search


34

Architecture

Specialized searchers each focus on discovering certain types of complex matches; this makes the search more efficient


35

Overview of implemented searchers


36

Example: The textual searcher

For target attribute T.LISTINGS.agent-address:

Examine attributes and concatenations of attributes from S

Restrict examined set by analyzing textual properties: data type information in schema, heuristics (proportion of non-numeric characters etc.)

Evaluate match candidates based on data correspondences, prune inferior candidates


37

Example: The numerical searcher

For target attribute T.LISTINGS.list-price:

Examine attributes and arithmetic expressions over them from S

Restrict examined set by analyzing numeric properties: data type information in schema, heuristics

Evaluate match candidates based on data correspondences, prune inferior candidates


38

Search strategy (1): Example textual searcher

1. Learn a (Naive Bayes) classifier text → class („agent-address“ or „other“) from the data instances in T.LISTINGS.agent-address

2. Apply this classifier to each match candidate (e.g., location, concat(city,state))

3. Score of the candidate = average over instance probabilities

4. For expansion: beam search – only the k top-scoring candidates
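The four steps above can be sketched with a tiny hand-written Naive Bayes classifier. The sketch assumes word tokens as features, add-one smoothing, and that the „other“ class is trained on instances of other target attributes; the data values are invented, and none of this is claimed to be how the actual system is implemented.

```python
import math
import re
from collections import Counter


def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayes:
    """Two-class multinomial Naive Bayes: target attribute (True) vs. other (False)."""

    def __init__(self, positives, negatives):
        self.counts = {True: Counter(), False: Counter()}
        for label, docs in ((True, positives), (False, negatives)):
            for doc in docs:
                self.counts[label].update(tokens(doc))
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        n = len(positives) + len(negatives)
        self.prior = {True: len(positives) / n, False: len(negatives) / n}

    def prob_positive(self, text):
        log_p = {}
        for label in (True, False):
            total = sum(self.counts[label].values())
            lp = math.log(self.prior[label])
            for tok in tokens(text):
                # Add-one smoothing, with one extra slot for unseen tokens.
                lp += math.log((self.counts[label][tok] + 1) / (total + len(self.vocab) + 1))
            log_p[label] = lp
        m = max(log_p.values())
        z = sum(math.exp(v - m) for v in log_p.values())
        return math.exp(log_p[True] - m) / z


def score(clf, instances):
    """Step 3: average probability of the target class over a candidate's instances."""
    return sum(clf.prob_positive(x) for x in instances) / len(instances)


def beam(candidates, clf, k):
    """Step 4: keep only the k top-scoring candidates for further expansion."""
    ranked = sorted(candidates, key=lambda name: score(clf, candidates[name]), reverse=True)
    return ranked[:k]


# Step 1: learn from instances of T.LISTINGS.agent-address vs. other target attributes.
clf = NaiveBayes(positives=["Leuven VB", "Gent OV", "Brugge WV"],
                 negatives=["Heverlee", "Kessel-Lo", "Ann", "Bob"])

# Step 2: candidates from S, each represented by its data instances.
candidates = {
    "location": ["Heverlee", "Kessel-Lo"],
    "concat(city,state)": ["Leuven VB", "Gent OV"],
    "name": ["Ann", "Bob"],
}
print(beam(candidates, clf, 1))
```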


39

Search strategy (2): Example numeric searcher

1. Get value distributions of target attribute and each candidate

2. Compare the value distributions (Kullback-Leibler divergence measure)

3. Score of the candidate = Kullback-Leibler measure
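A possible sketch of these three steps: value distributions are approximated by smoothed histograms over shared bins and compared with the Kullback-Leibler divergence. The bin edges, the add-one smoothing, the constant fee rate of 0.06 and all data values are assumptions made for the example. Note that a lower divergence means more similar distributions, so in this sketch the best candidate is the one with the smallest value.

```python
import bisect
import math


def histogram(values, edges):
    """Relative frequencies per bin, with add-one smoothing so that no bin is empty."""
    counts = [1] * (len(edges) - 1)
    for v in values:
        i = bisect.bisect_right(edges, v) - 1
        i = min(max(i, 0), len(counts) - 1)   # clamp out-of-range values to the end bins
        counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """D(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


edges = [0, 100_000, 200_000, 300_000, 400_000, 500_000]

# Step 1: value distributions of the target attribute and of each candidate.
target = [201_400, 307_400, 148_400, 413_400]          # T.LISTINGS.list-price
price = [190_000, 290_000, 140_000, 390_000]           # S.HOUSES.price
candidates = {
    "price": price,
    "price * (1 + fee-rate)": [p * 1.06 for p in price],
    "beds": [2, 3, 1, 4],
}

# Steps 2 and 3: compare distributions and score each candidate.
p_target = histogram(target, edges)
scores = {name: kl_divergence(p_target, histogram(values, edges))
          for name, values in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores)
```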


40

Evaluation strategies of implemented searchers


41

Pruning by domain constraints

Multiple attributes of S: „attributes name and beds are unrelated“ → do not generate match candidates with these 2 attributes

Properties of a single attribute of T: „the average value of num-rooms does not exceed 10“ → use in evaluation of candidates

Properties of multiple attributes of T: „lot-area and num-baths are unrelated“ → at match selector level, „clean up“

Example

– T.num_baths = S.baths

– ? T.lot-area = (S.lot-sq-feet/43560) + 1.3e-15 * S.baths

Based on the domain constraint, drop the term involving S.baths


42

Pruning by using knowledge from overlap data

When S and T share the same data

Consider fraction of data for which mapping is correct

e.g., house locations: S.HOUSES.location overlaps more with T.LISTINGS.area than with T.LISTINGS.agent-address

Discard the candidate T.LISTINGS.agent-address = S.HOUSES.location,

keep only T.LISTINGS.agent-address = concat(S.AGENTS.city,S.AGENTS.state)
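The overlap idea can be sketched as follows, assuming that tuples in S and T can be identified by a shared key (here a hypothetical house id) and that a mapping counts as correct for a tuple when it reproduces the target value exactly. The data values are invented.

```python
def overlap_score(candidate, target):
    """Fraction of shared tuples for which the candidate yields the target value."""
    shared = candidate.keys() & target.keys()
    if not shared:
        return 0.0
    return sum(candidate[k] == target[k] for k in shared) / len(shared)


# Values of T.LISTINGS.agent-address, keyed by a hypothetical shared house id.
target_agent_address = {1: "Leuven VB", 2: "Gent OV", 3: "Brugge WV"}

# Values produced by each match candidate for the same houses.
candidates = {
    "S.HOUSES.location": {1: "Heverlee", 2: "Gent OV", 3: "Assebroek"},
    "concat(S.AGENTS.city,S.AGENTS.state)": {1: "Leuven VB", 2: "Gent OV", 3: "Brugge WV"},
}

scores = {name: overlap_score(values, target_agent_address)
          for name, values in candidates.items()}
kept = max(scores, key=scores.get)
print(kept, scores)
```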


43

Agenda





Evaluating matching



44

What is ontology matching (relative to schema matching)?

same basic idea

but works on ontologies that are conceptual models (not on logical schemas such as relational tables or XML trees)

emphasizes that concepts and relations need to be matched and mapped, and may treat these differently

(Note: in the schema matching literature, it is not always clearly laid out whether the matched items come from a conceptual or a logical model; the toy examples above in particular are also conceptual)

In practice, some ontology matching tasks in fact work on such simple models (or simple subparts of models) that they do not differ at all from what we have seen so far

example: Anatomy task, see below in evaluation

Terminology: Also known as ontology alignment

See (Shvaiko & Euzenat, 2005) for more details


45

Recap: Rahm & Bernstein‘s classification of schema matching approaches


46

The methods that are important when the schema is in the foreground (which it is in ontologies!)


47

The extension by Shvaiko & Euzenat (2005) [Partial view]


48

(slide from last week)

Special challenges on LOD ?!


49

Using the example of Geonames and DBPedia:

1. Matching instances to generate owl:sameAs links

2. Discovering concepts that cover these instances to map between ontologies

What about matching/mapping instances and classes?


50

What can we infer from this ? (1)

<owl:Class rdf:ID="Boek"/>
<owl:Class rdf:ID="Book"/>
<owl:DatatypeProperty rdf:ID="ISBN">
  <rdf:type rdf:resource="&owl;FunctionalProperty"/>
  <rdfs:domain rdf:resource="#Book"/>
  <rdfs:range rdf:resource="&xsd;string"/>
</owl:DatatypeProperty>
<owl:DatatypeProperty rdf:ID="isbn">
  <rdf:type rdf:resource="&owl;FunctionalProperty"/>
  <rdfs:domain rdf:resource="#Boek"/>
  <rdfs:range rdf:resource="&xsd;string"/>
</owl:DatatypeProperty>


51

What can we infer from this ? (2)

<Book rdf:ID="mybook1">
  <ISBN rdf:datatype="&xsd;string">12345</ISBN>
</Book>
<Book rdf:ID="mybook2">
  <ISBN rdf:datatype="&xsd;string">12345</ISBN>
</Book>
<Book rdf:ID="mybook3">
  <ISBN rdf:datatype="&xsd;string">6789</ISBN>
</Book>
<Boek rdf:ID="mijnboek_3">
  <isbn rdf:datatype="&xsd;string">6789</isbn>
</Boek>


52

What about this? (DBpedia: 526K geog. places/features, GeoNames: 7.8 million geog. features)


53

How this matching was done (http://lists.w3.org/Archives/Public/semantic-web/2006Dec/0027.html)

>> Around 100,000 geonames place names now have wikipedia links.

> Very cool. I wonder how you link the articles? Can't be simple word matching, no?

Simple word matching would lead to an incredible mess. There are for example 53 places with the name London and 58 places with the name Paris in the geonames database. Place name disambiguation is a rather hard problem and for matching geonames places with wikipedia articles we use semantic information in the wikipedia dump together with the article title. The semantic information primarily is latitude and longitude, but also country, administrative division, feature type, population and categories …. We only consider articles where we are able to parse semantic information .... Unfortunately there is a proliferation of templates and a lot of wikipedia users have fun inventing new ones instead of reusing existing ones.



54

But what about the classes?


55

Concept covering: Motivation (Parundekar et al., 2012)

“The Web of Linked Data has grown significantly in the past few years – 31.6 billion triples as of September 2011. This includes a wide range of data sources from the government (42%), geographic (19.4%), life sciences (9.6%) and other domains.

A common way that the instances in these sources are linked to others is through the owl:sameAs property.

Though the size of Linked Data Cloud is increasing steadily (10% over the 28.5 billion triples in 2010), inspection of the sources at the ontology level reveals that only a few of them (15 out of the 190 sources) include mappings between their ontologies.

Since interoperability is crucial to the success of the Semantic Web, it is essential that these heterogeneous schemas, the result of a de-centralized approach to the generation of data and ontologies, also be linked.”


56

Challenges

The problem of finding alignments in ontologies of Linked Data sources is non-trivial, since there might not be one-to-one concept equivalences.

In some sources the ontology is extremely rudimentary; for example, GeoNames has only one class: geonames:Feature

→ alignment with a well-defined ontology such as DBpedia is not particularly useful

→ need to generate more expressive concepts. The necessary information to do this is often present in the properties and values of the instances in the sources.

For example, in GeoNames the values of the featureCode and featureClass properties provide useful concept constructors, which can be aligned with existing concepts in DBpedia

the concept geonames:featureCode=P.PPL (populated place) aligns to dbpedia:City

Approach: explore the space of concepts defined by value restrictions (“restriction classes”)


57

Restriction classes

Basic expression to define a restriction class:

p = v

• either p is an object property and v is a resource

• Ex.: rdf:type=City

• or p is a data property and v is a literal.

• Ex.: featureCode=P.PPL

• two restriction classes are equal if their respective instance sets can be identified as equal after following the owl:sameAs links

Conjunctive and disjunctive restriction classes

Alignment algorithm for disjunctive restriction classes:

1. Find initial equivalence and subset relations

2. Discover concept coverings using disjunctions of restriction classes
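The two steps can be illustrated with a small set-based sketch. It is a simplification under assumptions and not the algorithm of the paper: instance sets are compared after mapping them through owl:sameAs links (the image Img(r1)), relations are decided with an assumed tolerance of 0.9, and the instance identifiers, the class EducationalInstitution and the triples are invented.

```python
def instances(triples, prop, value):
    """Instance set of the restriction class prop = value."""
    return {s for s, p, o in triples if p == prop and o == value}


def image(instance_set, same_as):
    """Img(r): the instances reached from r by following owl:sameAs links."""
    return {same_as[i] for i in instance_set if i in same_as}


def relation(r1, r2, same_as, tolerance=0.9):
    """Step 1: decide whether r1 is equivalent to, a subset of, or a superset of r2."""
    img = image(r1, same_as)
    if not img or not r2:
        return "none"
    common = img & r2
    covered_r2 = len(common) / len(r2)     # how much of r2 is reached from r1
    inside_r2 = len(common) / len(img)     # how much of Img(r1) lies in r2
    if covered_r2 >= tolerance and inside_r2 >= tolerance:
        return "equivalent"
    if inside_r2 >= tolerance:
        return "subset"
    if covered_r2 >= tolerance:
        return "superset"
    return "none"


def covering(larger, smaller_classes, same_as, tolerance=0.9):
    """Step 2: the disjunction of smaller classes contained in `larger`, and its coverage."""
    chosen = [name for name, inst in smaller_classes.items()
              if relation(inst, larger, same_as, tolerance) in ("subset", "equivalent")]
    union = set().union(*(image(smaller_classes[name], same_as) for name in chosen))
    return chosen, len(union & larger) / len(larger)


# Invented toy data in the spirit of the GeoNames / DBpedia example.
geonames = [("g1", "featureCode", "S.SCH"), ("g2", "featureCode", "S.SCH"),
            ("g3", "featureCode", "S.UNIV"), ("g4", "featureCode", "P.PPL")]
dbpedia = [("d1", "rdf:type", "EducationalInstitution"), ("d2", "rdf:type", "EducationalInstitution"),
           ("d3", "rdf:type", "EducationalInstitution"), ("d4", "rdf:type", "City")]
same_as = {"g1": "d1", "g2": "d2", "g3": "d3", "g4": "d4"}

ppl = instances(geonames, "featureCode", "P.PPL")
city = instances(dbpedia, "rdf:type", "City")
edu = instances(dbpedia, "rdf:type", "EducationalInstitution")
by_code = {code: instances(geonames, "featureCode", code) for code in ("S.SCH", "S.UNIV", "P.PPL")}

print(relation(ppl, city, same_as))
print(covering(edu, by_code, same_as))
```

In this toy data no single featureCode value is equivalent to EducationalInstitution, but the disjunction of two values covers it completely, which is the kind of result a concept covering is meant to capture.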


58

Aligning atomic restriction classes (examples on the board)

Note: There are some typos in the paper. I switched the conclusions of the first 2 if-branches. Also, the cardinality of Img(r1) in the example on p.4 should be 3918


59

Identifying concept coverings


60

Results


61

Claim – can you comment?

„An interesting outcome of our algorithm is that it identifies inconsistencies and possible errors in the linked data, and provides a method for automatically curating the Linked Data Cloud.”


62

Part of the evaluation


71

Q: “Is this a publicly available tool?“

Not all schema/ontology matchers are available, for many reasons (proprietary, collaboration with a company, own start-up, ..., the PhD student left the institute and nobody understands the code ...)

Increasingly, though, it is seen as good practice by researchers to make their tools available. You can see how (some of) these tools perform by checking the Ontology Alignment Evaluation Initiative pages (see part “Evaluating matching”)

Examples:

COMA (database schemas and ontologies) http://dbs.uni-leipzig.de/Research/coma.html

Falcon-AO (RDF(S) and OWL) http://ws.nju.edu.cn/falcon-ao/

LogMap (reasoning-based) http://www.cs.ox.ac.uk/isg/tools/LogMap/

“50 Ontology Mapping and Alignment Tools - More Than 20 Are Currently Active and Often in Open Source”: overview at http://www.mkbergman.com/1769/50-ontology-mapping-and-alignment-tools/





72

Outlook





Evaluating matching



73

How to compare?

Input: What kind of input data? (What languages? Only toy examples? What external information?)

Output: mapping between attributes or tables, nodes or paths? How much information does the system report?

Quality measures: metrics for accuracy and completeness?

Effort: how much savings of manual effort, how quantified?

Pre-match effort (training of learners, dictionary preparation, ...)

Post-match effort (correction and improvement of the match output)

How are these measured?


74

Match quality measures

Need a „gold standard“ (the „true“ match)

Measures from information retrieval:

(standard choice: F1, a = 0.5)

Quantifies post-match effort
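The formulas on this slide were not preserved in the text version. The following sketch assumes the usual definitions: precision and recall relative to the gold standard, a weighted F-measure that reduces to F1 for a = 0.5, and, for the last line of the slide, the „Overall“ measure from the schema matching evaluation literature, which counts correct matches minus wrong ones relative to the size of the gold standard. The example correspondences are invented.

```python
def match_quality(found, gold, a=0.5):
    """Return (precision, recall, F-measure, overall) of a found mapping against a gold standard."""
    true_pos = len(found & gold)
    false_pos = len(found - gold)
    precision = true_pos / len(found) if found else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = (1 - a) * precision + a * recall
    f_measure = precision * recall / denom if denom else 0.0
    # Overall = recall * (2 - 1/precision), rewritten so that it is defined for precision = 0.
    overall = (true_pos - false_pos) / len(gold) if gold else 0.0
    return precision, recall, f_measure, overall


gold = {("location", "area"), ("name", "agent-name"),
        ("concat(city,state)", "agent-address"), ("price*(1+fee-rate)", "list-price")}
found = {("location", "area"), ("name", "agent-name"),
         ("location", "agent-address"), ("price*(1+fee-rate)", "list-price")}

precision, recall, f1, overall = match_quality(found, gold)
print(precision, recall, f1, overall)
```

Overall becomes negative as soon as more than half of the proposed matches are wrong, that is, when correcting the output would take more effort than matching by hand.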


75

Benchmarking

Do, Melnik, and Rahm (2003) found that evaluation studies were not comparable

Need more standardized conditions (benchmarks)

Since 2004: competitions in ontology matching (more in the next session):

Test cases and contests at http://www.ontologymatching.org/evaluation.html



76

Example: Tasks 2009 (various are re-used; 2013 is just out)

(excerpt; from http://oaei.ontologymatching.org/2009/); latest completed run at http://oaei.ontologymatching.org/2013/

Expressive ontologies

anatomy: The anatomy real world case is about matching the Adult Mouse Anatomy (2744 classes) and the NCI Thesaurus (3304 classes) describing the human anatomy.

conference: Participants will be asked to find all correct correspondences (equivalence and/or subsumption correspondences) and/or 'interesting correspondences' within a collection of ontologies describing the domain of organising conferences (the domain being well understandable for every researcher). Results will be evaluated a posteriori in part manually and in part by data-mining techniques and logical reasoning techniques. There will also be evaluation against reference mapping based on subset of the whole collection.

Directories and thesauri

fishery gears: features four different classification schemes, expressed in OWL, adopted by different fishery information systems in FIM division of FAO. An alignment performed on this 4 schemes should be able to spot out equivalence, or a degree of similarity between the fishing gear types and the groups of gears, such to enable a future exercise of data aggregation cross systems.

Oriented matching

This track focuses on the evaluation of alignments that contain other mapping relations than equivalences.

Instance matching

very large crosslingual resources: The purpose of this task (vlcr) is to match the Thesaurus of the Netherlands Institute for Sound and Vision (called GTAA, see below for more information) to two other resources: the English WordNet from Princeton University and DBpedia.

http://oaei.ontologymatching.org/2009/

http://oaei.ontologymatching.org/2009/anatomy/

http://oaei.ontologymatching.org/2009/conference/

http://people.kmi.open.ac.uk/claudio/OAEI2009.html

http://oaei.ontologymatching.org/2009/vlcr/


77

Mice and humans

The anatomy real world case is about matching the Adult Mouse Anatomy (2744 classes) and the NCI Thesaurus (3304 classes) describing the human anatomy.

(http://oaei.ontologymatching.org/2008/anatomy/)



78

Matching task and evaluation approach (http://oaei.ontologymatching.org/2007/anatomy/)

We would like to gratefully thank Martin Ringwald and Terry Hayamizu (Mouse Genome Informatics - http://www.informatics.jax.org/), who provided us with a reference mapping for these ontologies.

The reference mapping contains only equivalence correspondences between concepts of the ontologies. No correspondences between properties (roles) are specified.

If your system also creates correspondences between properties or correspondences that describe subsumption relations, these results will not influence the evaluation (but can nevertheless be part of your submitted results).

The results of your matching system will be compared to this reference alignment. Therefore, all of the results have to be delivered in the format specified here.

http://oaei.ontologymatching.org/2006/align.html


79

Matching task and evaluation approach (http://oaei.ontologymatching.org/2011/oriented/index.html)

“An increasing number of matchers are now capable of deriving mapping relations other than equivalence relations, such as subsumption, disjointness or named relations.

This is a necessity given that we need to compute alignments between ontologies at different granularity levels or between ontologies that elaborate on non-equivalent elements. The evaluation of such mappings was addressed already in OAEI (2009) Oriented Matching track. […]

The track aims also to report on evaluation methods and measures for subsumption mappings, in conjunction to the computation of equivalence mappings.

Targeting these goals, we have built new benchmark datasets that are described below.”



80

(Some) results (http://oaei.ontologymatching.org/2009/results/anatomy/)


81

(Some) results (http://oaei.ontologymatching.org/2013/results/anatomy/)





83

Outlook





Evaluating matching



84

Example in iMAP

User sees ranked candidates:

1. List-price = price

2. List-price = price * (1 + fee-rate)

Explanation:

a) Both generated from numeric searcher, 2 ranked higher than 1

b) But:

c) Match month-posted = fee-rate

d) domain constraint: matches for month-posted and price do not share attributes

e) cannot match list-price to anything to do with fee-rate

f) Why c)?

g) Data instances of fee-rate were classified as of type date

User corrects this wrong step f), the rest is repaired accordingly


85

Background knowledge structure for explanation: dependency graph


86

MOBS: Using mass collaboration to automate data integration

1. Initialization: a correct but partial match (e.g. title = a1, title = b2, etc.)

2. Soliciting user feedback: user query → user must answer a simple question → user gets answer to initial query

3. Computing user weights (e.g., trustworthiness = fraction of correct answers to known mappings)

4. Combining user feedback (e.g., majority count)

Important: „instant gratification“ (e.g., include the new field in the results page after a user has given helpful input)
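Steps 3 and 4 could be combined as a weighted vote, as in the following sketch. The weighting scheme (each vote counts with the user's trustworthiness), the user names and all answers are assumptions for illustration; the slide itself only names trustworthiness and majority count as examples.

```python
from collections import defaultdict


def user_weight(answers, known):
    """Trustworthiness: fraction of a user's answers on known mappings that are correct."""
    judged = [q for q in answers if q in known]
    if not judged:
        return 0.0
    return sum(answers[q] == known[q] for q in judged) / len(judged)


def combine(all_answers, known, question):
    """Weighted vote over the users' answers to one open matching question."""
    votes = defaultdict(float)
    for answers in all_answers.values():
        if question in answers:
            votes[answers[question]] += user_weight(answers, known)
    return max(votes, key=votes.get) if votes else None


# Known (initial, correct) part of the match, and hypothetical user answers.
known = {"title": "a1", "author": "a2"}
all_answers = {
    "user1": {"title": "a1", "author": "a2", "price": "a3"},   # weight 1.0
    "user2": {"title": "a1", "author": "a5", "price": "a4"},   # weight 0.5
    "user3": {"title": "a9", "author": "a7", "price": "a4"},   # weight 0.0
}

print(combine(all_answers, known, "price"))
```

In this example a plain majority count would pick a4 (two of three users), whereas the weighted vote follows the single user who answered all known mappings correctly.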


87

Task for next week (from http://opendefinition.org/)

Do you see a statement in this definition that does not appear substantiated?

Can you give 3 reasons why it may be true?

Can you give 3 reasons why it may be false?


88

.... which stands in some relation with these claims ...

„An interesting outcome of our algorithm is that it identifies inconsistencies and possible errors in the linked data, and provides a method for automatically curating the Linked Data Cloud.”


89

Outlook





Evaluating matching


Invited lecture Aad Versteden (Tenforce)


90

References / background reading; acknowledgements

Rahm, E. & Bernstein, P.A. (2001). A survey of approaches to automatic schema matching. The VLDB Journal, 10, 334-350.

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.16.700

Doan, A. & Halevy, A.Y. (2004). Semantic Integration Research in the Database Community: A brief survey. AI Magazine.

http://dit.unitn.it/~p2p/RelatedWork/Matching/si-survey-db-community.pdf

Sven Hertling, Heiko Paulheim. WikiMatch - using Wikipedia for ontology matching. In Proc. of the Seventh International Workshop on Ontology Matching, 2012. http://www.dit.unitn.it/~p2p/OM-2012/om2012_Tpaper4.pdf

Madhavan, J., Bernstein, P.A., Rahm, E. (2001). Generic Schema Matching with Cupid. In Proc. of the 27th VLDB Conference.

http://dbs.uni-leipzig.de/de/publication/title/generic_schema_matching_with_cupid

Dhamankar, R., Lee, Y., Doan, A., Halevy, A., & Domingos, P. (2004). iMAP: Discovering complex semantic matches between database schemas. In Proc. of SIGMOD 2004.


P. Shvaiko, J. Euzenat: A Survey of Schema-based Matching Approaches. Journal on Data Semantics, 2005.

http://www.dit.unitn.it/~p2p/RelatedWork/Matching/JoDS-IV-2005_SurveyMatching-SE.pdf

pp. 50ff.: Bizer, C., Cyganiak, R., & Heath, T. (2007). How to Publish Linked Data on the Web. Chapter 6. How to set RDF Links to other Data Sources. http://wifo5-03.informatik.uni-mannheim.de/bizer/pub/LinkedDataTutorial/#links

pp. 55ff.: Rahul Parundekar, Craig A. Knoblock, and José Luis Ambite. Discovering concept coverings in ontologies of linked data sources. In Proceedings of the 11th International Semantic Web Conference (ISWC 2012), pp. 427–443, Boston, Mass., 2012. http://iswc2012.semanticweb.org/sites/default/files/76490417.pdf

Do, H.-H., Melnik, S., & Rahm, E. (2003). Comparison of schema matching evaluations. In Web, Web-Services, and Database Systems: NODe 2002, Web- and Database-Related Workshops, Erfurt, Germany, October 7-10, 2002. Revised Papers (pp. 221-237). Springer.

http://dit.unitn.it/~p2p/RelatedWork/Comparison%20of%20Schema%20Matching%20Evaluations.pdf

McCann, R., Doan, A., Varadarajan, V., & Kramnik, A. (2003). Building data integration systems via mass collaboration. In Proc. International Workshop on the Web and Databases (WebDB).


Please see the Powerpoint slide-specific „notes“ for URLs of used pictures and formulae



