p. 1

Topic Maps Exchange in the Absence of Shared Vocabularies

TMRA'05 International Workshop on Topic Maps Research and Applications, 06.10.2005

Lutz Maicher, University of Leipzig (maicher@informatik.uni-leipzig.de)


p. 2


Topic Maps Exchange = Retrieval Task

1. Subject Proxies are created in a remote environment.

2. A requesting peer requests further information about a Subject of interest.

3. The requested peers have to decide whether a Subject Proxy indicating an identical Subject is available.

4. Requested peers send a fragment to the requesting peer.

5. The requesting peer has to merge in the requested fragments.

[Figure: one requesting peer and two requested peers; diagram labels "?" and "none".]
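The five steps above can be read as a small retrieval loop. The following is a minimal sketch only, assuming Subject Proxies are modelled as dictionaries of property sets; the names `Peer`, `answer`, `merge`, and `exchange` are hypothetical and do not come from the talk, and the equality decision is passed in as a function because the slides leave it open at this point.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set

# Assumption: a Subject Proxy is a mapping from property name to a set of values.
Proxy = Dict[str, Set[str]]
EqualityDecision = Callable[[Proxy, Proxy], bool]


@dataclass
class Peer:
    """A requested peer holding Subject Proxies created in its own environment (step 1)."""
    proxies: List[Proxy] = field(default_factory=list)

    def answer(self, wanted: Proxy, same_subject: EqualityDecision) -> Optional[Proxy]:
        """Step 3 and 4: decide whether a matching proxy exists and return it as a fragment."""
        for proxy in self.proxies:
            if same_subject(wanted, proxy):
                return proxy
        return None  # corresponds to the answer "none"


def merge(target: Proxy, fragment: Proxy) -> Proxy:
    """Step 5: merge a received fragment into the requesting peer's proxy."""
    merged = {key: set(values) for key, values in target.items()}
    for key, values in fragment.items():
        merged.setdefault(key, set()).update(values)
    return merged


def exchange(wanted: Proxy, peers: List[Peer], same_subject: EqualityDecision) -> Proxy:
    """Step 2 to 5: ask every requested peer and merge in whatever comes back."""
    result = wanted
    for peer in peers:
        fragment = peer.answer(wanted, same_subject)
        if fragment is not None:
            result = merge(result, fragment)
    return result
```

Everything interesting happens inside `same_subject`: the rest of the talk is about how that decision can be made when the peers share no vocabulary.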

p. 3


Enterprise Information Integration


Source: Taylor, John: Thoughts from the Integration Consortium: Enterprise Information Integration: A New Definition, DM Review Online, (9,2004).

p. 4


Existing Approaches to Topic Maps Exchange

● TMRAP – Topic Maps Remote Access Protocol
● TMIP – the RESTful Topic Maps Interaction Protocol (formerly: Federated Topic Maps)
● SHARK (alternatively: Knowledge Port Approach)
● TMShare
● All of them are based on the TMDM: if distributed peers do not use a common vocabulary (PSIs), the exchange fails completely.

p. 5


Semantics in Topic Maps

● Topic Maps are a semantic technology ... only in the perspective of information integration
  – "Subject Proxies indicating identical Subjects have to be viewed as merged ones"
● A Subject Map Disclosure (SMD) discloses:
  – SMD ontology
    ● implies the Subject Indication Approach
  – Subject Equality Decision Approach
    ● defines the semantics of the given Subject Proxies with respect to the functionality of holding the Co-Location objective true
  – Subject Viewing Approach

p. 6


How is Subject Equality detected?

Subject Equality_SMDi(
    Subject Indication_SMD1(Subject Identity_Subject Stage1),
    Subject Indication_SMD2(Subject Identity_Subject Stage2))

● Subject Equality = both Subject Proxies indicate identical Subjects, governed by the Subject Equality Decision Approach of SMDi
● Subject Identity is indicated as governed by the Subject Indication Approach of SMD1
● Subject Identity under the integration perspective? Subject Identity_integration perspective(Subject Stage1, Subject Stage2)

p. 7


How is Subject Equality really detected?

Subject Equality_SMDi(
    Subject Indication_SMD1,
    Subject Indication_SMD2,
    Subject Map Subject Proxy1,
    Subject Map Subject Proxy2) -> true | false

● Subject Equality = both Subject Proxies indicate identical Subjects, governed by the Subject Equality Decision Approach of SMDi

[Figure labels: Subject Indication_SMD1, Subject Indication_SMD2, each marked with "?".]

p. 8


Possible Subject Equality Approaches of a SMD

Referential Subject Equality Approach
[A reference to a discrete 'object' indicates the intended Subject.]
 - Subject Proxy 1 indicates its Subject by pointing to it with S1
 - Subject Proxy 2 indicates its Subject by pointing to it with S2
 - Subject Equality holds if S1 = S2

Structuralist Subject Equality Approach
[The Subject depends on other Subject Proxies of the Subject Map.]
 - Subject Proxy 1 indicates its Subject through a set of Subject Proxies s1
 - Subject Proxy 2 indicates its Subject through a set of Subject Proxies s2
 - Subject Equality holds if s1 = s2 (or s1 is similar to s2)

Meaning (semantics) in linguistics

referential semantics: The meaning of a word is defined by the object it refers to.

structuralist semantics: The meaning of a word is defined by its usage in the language.

The different Approaches to Subject Equality define the semantics of the used vocabulary at the time of the Subject Equality Decision.
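The contrast between the two approaches can be made concrete with a short sketch. This is an illustration under assumptions, not the method from the talk: references are plain strings, the "set of Subject Proxies" is a set of labels, and "similar" is approximated with Jaccard overlap, which is a technique swapped in here because the slide does not name one. The function names and the example values are hypothetical.

```python
from typing import FrozenSet


def referential_equal(s1: str, s2: str) -> bool:
    """Referential approach: equality holds only if both references are identical."""
    return s1 == s2


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Share of common elements; stands in for the unspecified 'similar' relation."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def structuralist_equal(s1: FrozenSet[str], s2: FrozenSet[str], threshold: float = 1.0) -> bool:
    """Structuralist approach: compare the surrounding proxies instead of one reference.

    threshold=1.0 corresponds to s1 = s2; lower values allow 's1 similar to s2'.
    """
    return jaccard(s1, s2) >= threshold


# Two peers describing the same city with different, unshared identifiers (made up).
ref_a = "http://example.org/psi/leipzig"
ref_b = "http://example.com/subjects/leipzig"
context_a = frozenset({"city", "saxony", "university"})
context_b = frozenset({"city", "saxony", "fair"})
```

With these values the referential decision fails because the two strings differ, while the structuralist decision can still succeed once the threshold is relaxed to the observed overlap of 0.5.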

p. 9


Absence of Shared Vocabularies

Topic Map Processing Application

Subject Map Disclosure ontology

Subject Map ontology

Subject Map Vocabulary

Subject Map Disclosure (SMD)

Structuralist Subject Equality Decision

Referential Subject Equality Decision

Referential Subject Equality Decision

p. 10


Towards a SMDSIM

Topic Map Processing Application

Subject Map Disclosure ontology

Subject Map ontology

Subject Map vocabulary

Subject Map Disclosure (SMD)

Structuralist Subject Equality Decision

Referential Subject Equality Decision

Structuralist Subject Equality Decision

p. 11


Subject Similarity Measure (SIM)

● SIM – similarity of the Subjects of two different Topics
● Procedure: a Subject available in Topic Map TM2 will be requested from Topic Map TM1
  – extract a Topic Map Fragment (F) from TM2 around the Topic representing the Subject
  – for each pair (T1, T2) from TM1, F
    ● determine the simDNAtype for each pair
    ● calculate the simDNA for each pair
    ● calculate the simDNA a second time, using the similarity detected in the first step
    ● simDNA'(T1,T2) = sum of digits (simDNA(T1,T2))
  – Subject Equality (T1,T2) -> (max simDNA'(T1,T2)) and (simDNA(T1,T2)) > threshold
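The last two steps of the procedure (digit sum, then maximum above a threshold) can be sketched as follows. This is a hedged illustration: it assumes that an `X` in a simDNA string contributes nothing to the sum and that the threshold is applied to the digit sum simDNA'; neither is stated on the slide. The function names and the threshold value in the usage note are hypothetical.

```python
from typing import Dict, Optional


def digit_sum(sim_dna: str) -> int:
    """simDNA' = sum of the digits of a simDNA string.

    Assumption: non-digit positions such as 'X' are skipped.
    """
    return sum(int(ch) for ch in sim_dna if ch.isdigit())


def most_similar(sim_dna_by_topic: Dict[str, str], threshold: int) -> Optional[str]:
    """Return the candidate Topic with the maximal simDNA', if it exceeds the threshold."""
    if not sim_dna_by_topic:
        return None
    best = max(sim_dna_by_topic, key=lambda topic: digit_sum(sim_dna_by_topic[topic]))
    if digit_sum(sim_dna_by_topic[best]) > threshold:
        return best
    return None
```

For example, given the simDNA strings `X111`, `01X1`, `21X1` and `21X3`, the digit sums are 3, 2, 4 and 6, so with an assumed threshold of 5 only the candidate with `21X3` is accepted.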

p. 12


simDNAtype

TMDM

(0..*) Source Locator [Locator Item]
(0..1) Subject Locator [Locator Item]
(0..1) Subject Identifier [Locator Item]
(0..*) Topic Names [Topic Name Item]
    (0..*) Source Locator [Locator Item]
    (0..1) Type [Topic Item]
    (0..*) Scope [Topic Item]
    (1) Value [String]
    (0..*) Variants [Variant Items]
        (0..*) Source Locators [Locator Item]
        (0..*) Scope [Topic Item]
        (0..1) Value [String]
        (0..1) Resource [Locator Item]
(0..*) Occurrences [Occurrence Item]
    (0..*) Source Locators [Locator Item]
    (0..1) Type [Topic Item]
    (0..*) Scope [Topic Item]
    (0..1) Value [String]
    (0..1) Resource [Locator Item]
(0..*) rolesPlayed [Association Role Item]
    (0..1) Type [Topic Item]
    (1) Parent [Association Item]

simDNAType

/x*y*z*w*s*1*2*3*t*n*(o)*[a]*/

x – the current Topic is typing a Topic
y – the current Topic is typing an Association
z – the current Topic is typing a Topic Characteristic
w – the current Topic is typing an Association Role
s – the current Topic is scoping a Topic Characteristic
1 – the current Topic has a Source Locator
2 – the current Topic has a Subject Locator
3 – the current Topic has a Subject Identifier
t – the current Topic is typed
n – the current Topic has a TopicName
o – the current Topic has an Occurrence
    o => /(v|l)t?s*/ (OccDNAtype)
a – the current Topic takes part in an Association
    a => /a(tp)*/ (AssDNAtype)
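The simDNAType pattern can be checked mechanically. The sketch below assumes that the parentheses around an Occurrence and the square brackets around an Association are literal characters of the string (as in type strings such as `13tnn(vs)(lt)(vts)[atptp]`), with the OccDNAtype and AssDNAtype patterns substituted inside them. That reading is an interpretation, and the names `SIM_DNA_TYPE` and `is_valid_sim_dna_type` are hypothetical.

```python
import re

# OccDNAtype: value (v) or locator (l), optionally typed (t), any number of scopes (s),
# wrapped in literal parentheses.
OCC_DNA_TYPE = r"\((?:v|l)t?s*\)"

# AssDNAtype: 'a' followed by any number of (t, p) pairs, wrapped in literal brackets.
ASS_DNA_TYPE = r"\[a(?:tp)*\]"

# Fixed symbol order from the slide: x y z w s 1 2 3 t n, then occurrences, then associations.
SIM_DNA_TYPE = re.compile(
    rf"x*y*z*w*s*1*2*3*t*n*(?:{OCC_DNA_TYPE})*(?:{ASS_DNA_TYPE})*"
)


def is_valid_sim_dna_type(value: str) -> bool:
    """True if the whole string follows the simDNAType symbol order."""
    return SIM_DNA_TYPE.fullmatch(value) is not None
```

Because the symbol order is fixed, a string such as `n1` is rejected: the Topic Name symbol may not precede the Source Locator symbol.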

p. 13


simDNA – 1. Iteration


Example

simDNAtype(T1) = x13tn
    x – the current Topic is typing a Topic
    1 – the current Topic has a Source Locator
    3 – the current Topic has a Subject Identifier
    t – the current Topic is typed
    n – the current Topic has a Topic Name

simDNA(T1,T2) = 01XX1
    0 – T2 types an Association
    1 – T2 has a Source Locator
    X – T2 has no Subject Identifier
    X – T2 is not typed
    1 – T2 has a Topic Name, which is not similar

simDNA(T1,T3) = 21113
    2 – T3 types a Topic
    1 – T3 has a Source Locator
    1 – T3 has a Subject Identifier
    1 – T3 is typed
    3 – T3 has a Topic Name, which is a "bit" similar

p. 14


simDNA – 2. Iteration


Example (same Topics as in the 1. Iteration)

simDNAtype(T1) = x13tn

simDNA(T1,T2) = 01XX1 (unchanged)

simDNA(T1,T3) = 21133
    2 – T3 types a Topic
    1 – T3 has a Source Locator
    1 – T3 has a Subject Identifier
    3 – T3 is typed, and the typing Topic is similar
    3 – T3 has a Topic Name, which is a "bit" similar

p. 15


SIM - Example

simDNAtype                 Topic                         simDNAtype  simDNA  Topic                   Similar
13n                        Beispiel.xtm#TMStandards      z13n        X111    t_source.xtm#t_source   false
xx1n                       Beispiel.xtm#t_person         z13n        01X1    t_source.xtm#t_source   false
z1n                        Beispiel.xtm#t_introduction   z13n        21X1    t_source.xtm#t_source   false
zz1n                       Beispiel.xtm#t_homepage       z13n        21X1    t_source.xtm#t_source   false
s1n                        Beispiel.xtm#t_en             z13n        X1X1    t_source.xtm#t_source   false
s1n                        Beispiel.xtm#t_de             z13n        X1X1    t_source.xtm#t_source   false
x1n                        Beispiel.xtm#t_requirements   z13n        01X1    t_source.xtm#t_source   false
ss1n                       Beispiel.xtm#t_nickname       z13n        X1X1    t_source.xtm#t_source   false
13n                        Beispiel.xtm#t_sort           z13n        X111    t_source.xtm#t_source   false
z1n                        Beispiel.xtm#t_source         z13n        21X3    t_source.xtm#t_source   true
y1nnn                      Beispiel.xtm#at_authorship    z13n        01X1    t_source.xtm#t_source   false
ws1n                       Beispiel.xtm#art_author       z13n        01X1    t_source.xtm#t_source   false
ws1n                       Beispiel.xtm#art_document     z13n        01X1    t_source.xtm#t_source   false
13tnn(vs)(lt)(vts)[atptp]  Beispiel.xtm#M1               z13n        X111    t_source.xtm#t_source   false
13tnn(lt)                  Beispiel.xtm#M2               z13n        X111    t_source.xtm#t_source   false
12tn(lt)[atptp]            Beispiel.xtm#RA1              z13n        X1X1    t_source.xtm#t_source   false

p. 16


SIM - Assessment

● Self-Assessment
  – take each Topic from the Topic Map,
  – create a (randomly pruned) fragment around the Topic, and
  – request the Topic Map.
  – pruning probabilities
    ● probType - of the Type of the Topic
    ● probTopNam - of the whole Topic Name
    ● probAss - of the Association in which the Topic plays a role
    ● probOcc - of an Occurrence (and all of its properties)
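The random pruning step of the self-assessment can be sketched like this. The data model is an assumption made for illustration: a fragment is reduced to one optional type plus flat lists of names, associations and occurrences, and each element is dropped independently with the corresponding probability. `Fragment` and `prune` are hypothetical names; only the four probability parameters mirror the slide.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Fragment:
    """Simplified fragment around one Topic (assumed structure, not the TMDM)."""
    topic_type: Optional[str] = None
    names: List[str] = field(default_factory=list)
    associations: List[str] = field(default_factory=list)
    occurrences: List[str] = field(default_factory=list)


def prune(fragment: Fragment, prob_type: float, prob_top_nam: float,
          prob_ass: float, prob_occ: float, rng: random.Random) -> Fragment:
    """Return a copy of the fragment with elements removed at random.

    Each probability is the chance that an element of that kind is pruned.
    """
    def keep(prob: float) -> bool:
        # rng.random() lies in [0, 1): prob=0 always keeps, prob=1 never keeps.
        return rng.random() >= prob

    topic_type = fragment.topic_type if keep(prob_type) else None
    return Fragment(
        topic_type=topic_type,
        names=[name for name in fragment.names if keep(prob_top_nam)],
        associations=[assoc for assoc in fragment.associations if keep(prob_ass)],
        occurrences=[occ for occ in fragment.occurrences if keep(prob_occ)],
    )
```

Passing a seeded `random.Random` instance keeps an assessment run repeatable, which matters when results for different probability settings are compared.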

p. 17


SIM - (Self-)Assessment

p. 18


Besides the TMDM Subject Equality Approach

[Figure: overview diagram. Recoverable labels: Syntax; Data Model (Graph); Referential Subject Equality; Structuralist Subject Equality; semantics as relative value; semantics as absolute value; O(n*n); O(n*log(n)); Sowa's Knowledge Signature.]

● bound to SM ontology
  – simpleSIM
  – yields very good results in restricted domains
  – usage of Topic is ignored
● bound to TMV vocabulary
● bound to SMD ontology
  – SIM (bound to TMDM)
  – more generic, yields good results
  – usage of Topic is exploited
● bound to TMRM
  – adoption of Melnik's Similarity Flooding Approach
  – not suitable for the usage scenario, but for SM ontology matching
● bound to TMA ontology
  – work to do

Subject Equality_SMDi(
    Subject Indication_SMD1,
    Subject Indication_SMD2,
    Subject Map Subject Proxy1,
    Subject Map Subject Proxy2) -> true | false

How can a SMDSIM be defined? How can a deterministic Subject Indication Approach be defined?

p. 19


Discussion