My slides at the Consuming Linked Data workshop (COLD2014) at ISWC2014
+Walking Linked Data: a graph traversal approach to explain clusters
Ilaria Tiddi, Mathieu d’Aquin, Enrico Motta
+Problem: explaining patterns
Data: women/men literacy rate from UNESCO [1]
In which countries are men more educated than women?
The yellow countries.
[Map legend. Education: Men / Women / Equal]
How do you know?
+Problem: explaining behaviors
We explain thanks to our own (background) knowledge.
Can we do the same with the knowledge from Linked Data?
+Linked Data contain explanations
but where?
[Diagram, built up over five slides: the dataset entities :India, :UK, :Ethiopia, :US and :Somalia are each linked by sameAs to their db: counterparts (db:India, db:UK, db:Ethiopia, db:US, db:Somalia). From those nodes, dc:subject edges lead to categories such as db:Category:LeastDevelopedCountries and db:Category:LiberalCountries, and skos:relatedMatch edges also appear. dbp:gdp edges lead to the values 600/pp, 1,200/pp, 3,800/pp, 36,000/pp and 49,000/pp, which are then compared against thresholds with ≤ and ≥.]
+Looking for explanations in the graph
Given a graph of Linked Data where URIs are nodes and RDF properties are edges
[Diagram: :India, :Ethiopia and :Somalia are followed through sameAs, then through dc:subject to cat:LeastDevelopedCountries (with skos:related edges) and through dbp:gdp to values ≤ 4,000/pp.]
Find the ending value most pointed to by the entities in the cluster
Find the best path to follow in order to further expand the graph
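The first goal on this slide, finding the end value that most entities of the cluster point to, can be sketched as follows. This is only an illustration under stated assumptions: the graph is a small in-memory dict, its contents are invented toy data, and the names `GRAPH`, `walk` and `most_pointed_value` are hypothetical and not taken from Dedalo.

```python
from collections import Counter

# Hypothetical toy graph (invented data): node -> list of (property, target).
GRAPH = {
    ":India": [("sameAs", "db:India")],
    ":Ethiopia": [("sameAs", "db:Ethiopia")],
    ":Somalia": [("sameAs", "db:Somalia")],
    "db:India": [("dc:subject", "cat:SouthAsianCountries")],
    "db:Ethiopia": [("dc:subject", "cat:LeastDevelopedCountries")],
    "db:Somalia": [("dc:subject", "cat:LeastDevelopedCountries")],
}


def walk(graph, start, path):
    """Return the set of nodes reached from `start` by following `path`."""
    frontier = {start}
    for prop in path:
        frontier = {
            target
            for node in frontier
            for p, target in graph.get(node, [])
            if p == prop
        }
    return frontier


def most_pointed_value(graph, cluster, path):
    """Return (value, count) for the end value reached by most entities."""
    counts = Counter()
    for entity in cluster:
        counts.update(walk(graph, entity, path))
    return counts.most_common(1)[0] if counts else None
```

Calling `most_pointed_value(GRAPH, [":India", ":Ethiopia", ":Somalia"], ["sameAs", "dc:subject"])` on this toy data returns the category shared by two of the three entities, together with that count.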
+A* algorithm for Linked Data
Best-first search algorithm
Given an initial node and a final node, find the least expensive path between them
Path cost function: f(path) = actual cost g(path) + estimated future cost h(path)
Without knowledge of the graph
Search the graph for the best path and explanation
The graph is built iteratively by URI dereferencing
No need to know the Linked Data graph a priori
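For reference, a generic textbook A* over an explicit cost function might look like the sketch below. It is not Dedalo's implementation: the function name `a_star`, the `neighbors` callback and the heuristic `h` are assumptions made for illustration. The callback is only invoked when a node is expanded, which mirrors the idea that the graph does not have to be known a priori.

```python
import heapq
from itertools import count


def a_star(start, goal, neighbors, h):
    """Return (cost, path) of a cheapest path from start to goal, or None.

    neighbors(node) yields (next_node, edge_cost) pairs and is called only
    when a node is expanded. h(node) estimates the remaining cost and must
    never overestimate for the result to be guaranteed cheapest.
    """
    tie = count()  # tie-breaker so the heap never compares nodes or paths
    # Heap entries: (f = g + h, tie, g, node, path so far)
    frontier = [(h(start), next(tie), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node exists
        for nxt, cost in neighbors(node):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + h(nxt), next(tie), new_g, nxt, path + [nxt]),
                )
    return None
```

With `h` returning 0 everywhere the search degrades to uniform-cost search; a more informed estimate only changes the order in which nodes are expanded.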
+Dedalo: an A* process for Linked Data
Building the graph (URI dereferencing)
Choosing the best path
Finding the best explanation
Iteratively building a Linked Data graph and looking for an explanation of the pattern
+Dedalo: an A* process for Linked Data
Dereference URIs through HTTP GET
take an entity
read its properties and values
add them to the graph
[Diagram: db:Ethiopia is reached through owl:sameAs; dereferencing it adds the edges db:Ethiopia dc:subject db:Category:AfricanCountries and db:Ethiopia dbp:gdp 1,200 to the graph.]
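The three steps of this slide (take an entity, read its properties and values, add them to the graph) can be sketched as below. To keep the example runnable offline, the HTTP GET is replaced by an injected `fetch` callable; `dereference`, `fake_fetch` and `FAKE_WEB` are hypothetical names, and the canned data only repeats the triples shown on the slide.

```python
from collections import defaultdict


def dereference(uri, fetch, graph):
    """Take an entity, read its (property, value) pairs, add them to graph.

    fetch(uri) stands in for an HTTP GET plus RDF parsing and must return
    an iterable of (property, value) pairs. Returns the values that were
    not yet attached to `uri` in the graph.
    """
    new_values = []
    for prop, value in fetch(uri):
        if (prop, value) not in graph[uri]:
            graph[uri].add((prop, value))
            new_values.append(value)
    return new_values


# Hypothetical stand-in for the Web: canned descriptions keyed by URI.
FAKE_WEB = {
    ":Ethiopia": [("owl:sameAs", "db:Ethiopia")],
    "db:Ethiopia": [
        ("dc:subject", "db:Category:AfricanCountries"),
        ("dbp:gdp", "1,200"),
    ],
}


def fake_fetch(uri):
    return FAKE_WEB.get(uri, [])


# Build the graph iteratively, starting from one entity of the dataset.
graph = defaultdict(set)
frontier = [":Ethiopia"]
while frontier:
    # A real crawler would skip literals such as "1,200"; here they simply
    # yield no new edges because fake_fetch knows nothing about them.
    frontier.extend(dereference(frontier.pop(), fake_fetch, graph))
```

A real `fetch` would issue the HTTP request with an RDF media type in the `Accept` header and parse the response; that part is deliberately left out of this sketch.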
+Dedalo: an A* process for Linked Data
Collect new paths (sequences of edges)
add the new property to the previous path: owl:sameAs → dc:subject, owl:sameAs → dbp:gdp
evaluate new paths with Entropy [1]: ent(owl:sameAs → dc:subject), ent(owl:sameAs → dbp:gdp)
add to the pile of paths (the first one is chosen): owl:sameAs → dc:subject, owl:sameAs → dbp:gdp, owl:sameAs
[1] Tiddi et al., ESWC 2014
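The path bookkeeping on this slide can be sketched as follows. The slide does not give the Entropy formula (it refers to the ESWC 2014 paper), so plain Shannon entropy over the values a path reaches is used here as a stand-in, and the ordering of the pile (lowest score first) is likewise an assumption. The names `extend_path`, `shannon_entropy`, `reached` and `pile`, and the sample values, are hypothetical.

```python
import math
from collections import Counter


def extend_path(path, new_property):
    """Append a newly seen property to an existing path (tuple of edges)."""
    return path + (new_property,)


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical end values reached by each new path from four entities.
reached = {
    extend_path(("owl:sameAs",), "dc:subject"): ["cat:A", "cat:A", "cat:A", "cat:B"],
    extend_path(("owl:sameAs",), "dbp:gdp"): ["600", "1,200", "3,800", "36,000"],
}

# Assumption: the pile is ordered by ascending score and the first path is
# expanded next. The actual measure and ordering are defined in the paper.
pile = sorted(reached, key=lambda path: shannon_entropy(reached[path]))
```

Whatever scoring function is plugged in, the structure stays the same: extend each known path by one property, score the result, and keep the candidates ordered so that the most promising path is dereferenced next.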
+Dedalo: an A* process for Linked Data
Build explanations (path + final nodes)
Each of the values the new path points to gives one explanation:
e1 = owl:sameAs → dc:subject → db:Category:SouthAsianCountries
e2 = owl:sameAs → dc:subject → db:Category:AfricanCountries
Compare numerical values if the property is a datatype property:
e3 = owl:sameAs → dbp:gdp ≥ 1,200
e4 = owl:sameAs → dbp:gdp ≤ 1,200
Rank explanations according to the F-Measure
[Diagram labels: "initial URIs (countries)", "URIs pointing to", "URIs in"]
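Ranking by F-Measure can be illustrated with the standard balanced F1 score, where precision is the share of entities matching an explanation that belong to the cluster, and recall is the share of the cluster that matches. Whether Dedalo uses exactly this balanced variant is an assumption; the function names and the sample sets below are hypothetical.

```python
def f_measure(cluster, matching):
    """Balanced F1 score of one explanation.

    cluster:  entities in the pattern to explain.
    matching: initial entities whose path ends in the explanation's value.
    """
    cluster, matching = set(cluster), set(matching)
    hits = len(cluster & matching)
    if hits == 0:
        return 0.0
    precision = hits / len(matching)
    recall = hits / len(cluster)
    return 2 * precision * recall / (precision + recall)


def rank(explanations, cluster):
    """Sort explanation names by descending F-Measure."""
    return sorted(
        explanations,
        key=lambda name: f_measure(cluster, explanations[name]),
        reverse=True,
    )
```

For example, with a cluster of three entities, an explanation matching two of them and nothing else has precision 1 and recall 2/3, so it outranks an explanation matching one cluster entity plus two outsiders.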
+Dedalo: experiments
Countries where men are more educated than women
skos:exactMatch → dbp:hdiRank ≥ 126                                         87.8%   197″
skos:exactMatch → dc:subject → db:Category:Least_Developed_Countries      74.7%   524″
skos:exactMatch → dbp:gdpPPPPerCapitaRank ≥ 89                              68.3%   269″
Countries where women are more educated than men
skos:exactMatch → dbp:hdiRank ≤ 119                                         63.4%   198″
skos:exactMatch → dbp:gdpPPPPerCapitaRank ≤ 56                              62.3%   236″
Countries where education is equal
skos:exactMatch → dbp:gdpPPPRank ≥ 64                                       62.0%   234″
skos:exactMatch → dbp:gdpPPPPerCapitaRank ≥ 29                              61.0%   268″
+Conclusions and future work
Dedalo, an A* process to search for explanations within Linked Data
From a pattern to explain
Finds the path to the best explanation
Using Entropy and F-Measure
Focusing on the bias introduced by incomplete data [2]
Combining atomic explanations [3]
Evaluating Dedalo on a large use case: Google Trends
[2, 3] Tiddi et al., EKAW 2014
+Thanks! Questions?