22
www.moving-project.eu Mohammad Abdel-Qader 1,2 , Iacopo Vagliano 1 , and Ansgar Scherp 3 1 ZBW – Leibniz Information Centre for Economics 2 Christian-Albrechts-University in Kiel, Germany 3 University of Essex, United Kingdom Analyzing the Evolution of Linked Vocabularies June 14 th , 2019, 19 th ICWE 2019 11.06.2019- 14.06.2019, Daejeon, Korea

Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

www.moving-project.eu

Mohammad Abdel-Qader1,2, Iacopo Vagliano1, and Ansgar Scherp3 1 ZBW – Leibniz Information Centre for Economics

2 Christian-Albrechts-University in Kiel, Germany 3 University of Essex, United Kingdom

Analyzing the Evolution of Linked Vocabularies

June 14th , 2019, 19th ICWE 2019 11.06.2019- 14.06.2019, Daejeon, Korea

Page 2: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

2 of 20

Motivation example 1

Analyzing the Evolution of Linked Vocabularies

schema

voaf

gn

loc

bevon

dcat

igeo

adms

food

oslo person lib

search

idemo

void

Page 3: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

3 of 20

Motivation example 2

Analyzing the Evolution of Linked Vocabularies

22.05.12 25.06.12

22.11.12 27.04.13

24.05.13

24.09.13

16.09.13 22.07.15

V2 V3 V4

V2 V1 V3 V4 V6 V5

21.12.13

adms:SemanticAsset

adms:accessURL

adms:SemanticAsset

adms:accessURL

Need to be updated

12 created

terms

5 modified

terms

57 created terms

7 modified terms

23.02.13

V1

food

adms

Page 4: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

4 of 20

Research Questions

• RQ1: What is the state of the Network of Linked Vocabularies? • Size? • Density, average degree? • Central nodes?

• RQ2: How are vocabulary terms reused by other vocabularies?

• How many? • Most recent ones? • Change impact?

• RQ3: How do ranking metrics, such as PageRank, HITS, and

Centrality, change during the evolution of the Network of Linked Vocabularies? • Important nodes? • Reuse over time?

Analyzing the Evolution of Linked Vocabularies

Page 5: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

5 of 20

Analysis Methodology- Step1

Analyzing the Evolution of Linked Vocabularies

LOV- Linked Open Vocabulary1

636 vocabularies until June 2018

Extract

terms

Own terms

……

.......

……

……

Imported terms

……

.......

……

……

1http://lov.okfn.org/dataset/lov/

Page 6: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

6 of 20

Analysis Methodology- Step2

Analyzing the Evolution of Linked Vocabularies

Density

in-/out degree

Average degree

Centrality

Network diameter

PageRank HITS

Page 7: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

7 of 20

Analysis Methodology- Step3

Analyzing the Evolution of Linked Vocabularies

Imported terms

term1

term2

term3

term4

term5

term6

……

……

Most recent terms

Outdated terms

Page 8: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

8 of 20

Analysis Methodology- Step4

Analyzing the Evolution of Linked Vocabularies

2001 2018

Repeat the methodology

Page 9: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

9 of 20

Results-RQ1 State of NeLO 2018

Analyzing the Evolution of Linked Vocabularies

Measure Value

Nodes 994

Edges 7046

Diameter 12

Density 0.007

Degree 7.089

The full results and scalable version of figures are available at: https://sites.google.com/view/nelo-evolution

Page 10: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

10 of 20

Results-RQ1 State of NeLO 2018

Analyzing the Evolution of Linked Vocabularies

Vocabulary Authority Hub

dcterms 0.305421 0.037978

dce 0.242374 0.037727

foaf 0.234664 0.044112

vann 0.184754 0.045030

skos 0.171827 0.034529

cc 0.113386 0.034723

vs 0.081972 0.040256

voaf 0.080739 0.045920

dctype 0.058152 0.037727

schema.org 0.046659 0.040364

Vocabulary PageRank

dce 0.045954

dcterms 0.027649

skos 0.017678

foaf 0.013986

dcam 0.009152

vann 0.009117

grddl 0.008740

dctype 0.005744

cc 0.005446

vs 0.005005

Top-10 vocabularies for HITS (Hub and Authority) scores in 2018

Top-10 vocabularies for PageRank scores in 2018

Page 11: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

11 of 20

Results-RQ2 Reuse of vocabulary terms

Analyzing the Evolution of Linked Vocabularies

Term Importing vocabularies.

dcterms:modified 281

dcterms:title 276

dce:title 266

dce:creator 263

vann:preferredNamespacePrefix 257

dcterms:description 249

vann:preferredNamespaceUri 241

foaf:Person 175

foaf:name 164

cc:license 122

Top-10 terms that are reused by other vocabularies in 2018.

Page 12: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

12 of 20

Results-RQ2 Reuse of vocabulary terms

Analyzing the Evolution of Linked Vocabularies

0

2

4

6

8

10

12

14

16

1 2 5 10 15 20

Nu

mb

er o

f v

oca

bu

lari

es

Outdated terms imported

The number of vocabularies that reuse outdated terms by the number of outdated terms reused

Page 13: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

13 of 20

Results-RQ2 Reuse of vocabulary terms

Analyzing the Evolution of Linked Vocabularies

123 124

457

828

1,345 1,698

2,435

3,884 5,034

6,816

13,047

19,003

27,157

40,378

52,176

58,059

59,613

62,331

72 68 85

480 555 701 786 877

1,159 1,480

1,993 2,328 2,982

3,781 5,162 5,539 5,786

9,750

1

10

100

1,000

10,000

100,000

Nu

mb

er o

f te

rms

Existing terms

Imported terms

The number of existing terms and the reused terms from other vocabularies

Page 14: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

14 of 20

Results-RQ3 NeLO evolution

Analyzing the Evolution of Linked Vocabularies

11 11

20

44

62 80

113

164 225

297

430 536

639 734

851 926 958 994

30 30

66

155

235 316

468

713

996 1373

2389 3166

4168 4940

5769 6407 6731 7047

1

10

100

1,000

10,000

Val

ue

Node

Edges

Number of nodes and edges in NeLOs

Page 15: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

15 of 20

Results-RQ3 NeLO evolution

Analyzing the Evolution of Linked Vocabularies

2005 2008

2011 2017

The full results and scalable version of figures are available at: https://sites.google.com/view/nelo-evolution

Page 16: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

16 of 20

Results-RQ3 NeLO evolution (In/out degree)

Analyzing the Evolution of Linked Vocabularies

0

10

20

30

40

50

2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018

In-d

egre

e sc

ore

semiointervaloamo

0

50

100

150

200

250

300

Ou

t-d

egre

e sc

ore

vannskosccvs

Page 17: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

17 of 20

Results-RQ3 NeLO evolution

Analyzing the Evolution of Linked Vocabularies

0

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04

0.045

Pag

eRan

k s

core

Snapshot year

skos

dcam

vann

grddl

dctype

The PageRank scores for the top-five vocabularies for each year

Page 18: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

18 of 20

Results-RQ3 NeLO evolution (HITS)

Analyzing the Evolution of Linked Vocabularies

0

0.05

0.1

0.15

0.2

0.25

2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018

Hu

b s

core

vann

skos

cc

0

0.05

0.1

0.15

0.2

2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018

Au

tho

rity

sco

re

vannskosccvs

Page 19: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

19 of 20

Conclusions

• From NeLO’s 2018 density and average degree, we think that there is a need to increase the imports/exports relations between vocabularies.

• Only 16% of the existing terms are reused by the other

vocabularies. • Using recommending systems can help to increase the reuse amount

between vocabularies in order to avoid overlap and redundancy.

• The process of checking for the changes on other vocabularies is usually done manually.

• Many vocabularies are up-to-date in NeLO 2018. • 33 vocabularies are still using outdated terms.

Analyzing the Evolution of Linked Vocabularies

Page 20: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

20 of 20

Conclusions

• There is a lack of tools to notify ontology engineers about changes in the vocabularies.

• The dynamics of changes has slowed down after some fast

evolution between 2001 and 2010.

• Based on the results, the reusing of terms was initially more common.

• Changes in the vocabularies with high PageRank and HITS scores affects many other vocabularies.

Analyzing the Evolution of Linked Vocabularies

More results and scalable version of figures are available at: https://sites.google.com/view/nelo-evolution

Page 21: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

21 of 20

TraininG towards a society of data-saVvy inforMation prOfessionals to enable open leadership INnovation

This work is supported by…

Linked Open Citation Database LOC-DB

AND

Page 22: Analyzing the Evolution of Linked Vocabulariesweb.geni-pco.com/icwe2019/Analyzing_the_Evolution_of_Linked_Vocabularies.pdfAnalyzing the Evolution of Linked Vocabularies June 14th th,

22 of 20

References

• Abdel-Qader, M., Scherp, A., Vagliano, I.: Analyzing the evolution of vocabulary terms and their

impact on the LOD Cloud. In: ESWC. Springer (2018)

• Cardoso, S.D., Pruski, C., Da Silveira, M., Lin, Y.C., Gro, A., Rahm, E., Reynaud- Delaitre, C.:

Leveraging the impact of ontology evolution on semantic annotations. In: European Knowledge

Acquisition Workshop. pp. 68-82. Springer (2016)

• Dividino, R., Gottron, T., Scherp, A.: Strategies for efficiently keeping local linked open data caches

up-to-date. In: ISWC. pp. 356-373. Springer (2015)

• Dos Reis, J.C., Pruski, C., Da Silveira, M., Reynaud-Delaitre, C.: Understanding semantic mapping

evolution by observing changes in biomedical ontologies. Journal of biomedical informatics 47, 71-

82 (2014)

• Ghazvinian, A., Noy, N.F., Jonquet, C., Shah, N., Musen, M.A.: What four million mappings can tell

you about two hundred ontologies. In: ISWC. pp. 229-242. Springer (2009)

• Gottron, T., Gottron, C.: Perplexity of index models over evolving linked data. In: ESWC. pp. 161-75.

Springer (2014)

• Käfer, T., Abdelrahman, A., Umbrich, J., O'Byrne, P., Hogan, A.: Observing linked data dynamics. In:

ESWC. pp. 213{227. Springer (2013)

Analyzing the Evolution of Linked Vocabularies