On the Use of Linked Open Data for Trusting Web Data Davide Ceolin and Valentina Maccatrozzo VU University Amsterdam

On the Use of Linked Open Data for Trusting Web Data · 2019-06-14 · Trusting Web data using subjective logic • We adopt supervised learning algorithms (when a trainingset is

On the Use of Linked Open Data for Trusting Web Data

Davide Ceolin and Valentina Maccatrozzo

VU University Amsterdam

Outline

• Premises

• Introduction

• A Natural History Case Study

• A Cultural Heritage Case Study

• Future directions

• Recap, bibliography, etc.

Premises

• Trust ≈ Reliability.

• We make no assumption about the intentions of the data creator.

• This presentation reflects on past work (see the bibliography) and outlines future directions.

Introduction

• Trust Management: subjective logic (Jøsang, 2001)

• Extends boolean and probabilistic logic.

• Reasoning on “opinions” about propositions based on evidence.

• Accounts for the source and for uncertainty (which decreases as the evidence set grows).

Subjective logic: basics

$\omega^{\text{source}}_{\text{proposition}} = (b, d, u)$

• b + d + u = 1

• b ≈ p(proposition)

• u decreases as the evidence set grows.

• operators: boolean, discounting, fusion...
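The basics above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the usual evidence-to-opinion mapping with a prior weight of 2 and the discounting and consensus (cumulative fusion) operators as defined in Jøsang (2001). The names `Opinion`, `from_evidence`, `discount` and `fuse` are hypothetical.

```python
from dataclasses import dataclass

W = 2.0  # assumed non-informative prior weight


@dataclass(frozen=True)
class Opinion:
    b: float  # belief
    d: float  # disbelief
    u: float  # uncertainty, with b + d + u = 1

    def expected(self, base_rate: float = 0.5) -> float:
        """Expected probability of the proposition: b + a * u."""
        return self.b + base_rate * self.u


def from_evidence(pos: float, neg: float) -> Opinion:
    """Map positive/negative evidence counts to an opinion.

    More evidence means a larger denominator and hence a smaller u.
    """
    total = pos + neg + W
    return Opinion(pos / total, neg / total, W / total)


def discount(trust_in_source: Opinion, source_opinion: Opinion) -> Opinion:
    """Weaken a source's opinion according to our trust in that source."""
    t, s = trust_in_source, source_opinion
    return Opinion(t.b * s.b, t.b * s.d, t.d + t.u + t.b * s.u)


def fuse(x: Opinion, y: Opinion) -> Opinion:
    """Consensus (cumulative fusion) of two independent opinions."""
    k = x.u + y.u - x.u * y.u
    if k == 0:
        raise ValueError("fusion is undefined when both opinions have u = 0")
    return Opinion(
        (x.b * y.u + y.b * x.u) / k,
        (x.d * y.u + y.d * x.u) / k,
        (x.u * y.u) / k,
    )
```

With these definitions, fusing `from_evidence(3, 1)` with `from_evidence(2, 2)` gives the same opinion as `from_evidence(5, 3)`: fusion simply pools the two evidence sets.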

Trusting Web data using subjective logic

• We adopt supervised learning algorithms (when a training set is available).

• Trust is subjective.

• We look for different “opinions” about the data. Subjective logic allows us to handle them.

• First we estimate the data trustworthiness, then we select the “best” data (based, e.g., on author reputation).

Using LOD to assist evidential reasoning

• LOD provide lots of useful data.

• More evidence.

• The distributions used by subjective logic ≈ the distributions of (at least some) LOD datasets (Ceolin et al., 2011).

Photo: flickr.com/clumsyjim

Museums...

...have a problem.

Photo: flickr.com/grrrl

Photo: flickr.com/anirudhkoul

So they recruit some help...

Trusting Museum Annotations

• Museums manage large collections.

• Several museums crowdsource annotations.

• The quality and accuracy of annotations are crucial for their business.

• Can they trust crowdsourced annotations?

A Natural History Case Study

Specimen     User    Annotation   Correct   Tax author
specimen 1   user1   aves xx      ✓         tax author1
specimen 2   user2   aves xyz     ✗         tax author2
specimen 3   user1   aves xz1     ✓         tax author1
specimen 4   user2   aves yy      ✓         tax author1
specimen 5   user3   aves zz      ✗         tax author2

Increased accuracy, from 53% to 82%, on a museum dataset (Ceolin et al., 2010).
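To make the role of the extra column concrete, here is a hedged Python sketch that groups the five example rows by user and by tax author and turns each group's ✓/✗ counts into a (b, d, u) opinion. It is only an illustration of how enrichment adds evidence, not the method of Ceolin et al. (2010); the prior weight of 2 and the helper name `opinions_by` are assumptions.

```python
from collections import defaultdict

W = 2.0  # assumed non-informative prior weight

# (specimen, user, annotation, correct, tax author), copied from the slide.
ROWS = [
    ("specimen 1", "user1", "aves xx", True, "tax author1"),
    ("specimen 2", "user2", "aves xyz", False, "tax author2"),
    ("specimen 3", "user1", "aves xz1", True, "tax author1"),
    ("specimen 4", "user2", "aves yy", True, "tax author1"),
    ("specimen 5", "user3", "aves zz", False, "tax author2"),
]


def opinions_by(rows, key_index):
    """Group rows by one column and map each group's evidence to (b, d, u)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [correct, incorrect]
    for row in rows:
        counts[row[key_index]][0 if row[3] else 1] += 1
    opinions = {}
    for key, (pos, neg) in counts.items():
        total = pos + neg + W
        opinions[key] = (pos / total, neg / total, W / total)
    return opinions


by_user = opinions_by(ROWS, 1)        # evidence grouped by annotating user
by_tax_author = opinions_by(ROWS, 4)  # evidence grouped by the added column

for name, opinion in {**by_user, **by_tax_author}.items():
    print(name, tuple(round(v, 2) for v in opinion))
```

In this toy data, user3 has a single observation and therefore a highly uncertain opinion, while each tax author groups more rows, so the opinions keyed on the added column rest on more evidence.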

Another Case Study

Semantic similarity for weighing evidence.

[Figure: a new annotation ("Tulip") is weighed against the training set (expertise) via semantic similarity; other labels shown: Flower, Red, Pink, Purple, Rose.]

Up to 84% accuracy, 88% precision, and 96% recall on two museum datasets (Ceolin et al., 2013a).
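One way to read "semantic similarity for weighing evidence" is that each past annotation counts as evidence in proportion to its similarity to the new one. The Python sketch below illustrates that idea under stated assumptions: the similarity scores and the accepted/rejected labels are made up for illustration, the prior weight is 2, and `weighted_opinion` and `toy_similarity` are hypothetical names, not the procedure of Ceolin et al. (2013a).

```python
W = 2.0  # assumed non-informative prior weight

# Hypothetical similarity scores in [0, 1]; a real system would derive
# them from a lexical or LOD resource.
TOY_SIM = {
    ("tulip", "rose"): 0.9,
    ("tulip", "flower"): 0.8,
    ("tulip", "red"): 0.2,
    ("tulip", "pink"): 0.2,
    ("tulip", "purple"): 0.1,
}


def toy_similarity(a: str, b: str) -> float:
    """Symmetric lookup in the toy table; identical tags score 1.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get((a, b), TOY_SIM.get((b, a), 0.0))


def weighted_opinion(new_tag, history, similarity):
    """Build (b, d, u) for new_tag from similarity-weighted past evidence.

    history is a list of (tag, accepted) pairs from the training set.
    """
    pos = neg = 0.0
    for tag, accepted in history:
        weight = similarity(new_tag, tag)
        if accepted:
            pos += weight
        else:
            neg += weight
    total = pos + neg + W
    return pos / total, neg / total, W / total


# Hypothetical training set for one annotator (labels are invented).
history = [
    ("rose", True),
    ("flower", True),
    ("red", False),
    ("pink", True),
    ("purple", False),
]

b, d, u = weighted_opinion("tulip", history, toy_similarity)
print(f"b={b:.2f} d={d:.2f} u={u:.2f}")
```

Because the weights are at most 1, weakly related annotations add little evidence and leave the uncertainty high, which is the intended effect of weighing.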

Future work

• We used similar methods (plus other statistical techniques) for analyzing the reliability of UK Police Open Data (Ceolin et al., 2013b).

• We plan to extend them with LOD, e.g. for:

  • geodisambiguation;

  • crime type hierarchies.

Recap

• LOD + evidential reasoning (subjective logic) is a powerful combination for trust (reliability) estimation:

  • enrichment;

  • weighing.

• The more the better, but:

  • evidence quality counts;

  • data needs to be tracked (W3C PROV) and properly managed.

Bibliography

• Jøsang, A. A Logic for Uncertain Probabilities. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 9(3), pp. 279-311, 2001.

• Ceolin, D., van Hage, W.R., Fokkink, W. A Trust Model to Estimate Quality of Annotations using the Web. In WebSci, Web Science Repository, 2010.

• Ceolin, D., van Hage, W.R., Fokkink, W., Schreiber, G. Estimating Uncertainty of Categorical Web Data. In URSW, CEUR-ws.org, 2011.

• Ceolin, D., Nottamkandath, A., Fokkink, W. Semi-automated Assessment of Annotations Trustworthiness. In PST Conference, IEEE, 2013a.

• Ceolin, D., Moreau, L., O'Hara, K., Schreiber, G., Sackley, A., Fokkink, W., van Hage, W.R., Shadbolt, N. Reliability Analyses of Open Government Data. In URSW, CEUR-ws.org, 2013b.

• Ceolin, D., Nottamkandath, A., Fokkink, W. Efficient Semi-automated Assessment of Annotations Trustworthiness. In Journal of Trust Management, Springer (accepted, 2014).