On Content-Based Recommendation and Users Privacy in Social Tagging Systems
Silvia Puglisi
Departament d'Enginyeria Telemàtica
“Research Seminar”, Master in Telematics Engineering, UPC
Barcelona, UPC, 2013

Resource recommendation vs privacy enhancement

Social tagging has opened new possibilities for application interoperability on the semantic web, while at the same time posing new privacy threats. Recommendation and information filtering systems predict users' preferences and provide personalized content, but in doing so they also expose user profiles to possible privacy attacks. Tag suppression and forgery are privacy-enhancing techniques that protect users' privacy to a certain extent, at the cost of semantic accuracy; in other words, privacy is gained at the expense of utility. The impact of tag suppression and forgery on content-based recommendation is hence investigated in a real-world application scenario.

Page 2: Resource recommendation vs privacy enhancement

What is social tagging?

Social tagging is the activity that allows users to assign keywords (tags) to web-based resources.

Page 3: Resource recommendation vs privacy enhancement

Tagging and tags

Tag: a label attached to someone or something for identification or other information

Page 4: Resource recommendation vs privacy enhancement

Scenario

Social tagging enables semantic interoperability in web applications.

Recommendation and information filtering systems have been developed to predict users' preferences.

Users hence reveal their personal preferences on social tagging platforms.

Privacy-enhancing techniques (PETs) have been developed to protect user privacy to a certain extent, at the expense of semantic loss.

Page 5: Resource recommendation vs privacy enhancement

Objective

Taking as a starting point existing research on recommender systems [1] and PETs [2], the objective of this study is to evaluate the impact of two PETs, tag forgery and tag suppression, on the performance of a recommendation system, using real-world application data.

[1] Bellogín, Alejandro, Iván Cantador, and Pablo Castells. "A comparative study of heterogeneous item recommendations in social systems." Information Sciences (2012)

[2] Parra-Arnau, Javier, David Rebollo-Monedero, and Jordi Forné. "A privacy-protecting architecture for collaborative filtering via forgery and suppression of ratings." Data Privacy Management and Autonomous Spontaneus Security (2012): 42-57.

Page 6: Resource recommendation vs privacy enhancement

Dataset

Among the social bookmarking platforms considered, Delicious was identified as representative of applications rich in collaborative tagging information.

Delicious is a social bookmarking platform for web resources.

The Delicious dataset was taken from those made publicly available at the 2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems.

Page 7: Resource recommendation vs privacy enhancement

Delicious

Page 8: Resource recommendation vs privacy enhancement

Techniques: Modelling the User/Item Profile

The simplest approach to model users and items is to count the number of times a tag has been used:

• By a user to annotate different items in the same category.

• Or by the community to annotate the item.

The user/item profile is then described as a histogram of the relative frequencies of tags within a predefined set of categories of interest.
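The histogram construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function name `tag_profile`, the category names, and the sample data are all hypothetical, and each tag assignment is assumed to have already been mapped to one category of interest.

```python
from collections import Counter


def tag_profile(tagged_categories, categories):
    """Relative frequency of tag assignments over a fixed list of categories.

    tagged_categories: one category label per tag assignment (assumed input).
    categories: the predefined categories of interest, in a fixed order.
    """
    counts = Counter(tagged_categories)
    total = sum(counts[c] for c in categories)
    if total == 0:
        # No tag falls in any category of interest: return an all-zero profile.
        return [0.0] * len(categories)
    return [counts[c] / total for c in categories]


# Hypothetical example: three categories, four tag assignments by one user.
categories = ["arts", "science", "sports"]
user_tags = ["science", "science", "arts", "science"]
profile = tag_profile(user_tags, categories)
```

The same function applies to an item profile if `tagged_categories` holds the categories of the tags the community assigned to that item.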

Page 9: Resource recommendation vs privacy enhancement

Techniques: Histogram of a user profile

Page 10: Resource recommendation vs privacy enhancement

Techniques: Privacy Metric

The Kullback-Leibler (KL) divergence has been adopted as the privacy criterion, following Jaynes' rationale on entropy maximization methods.

Since the KL divergence may be regarded as a generalization of the entropy of a distribution, measured relative to another distribution, it is often referred to as relative entropy.
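For two discrete distributions p and q over the same categories, the KL divergence is D(p ‖ q) = Σ p_i log(p_i / q_i). The sketch below computes it in bits and checks the "generalization of entropy" remark numerically: against a uniform reference over n categories, D(p ‖ u) = log2(n) − H(p). Which two profiles are compared is an assumption here; in [2] the user's apparent profile is compared with a reference such as the population's profile. The function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """D(p || q) in bits; terms with p_i == 0 contribute nothing."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                # p puts mass where q has none: the divergence is infinite.
                return math.inf
            total += pi * math.log2(pi / qi)
    return total


def entropy(p):
    """Shannon entropy H(p) in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


# Hypothetical user profile compared with a uniform reference profile.
user = [0.5, 0.25, 0.25]
uniform = [1 / 3] * 3
divergence = kl_divergence(user, uniform)
```

Under this reading, a smaller divergence means the user's profile stands out less from the reference, so maximizing entropy and minimizing divergence from a uniform reference are the same goal.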

Page 11: Resource recommendation vs privacy enhancement

Techniques: Utility Metric

A measure of how useful an item is for a given user is needed.

We can say that an item is useful if its profile is similar to the user's profile.

Hence we need a measure of similarity.

Content-based recommender models are defined as similarity measures between user and item profiles. The measure used here is the cosine similarity between the two profile vectors.
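The standard cosine similarity between a user profile u and an item profile i is their dot product divided by the product of their Euclidean norms, cos(u, i) = (u · i) / (‖u‖ ‖i‖). A minimal sketch follows; the function name and the sample profiles are illustrative, and returning 0 for an all-zero profile is a convention chosen here, not something stated in the slides.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length profile vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention for this sketch: an empty profile is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical profiles over three categories.
user_profile = [0.25, 0.75, 0.0]
item_profile = [0.5, 0.5, 0.0]
score = cosine_similarity(user_profile, item_profile)
```

Because profiles are non-negative histograms, the score lies between 0 (no shared categories) and 1 (same relative tag distribution).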

Page 12: Resource recommendation vs privacy enhancement

Techniques: Performance Metric

The recommender system is evaluated in a content retrieval scenario where a user is provided with a ranked list of N recommended items.

The performance metric adopted is hence one of those commonly used for ranked-list prediction, namely precision at top N.

In the field of Information Retrieval, precision is defined as the fraction of recommended items that are relevant for a target user.
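Precision at top N can be computed directly from a ranked list and a set of items known to be relevant for the user. The sketch below uses hypothetical item identifiers and divides by N, the usual convention, even when fewer than N items are returned.

```python
def precision_at_n(ranked_items, relevant_items, n):
    """Fraction of the top-n recommended items that are relevant."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    top_n = ranked_items[:n]
    hits = sum(1 for item in top_n if item in relevant_items)
    return hits / n


# Hypothetical ranked recommendations and ground-truth relevant items.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "x"}
p_at_3 = precision_at_n(ranked, relevant, 3)
```

In this example two of the top three items ("a" and "c") are relevant, so the precision at 3 is 2/3.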

Page 13: Resource recommendation vs privacy enhancement

Techniques: Tag Forgery and Suppression

Tag suppression and forgery are privacy-enhancing techniques that help users who tag resources online avoid revealing sensitive information to a possible attacker.

Page 14: Resource recommendation vs privacy enhancement

Techniques: Tag Forgery and Suppression Rates

The tag forgery rate is the ratio of forged items.

The tag suppression rate is the proportion of items that the user consents to eliminate.
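One way to see how the two rates act on a profile is the apparent-profile construction used in [2], which this sketch assumes: starting from the genuine profile q, a forgery vector r (summing to the forgery rate ρ) is added, a suppression vector s (summing to the suppression rate σ, with s_i ≤ q_i) is subtracted, and the result is renormalized by 1 + ρ − σ. The function name, the symbols, and the sample numbers are illustrative and are not recovered from the slides.

```python
def apparent_profile(q, r, s):
    """Profile observed by an attacker after forgery r and suppression s.

    Assumes the formulation t = (q + r - s) / (1 + rho - sigma),
    with rho = sum(r) and sigma = sum(s).
    """
    if any(ri < 0 for ri in r):
        raise ValueError("forgery components must be non-negative")
    if any(si < 0 or si > qi for si, qi in zip(s, q)):
        raise ValueError("cannot suppress more than the genuine frequency")
    rho, sigma = sum(r), sum(s)
    scale = 1 + rho - sigma
    if scale <= 0:
        raise ValueError("nothing left to observe after suppression")
    return [(qi + ri - si) / scale for qi, ri, si in zip(q, r, s)]


# Hypothetical genuine profile, with rho = 0.1 and sigma = 0.1.
q = [0.5, 0.3, 0.2]
r = [0.0, 0.0, 0.1]  # forge tags in the least used category
s = [0.1, 0.0, 0.0]  # suppress tags in the most used category
t = apparent_profile(q, r, s)
```

Here the apparent profile is flatter than the genuine one, which is the intended effect when the reference profile is close to uniform.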

Page 15: Resource recommendation vs privacy enhancement

Techniques: The Privacy-Forgery-Suppression Function

Consistent with these rates, the privacy-forgery-suppression function can be defined.
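The definition itself is not preserved in this transcript. Assuming the slide follows the formulation in [2], the function gives the smallest privacy risk (KL divergence from a reference profile p) achievable with forgery rate ρ and suppression rate σ, where q is the user's genuine profile and r, s are the forgery and suppression strategies. The symbols below are taken from that formulation, not from the slides:

```latex
\mathcal{P}(\rho, \sigma) \;=\;
\min_{\substack{r,\, s \\ r_i \ge 0,\ \sum_i r_i = \rho \\ 0 \le s_i \le q_i,\ \sum_i s_i = \sigma}}
\mathrm{D}\!\left( \frac{q + r - s}{1 + \rho - \sigma} \,\middle\|\, p \right)
```

With ρ = σ = 0 the minimization is trivial and the function reduces to D(q ‖ p), the risk of the unprotected profile.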

Page 16: Resource recommendation vs privacy enhancement

Evaluation

Page 17: Resource recommendation vs privacy enhancement

Evaluation: Statistics about the dataset

Categories: 11
Users: 1867
Item-category tuples: 98998
Avg. tags per user: 477.75
Items: 69226
Avg. items per category: 81044
Avg. categories per item: 1.4
Tags per item: 13.06

Page 18: Resource recommendation vs privacy enhancement

Results: Relative Risk Reduction with forgery - Utility

Page 19: Resource recommendation vs privacy enhancement

Results: Relative Risk Reduction with suppression - Utility

Page 20: Resource recommendation vs privacy enhancement

Conclusions

Tag suppression and forgery are simple privacy-enhancing techniques able to protect users' privacy at the cost of some semantic loss.

This study shows, through a simple experimental evaluation in a real-world application scenario, that the performance degradation of a recommender system is small compared with the privacy risk reduction these techniques offer.

Page 21: Resource recommendation vs privacy enhancement

Thank you!