Can Peer-to-Peer File-sharing be of Help for Research Communities? Julita Vassileva, Computer Science Department (MADMUC Lab), University of Saskatchewan


Can Peer-to-Peer File-sharing be of Help for Research Communities?

Julita Vassileva

Computer Science Department (MADMUC Lab)

University of Saskatchewan

2

Outline

• Motivation
• Problems: user participation, trust
• Motivating user participation
  – User modelling
  – Reward with better QoS
  – Social awareness (visualization)
• Ensuring trust
• Conclusions

3

Motivation

• Need a search engine for locally stored papers
  – Web: links disappear, protected sites
  – Hard disks too large
• Why P2P?
  – Harvest the resources of a community of users
  – Advantages of a distributed approach vs. centralized

maranGraphics Inc.

4

What is a P2P System?

GNUTELLA

5

COMTELLA

• A P2P (Gnutella-based) system for file and service sharing
  – users share academic papers, code snippets

• Non-centralized digital library for a research group / class

• Can be downloaded from: http://bistrica.usask.ca/madmuc/news.htm

6

Christopher Cox: NSERC Summer 2002 project

Helen Bretzke: CRA-W and NSERC Summer 2002 project

Lingling Sun: Graduate student

Yamini Upadrashta: Graduate student

Vassileva, J. (2002) Supporting Peer-to-Peer User Communities, in R. Meersman, Z. Tari et al. (Eds.) "On the Move to Meaningful Internet Systems 2002: CoopIS, DOA, and ODBASE", Coordinated International Conferences Proceedings, Irvine, Springer LNCS 2519, 230-247.

Vassileva, J. (2002) Motivating Participation in Peer to Peer Communities, in P. Petta, R. Tolksdorf, F. Zambonelli (Eds.) Engineering Societies in the Agents World III, Proceedings of the 3rd International Workshop ESAW'02, Madrid, Springer LNAI 2577, 141-155.

Bretzke, H., Vassileva, J. (2003) Motivating Cooperation in Peer to Peer Networks, in P. Brusilovsky, A. Corbett, F. De Rosis (Eds.) Proceedings of the 9th International Conference on User Modelling, UM03, Johnstown, PA, Springer LNCS, 218-227.

7

8

Problems
• User Participation
  – “critical mass” needed
  – most users are free-riders
  – why do people contribute?
    • satisfies a need (is useful)
    • doesn’t cost (effort, money, inconvenience)
    • there is some incentive (money, glory, power)
    • serves a greater cause (e.g. cancer research, SETI@home, etc.)
• Trust
  – sure that contributing won’t cause harm
  – able to identify trustworthy peers

9

First condition: system must be useful

• Allow searching own files
  – Any file stored on disk can be found with Comtella
  – Shared files can be stored anywhere on disk
• Integration with other tools
  – With Browser (e.g. IE, Netscape, Mozilla, etc.)
    • allows viewing files directly from Comtella
    • prompts the user to share papers when a PDF file is opened
  – With Word Processor (e.g. MS Word)
    • generating lists of references automatically
• Additional functionality
  – Adding annotations and ratings to papers

10

Levels of participation

• Bring new files

• Provide disk space / processor time

• Dispatch requests

• Stay on-line

• Use and quit

11

How to motivate participation?

Why do people offer their time and resources? Different people have different motivations:

• Altruistic: some are altruists
• Socially motivated: some would help their friends and hope to make new friends through helping
• Materialistic: some seek glory, some seek high marks, or money…
• Utilitarian: some would expect better service

12

Incentives

Micro-payments for each transaction?

Shirky says micro-payments won’t ever work (e.g. Mojo Nation); flat rates work better (e.g. Internet, cable)

How to map virtual currency into real money?

13

How to motivate participation?

Why do people offer their time and resources? Different people have different motivations:

• Altruistic: some are altruists (for the cause)
• Socially motivated: some would help their friends and hope to make new friends through helping
• Materialistic: some seek glory, some seek high marks, or money…
• Utilitarian: some would expect better service

14

Know your user!

• User Type: Altruist? Socialist? Utilitarian?

• User Interests: What does she search / need?

• User Relationships and Community: Who shares interest with the user? Potential “friends” and “foes”.

Modelling

15

Modelling user interests

• Define a taxonomy of subject categories (e.g. ACM subject index)
• Keep track of the categories of queries (user interests)
• Keep track of resources offered by the user in each interest category
• Update user level of interest in each sub-category using reinforcement learning
• Cluster users in interest-based groups

16

Computing user interests

• Reinforcement learning

The user’s strength of interest $S_a$ in an area $a$ is calculated based on how frequently and how recently the user has searched in this area.

$S_a(e_t, t) = i \cdot S_a(e_{t-1}, t-1) + (1 - i) \cdot e_t$

where $e_t \in [0, 1]$ is calculated as $e_t = 1/d$, and $d = 1 + \text{level\_distance}$, the distance between the level of the sub-area of the query and the level of the area $a$ in the ontology hierarchy. Currently, the ontology hierarchy has only 2 levels, so $e_t = 0.5$.
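A minimal Python sketch of the interest update above. The slides do not give a value or a name for the weight i, so the parameter name `inertia` and the value 0.5 used in the example are assumptions, as are the function names.

```python
def effect(level_distance: int) -> float:
    """Evidence e_t = 1 / d, with d = 1 + level_distance."""
    if level_distance < 0:
        raise ValueError("level_distance must be non-negative")
    return 1.0 / (1 + level_distance)


def update_interest(previous: float, level_distance: int, inertia: float) -> float:
    """One step of S_a(t) = i * S_a(t-1) + (1 - i) * e_t.

    `inertia` plays the role of i; its value is not stated in the slides.
    """
    if not 0.0 <= inertia <= 1.0:
        raise ValueError("inertia must lie in [0, 1]")
    return inertia * previous + (1.0 - inertia) * effect(level_distance)


# Three searches in a sub-area one level away from area a (e_t = 0.5),
# starting from no recorded interest.
strength = 0.0
for _ in range(3):
    strength = update_interest(strength, level_distance=1, inertia=0.5)
```

With repeated searches the strength approaches e_t, and a larger `inertia` makes older searches count for longer.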

17

Modelling user relationships
• Monitor whose files the user chooses, the quality of the files (does the user keep the files), and who downloads files offered by the user
• Represent each user relationship, for each area of interest:
  – Strength: how successful service was given (reinforcement learning used, similar to user interests)
  – Balance: reciprocity of services used / given
• Adapt P2P topology: form a neighborhood for search using the best relationships (“friends”) in the area of search

Gnutella
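The neighborhood-forming step can be sketched as picking the peers with the strongest relationships in the area being searched. This is an illustration only: the data layout (`strengths[peer][area]`), the function name `friends_for_area`, and the cut-off `k` are assumptions, not details taken from Comtella.

```python
from typing import Dict, List


def friends_for_area(strengths: Dict[str, Dict[str, float]], area: str, k: int = 3) -> List[str]:
    """Return up to k peers with the strongest relationship in `area`.

    strengths[peer][area] holds the relationship strength for that area
    (hypothetical layout). Ties are broken alphabetically by peer name.
    """
    scored = [(peer, by_area[area]) for peer, by_area in strengths.items() if area in by_area]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return [peer for peer, _ in scored[:k]]


strengths = {
    "ann": {"ai": 0.9, "db": 0.1},
    "bob": {"ai": 0.4},
    "cat": {"db": 0.8},
    "dan": {"ai": 0.7},
}
ai_neighborhood = friends_for_area(strengths, "ai", k=2)
```

A query in the area "ai" would then be sent first to the peers in `ai_neighborhood`.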

18

Computing the balance of a relationship

• $B_{XY} = (N_{XY} - N_{YX}) / (N_{XY} + N_{YX})$, with $B_{XY} \in [-1, 1]$

$N_{XY}$: number of times X took from Y

$N_{YX}$: number of times Y took from X
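The balance formula translates directly into code. One detail is not covered by the slide: the value when X and Y have never exchanged anything, where the denominator is zero. Returning 0.0 in that case is an assumption of this sketch.

```python
def balance(n_xy: int, n_yx: int) -> float:
    """B_XY = (N_XY - N_YX) / (N_XY + N_YX), a value in [-1, 1].

    n_xy: number of times X took from Y
    n_yx: number of times Y took from X
    """
    if n_xy < 0 or n_yx < 0:
        raise ValueError("counts must be non-negative")
    total = n_xy + n_yx
    if total == 0:
        return 0.0  # assumption: no interaction yet is treated as neutral
    return (n_xy - n_yx) / total
```

For example, `balance(3, 1)` gives 0.5 (X has taken more than it has given), and swapping the arguments flips the sign.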

19

Modelling user type

• Monitor user’s actions regarding file sharing, relative time spent on-line, acts of interrupting service, total balance of user’s giving / taking

• Update a number in [-1, 1] representing user’s cooperativeness

• Motivational actions in the interface triggered by passing certain thresholds

20

Computing user type

• The measure of user cooperativeness at time $t$:

$C(w_t, t) = i \cdot C(w_{t-1}, t-1) + (1 - i) \cdot w_t$

where $w \in [-1, 0) \cup (0, 1]$ represents the weight of evidence: $w < 0$ is a selfish act while $w > 0$ is an altruistic act.

$\text{overallBalance} = \frac{1}{n} \sum_{Y} B_{XY}$

$\text{userType} = (C(w_t, t) + \text{overallBalance}) / 2$

If userType is in $[-1, -0.5)$ then the user is selfish, if it is in $[-0.5, 0.5]$ then the user is reciprocal, and if it is in $(0.5, 1]$ then the user is altruistic.
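A hedged Python sketch of the user-type computation. The weight i is again called `inertia` (an assumed name, with no value given in the slides). The slides also do not spell out how the sign of overallBalance lines up with the altruistic end of the scale, so the sketch takes the overall balance as an input in [-1, 1] and reads the middle interval as [-0.5, 0.5].

```python
from typing import Sequence, Tuple


def update_cooperativeness(previous: float, evidence: float, inertia: float) -> float:
    """One step of C(t) = i * C(t-1) + (1 - i) * w_t, with w_t in [-1, 0) or (0, 1]."""
    if evidence == 0 or not -1.0 <= evidence <= 1.0:
        raise ValueError("evidence must be non-zero and lie in [-1, 1]")
    return inertia * previous + (1.0 - inertia) * evidence


def overall_balance(balances: Sequence[float]) -> float:
    """Mean of the user's per-relationship balances; 0.0 if there are none (assumption)."""
    return sum(balances) / len(balances) if balances else 0.0


def user_type(cooperativeness: float, overall: float) -> Tuple[float, str]:
    """userType = (C + overallBalance) / 2, mapped to the three labels."""
    score = (cooperativeness + overall) / 2
    if score < -0.5:
        return score, "selfish"
    if score <= 0.5:
        return score, "reciprocal"
    return score, "altruistic"
```

For instance, `user_type(0.9, 0.5)` is labelled "altruistic" and `user_type(-0.8, -0.6)` "selfish"; the thresholds at which interface actions fire would be separate settings.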

21

Rewarding relationships
• People who share a lot of useful files and behave cooperatively will have more friends
• Friends are treated differently
  – Transfers not interrupted
  – Queries processed with priority
  – Queries are propagated farther
• Queries sent to friends in the area
  – Higher chance of having relevant files
  – Faster responses
  – Better quality of files
• People with more friends get better Quality of Service!

22

Evaluation results - simulation

Comparing the round trip time obtained for queries without a friends’ list with the round trip time for queries with a friends’ list

[Chart: round trip time (ms), axis from 0 to 400, for simulation runs 1 to 3; series: with list, without list]

23

Evaluation results – user experiment

[Chart: average query elapse time against total size of files shared in bytes; 8 users over a week]

24

Summary of results

• The simulation results show that peers obtain results faster when searching for files in categories for which they have friends
• The user evaluation is still underway: does the QoS reward motivate participation?

25

Social awareness

In cities, the sidewalks provide the right kinds and numbers of interactions from which neighborhoods emerge.

In isolation, selfishness is logical

To gain perspective, users need feedback about their social environment

26

A matter of scale

An astronomical metaphor

• Provides visual feedback

• Resolves scale

• Attractive & interesting

27

Views of the community
• Connectivity (currently reachable peers)
• Ranking of peers by contribution
  • number of shared files
  • balance of relationships
• Papers shared by each peer
• Interests of each peer

28

Architecture

Introducing a non-vital server or many servers

Server:
• Collect info. from peers
• Generate community views

29

Ranking of peers based on contribution

30

Shared interests

31

Personalized views

• Who are my friends in this area?

• How strong is my relationship with them?

• How much have they contributed?

• Do I owe them or do they owe me?

• Which files do they share?

• What have they been searching for / downloading recently?

32

Trust

• We already model the strength of relationships between users
  – Based on counting # downloads / uploads
  – We can incorporate an explicit measure of the quality of resource
• Idea: let users:
  – Rate their resources (quality of paper)
  – Add annotations (summaries) of papers

33

Immediate benefit

• Learning effect: compiling reviews of articles

• Visualization of document ranking in given category of interest: “top 10 list”

The professor / boss will know who has read and annotated a paper and who has not, which could have a motivating effect on participation.

34

Reputation

• Global reputation of peers can be computed
  – Ranking of peers based on
    • how many highly rated papers they share
    • how many times they have introduced a new paper in the system that has become highly rated
    • how the users’ ratings correlate with those of their peers and with high-rank peers
  – Emergence of “Power peers”:
    • What extra rights will they have (reward)?
    • Could have a motivational effect, as on Slashdot

35

Community views

• Connectivity (currently reachable peers)

• What are these peers interested in / sharing

• Ranking of peers by contribution

• Shared interest clusters

• Personalized views (who are my friends?)

• Ranking of resources (papers)

• Reputation of peers

36

Updating trust in peers

• Relationships: subjective trust in the source of the paper (the other peer)
  – Trust depends on the evaluation criteria of the peer
    • Compare own rating of paper with the rating given by the source
    • If ratings are sufficiently close, increase trust in source, else decrease trust
  – Trust depends on category of interest
• Combined trust measures for peers?
  • Peers share their trust measures (gossip)

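The rating-comparison rule for updating trust can be sketched as follows. The slide only says that trust rises when the ratings are "sufficiently close" and falls otherwise, so the tolerance, the step size, the [0, 1] trust range, and the per-(peer, category) table are all assumptions made for illustration.

```python
from typing import Dict, Tuple


def update_trust(trust: float, own_rating: float, source_rating: float,
                 tolerance: float = 1.0, step: float = 0.1) -> float:
    """Raise trust if the two ratings agree within `tolerance`, else lower it.

    The result is clamped to [0, 1]; tolerance, step and range are assumed values.
    """
    if abs(own_rating - source_rating) <= tolerance:
        trust += step
    else:
        trust -= step
    return min(1.0, max(0.0, trust))


# Trust is kept per (peer, category), since it depends on the category of interest.
trust_table: Dict[Tuple[str, str], float] = {("peer_s", "user modelling"): 0.5}
key = ("peer_s", "user modelling")
trust_table[key] = update_trust(trust_table[key], own_rating=4, source_rating=5)
```

Here the ratings differ by one point, which is within the assumed tolerance, so the trust in `peer_s` for that category goes up by one step.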

37

Trust and reputation

Yao Wang, Ph.D. student

Wang, Y., Vassileva, J. (to appear) Bayesian Network-Based Trust Model, Proc. of IEEE/WIC International Conference on Web Intelligence (WI 2003), October 13-17, 2003, Halifax, Canada.

(best paper award nominee)

38

Applying a Bayesian network trust model to COMTELLA

[Diagram: Bayesian network with nodes T, File quality, Paper category (subject area), Paper rating, Reliability (download)]

39

Future work

• Incorporating a trust & reputation mechanism into Comtella:
  – to protect from malicious file-sharers
  – to ensure that users share papers with appropriate peers and benefit most from their articles and comments

40

“Take-home” messages
• Motivating user participation is crucial
  – Encouraging contribution, building relationships
  – Rewards by better quality of service, reputation / visibility
  – Techniques:
    • Modeling user interests, relationships, user type
    • Creating community awareness through visualization
• Building in mechanisms for trust and reputation
  – Will allow users to find reputable sources
  – May protect community from malicious or irresponsible peers