Search in Social Networks with Access Control

Truls A. Bjørklund, Michaela Götz, Johannes Gehrke
Norwegian University of Science and Technology, Cornell University

KEYS 2010, Michaela Götz


Content Search in Social Networks

Where’s the Chopin video?


Search System Desiderata

[Figure: sample posts, e.g. "yiipppiiiii" and "Hatiku hampa tanpa cinta" (Indonesian: "My heart is empty without love")]

Want a system that:

  • given a query (a set of keywords), returns the top-k most recent posts containing these keywords
  • adheres to the privacy settings of users (can only retrieve friends' posts)
  • makes new posts immediately searchable
  • answers queries quickly
  • does not consume too much space

Informal Problem Definition

  • Given a social network
      • Nodes = users
      • Edges indicate friendship (self-links omitted)
  • Given posts written by users (the authors)
  • Answer conjunctive queries
      • Result of a query: the top-k most recent posts that
          1. contain all keywords of the query
          2. are authored by friends
  • Queries with access control
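To make the query semantics concrete, here is a minimal brute-force sketch in Python. It is only a reference for what a correct answer looks like, not the indexed system described in the talk. The `Post` class, the `search` function, the example network, and the assumption that post IDs grow over time are all illustrative and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int   # assumption: IDs grow over time, so a larger ID means a more recent post
    author: int
    text: str


def search(posts, friends, user, keywords, k):
    """Brute-force reference: top-k most recent posts by friends of `user`
    that contain every keyword (conjunctive query)."""
    wanted = {w.lower() for w in keywords}
    hits = [
        p for p in posts
        if p.author in friends[user] and wanted <= set(p.text.lower().split())
    ]
    hits.sort(key=lambda p: p.post_id, reverse=True)  # most recent first
    return hits[:k]


# Hypothetical five-user network (friendship is symmetric, self-links omitted).
friends = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
posts = [
    Post(1, 2, "chopin video from the concert"),
    Post(2, 1, "my own chopin video"),
    Post(3, 3, "chopin nocturne"),
    Post(4, 5, "chopin video again"),
    Post(5, 3, "new chopin video"),
]
result = [p.post_id for p in search(posts, friends, 1, ["Chopin", "video"], k=2)]
```

User 1 only sees matching posts by users 2 and 3. Post 4 matches the keywords but is filtered out because its author (user 5) is not a friend of user 1.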

[Figure: example social network with five users, numbered 1 to 5]

Design Space

  • Two axes of enforcing access control:
      • Index axis:
          • A group index contains the posts of a subset of users
          • An index design is a set of group indexes
      • Access axis:
          • A group author list is a sorted list of pairs <post-ID, author-ID> for a subset of users
          • An access design is a set of group author lists
  • Intuition: Query processing


Examples - Index / Access Designs

We distinguish designs based on:

              Index Design                            Access Design
Cardinality   # of indexes                            # of author lists
Redundancy    avg # of indexes a user is member of    avg # of lists a user is member of
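The two measures can be computed directly from a design. The sketch below is illustrative only: it assumes a design is represented as a mapping from a group name to the set of users whose posts the group holds, and it uses a hypothetical five-user network. The names `cardinality`, `redundancy`, and the three example designs are not from the slides.

```python
def cardinality(design):
    """Number of groups (indexes or author lists) in a design."""
    return len(design)


def redundancy(design, users):
    """Average number of groups that a user is a member of."""
    memberships = sum(1 for members in design.values() for u in members if u in users)
    return memberships / len(users)


# Hypothetical five-user network (symmetric friendships).
friends = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
users = set(friends)

global_design = {"G": set(users)}                      # one group holding everyone
user_design = {u: {u} for u in users}                  # one group per user, own posts only
friends_design = {u: set(friends[u]) for u in users}   # one group per user, friends' posts

summary = {
    name: (cardinality(d), redundancy(d, users))
    for name, d in [
        ("global", global_design),
        ("user", user_design),
        ("friends", friends_design),
    ]
}
```

Under this representation, "no redundancy" corresponds to a value of 1.0 (each user appears in exactly one group), and the redundancy of the friends design equals the average number of friends per user.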


Examples – Index Designs

  • Global index: I1
      • No redundancy, lowest cardinality
  • Friends indexes: I1, I2, I3, I4, I5
      • High redundancy, high cardinality

Examples – Access Designs

  • Global list: L1
      • No redundancy, lowest cardinality
  • Friends lists: L1, L2, L3, L4, L5
      • High redundancy, high cardinality

Terminology

  • Covers:
      • A set of group indexes covers a set of users if each user's posts are contained in at least one group index.
          • Exact cover: the group indexes contain no posts of other users.
      • A set of group author lists covers a set of users if each user's posts are contained in at least one group author list.
          • Exact cover: the group author lists contain no posts of other users.
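The two definitions reduce to set comparisons. A minimal sketch, assuming each group (index or author list) is represented simply by the set of users whose posts it contains; the function names are illustrative:

```python
def covered_users(groups):
    """Union of the member sets of the given groups."""
    return set().union(*groups)


def is_cover(groups, users):
    """Every user's posts are contained in at least one group."""
    return set(users) <= covered_users(groups)


def is_exact_cover(groups, users):
    """A cover that contains no posts of other users."""
    return set(users) == covered_users(groups)


friends_of_u = {2, 3}
global_group = [{1, 2, 3, 4, 5}]   # covers the friends, but not exactly
user_groups = [{2}, {3}]           # one group per friend: an exact cover
```

For example, `is_cover(global_group, friends_of_u)` holds while `is_exact_cover(global_group, friends_of_u)` does not, which is why a global group still needs filtering at query time.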


Query Processing with Access Control

  • Given query t1, …, tm by user u:
      1. Select indexes covering u's friends.
      2. Select author lists covering u's friends.
      3. Within each selected index:
          a) Intersect the posting lists for t1, …, tm with the union of the selected author lists.
          b) For each result, check whether u is friends with the author.
      4. Union the results of the indexes.
  • Processing optimizations:
      • Group indexes are an exact cover → no further filtering (skip step 2 and the filtering in step 3)
      • Group author lists are an exact cover → no friendship check (skip step 3.b)

In short:

\[ \bigcup_{\text{indexes}} \sigma_{\text{friends}}\Big( \big( \bigcap \text{posting lists for } t_1, \dots, t_m \big) \cap \big( \bigcup \text{author lists} \big) \Big) \]
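The four steps can be sketched in Python as follows. This is a simplified illustration using sets rather than sorted list iterators, and it is not the authors' Java implementation. Several things are assumptions: the dictionary layout of indexes and author lists, the selection rule (take every group that contains at least one friend, which is a valid but not necessarily minimal cover), and the convention that a larger post ID means a more recent post.

```python
def process_query(user, terms, k, friends, indexes, author_lists):
    fr = friends[user]
    # Steps 1 and 2: select the groups that contain at least one friend.
    selected_indexes = [ix for ix in indexes if ix["members"] & fr]
    selected_lists = [al for al in author_lists if al["members"] & fr]

    # Union of the selected author lists, as a map post ID -> author ID.
    author_of = {}
    for al in selected_lists:
        author_of.update(al["pairs"])

    results = set()
    for ix in selected_indexes:
        # Step 3a: intersect the posting lists of all query terms ...
        postings = [set(ix["postings"].get(t, ())) for t in terms]
        candidates = set.intersection(*postings) if postings else set()
        # ... with the author lists, and step 3b: keep only posts by friends.
        results |= {p for p in candidates if author_of.get(p) in fr}

    # Step 4: union over indexes, then return the k most recent posts.
    return sorted(results, reverse=True)[:k]


# Hypothetical data for the "global index / global list" design.
friends = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
everyone = {1, 2, 3, 4, 5}
indexes = [{"members": everyone,
            "postings": {"chopin": [1, 3, 4], "video": [1, 4, 5]}}]
author_lists = [{"members": everyone,
                 "pairs": [(1, 2), (2, 1), (3, 3), (4, 5), (5, 3)]}]

answer = process_query(1, ["chopin", "video"], 10, friends, indexes, author_lists)
```

Posts 1 and 4 contain both terms, but only post 1 was written by a friend of user 1, so the friendship check in step 3.b removes post 4.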


Examples – Index + Access Designs

Friends indexes / No group author lists
  • Indexes: I1, I2, I3, I4, I5
      • High redundancy, high cardinality
      • Optimization: no author lists

Examples – Index + Access Designs

User indexes / No group author lists
  • Indexes: I1, I2, I3, I4, I5
      • No redundancy, high cardinality
      • Optimization: no friends lookup
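With user indexes, the indexes of u's friends form an exact cover, so a query needs neither author lists nor a friendship check. A minimal sketch of that case, assuming each user index maps a term to that user's post IDs and that larger IDs are more recent; `search_user_indexes` and the sample data are hypothetical:

```python
import heapq
from itertools import islice


def search_user_indexes(user, terms, k, friends, user_index):
    """Query one index per friend and merge the per-friend results."""
    streams = []
    for author in friends[user]:
        index = user_index.get(author, {})
        postings = [set(index.get(t, ())) for t in terms]
        hits = set.intersection(*postings) if postings else set()
        streams.append(sorted(hits, reverse=True))   # most recent first
    # Merge the descending streams and stop after k results.
    return list(islice(heapq.merge(*streams, reverse=True), k))


friends = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
user_index = {
    1: {"chopin": [2], "video": [2]},
    2: {"chopin": [1], "video": [1]},
    3: {"chopin": [5, 3], "video": [5]},
    5: {"chopin": [4], "video": [4]},
}
top = search_user_indexes(1, ["chopin", "video"], 2, friends, user_index)
```

The cost of this approach is that one index has to be consulted per friend, so search time depends on the number of friends.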


Examples – Index + Access Designs

Global index / Global list
  • Index: I1
      • No redundancy, lowest cardinality
  • Author lists: L1
      • No redundancy, lowest cardinality

Examples – Index + Access Designs

Global index / User list
  • Index: I1
      • No redundancy, lowest cardinality
  • Author lists: L1, L2, L3, L4, L5
      • No redundancy, high cardinality
      • Optimization: no friends look-up

Examples – Index + Access Designs

Global index / Friends list
  • Index: I1
      • No redundancy, lowest cardinality
  • Author lists: L1, L2, L3, L4, L5
      • High redundancy, high cardinality
      • Optimization: no friends look-up

Implementation - Overview

  • Main-memory system in Java
  • Updates:
      • Small updatable index to add new posts
      • Hierarchy of indexes based on geometric partitioning, with compression
  • Operators over lists:
      • Operators: Union, Intersection, Filter
      • Methods: Next(), SkipTo(value v)
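The operator interface can be illustrated with a small sketch. The slides describe a Java system; the Python below only mirrors the Next()/SkipTo(v) idea and is not the authors' code. Assumptions: lists hold unique post IDs in ascending order, `skip_to(v)` returns the first remaining value that is at least `v`, and `None` signals exhaustion. The class names and the `drain` helper are hypothetical.

```python
from bisect import bisect_left


class SortedList:
    """Leaf operator over a sorted list of unique IDs."""

    def __init__(self, values):
        self.values = sorted(values)
        self.pos = 0

    def next(self):
        """Return the next value, or None when the list is exhausted."""
        if self.pos >= len(self.values):
            return None
        value = self.values[self.pos]
        self.pos += 1
        return value

    def skip_to(self, v):
        """Return the first remaining value >= v, or None."""
        self.pos = bisect_left(self.values, v, self.pos)
        return self.next()


class Intersection:
    """Values present in both inputs; inputs may themselves be operators."""

    def __init__(self, a, b):
        self.a, self.b = a, b

    def _align(self, x, y):
        # Leapfrog: let the side that is behind skip forward to the other's value.
        while x is not None and y is not None:
            if x == y:
                return x
            if x < y:
                x = self.a.skip_to(y)
            else:
                y = self.b.skip_to(x)
        return None

    def next(self):
        return self._align(self.a.next(), self.b.next())

    def skip_to(self, v):
        return self._align(self.a.skip_to(v), self.b.skip_to(v))


def drain(op):
    """Collect all remaining values of an operator."""
    out = []
    while (v := op.next()) is not None:
        out.append(v)
    return out


both = drain(Intersection(SortedList([1, 3, 5, 7]), SortedList([3, 4, 7])))
```

Because `Intersection` exposes the same two methods as `SortedList`, operators compose: the intersection of posting lists can itself be intersected with an author list, and skipping lets each side jump over values that cannot match.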


Experiments

  • Comparison of the performance
      • across index designs
      • across access designs
      • across different social networks
  • Performance measures:
      • Time to answer a query
      • Time to add a post
      • Space consumption


Experiments

  • Data:
      • Network:
          • real Twitter network: 417,000 users
          • synthetic networks (Barabási's attachment model), varying size and degree
      • Posts obtained from Twitter
  • Queries:
      • generated through a random process
      • run 100,000 queries returning top-100 posts
  • Environment: 3.2 GHz, 16 GB RAM, Red Hat Enterprise 5.3
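For readers unfamiliar with attachment models, the sketch below generates a synthetic friendship network by preferential attachment in the style of the Barabási-Albert model: each new user links to `m` existing users chosen with probability proportional to their current number of friends. The slides do not state the generator's parameters or implementation, so the function name, the starting clique, and the seed handling are assumptions made for illustration.

```python
import random


def preferential_attachment(n, m, seed=0):
    """Undirected graph on users 0..n-1 (requires n > m >= 1).
    Returns a dict mapping each user to the set of their friends."""
    rng = random.Random(seed)
    friends = {u: set() for u in range(n)}

    # Assumption: start from a clique on the first m + 1 users.
    for u in range(m + 1):
        for v in range(u):
            friends[u].add(v)
            friends[v].add(u)

    # Each user appears once per incident edge, so sampling uniformly
    # from this list picks users proportionally to their degree.
    endpoints = [u for u in range(m + 1) for _ in friends[u]]

    for u in range(m + 1, n):
        targets = set()
        while len(targets) < m:          # m distinct existing users
            targets.add(rng.choice(endpoints))
        for v in targets:
            friends[u].add(v)
            friends[v].add(u)
            endpoints += [u, v]
    return friends


network = preferential_attachment(n=1000, m=10)
average_friends = sum(len(f) for f in network.values()) / len(network)
```

With `m = 10` the average number of friends per user is close to 20, since every added edge contributes to the degree of two users. Varying `n` and `m` changes the size and degree of the network.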


Scalability

Varying the number of documents (fixed: 1,000 users, 20 friends per user)

[Figure: three log-scale panels plotted against the number of posts (x 10^5): update cost (processing time, s), search cost (processing time, s), and space consumption (size, MB). Designs compared (index / access): Global / Global, Friends / None, User / None, and Global / None (no access control).]

Index Design Performance under Varying Networks

[Figure: two rows of three log-scale panels, each row showing update cost (processing time, s), search cost (processing time, s), and space consumption (size, MB).
  Row 1: varying the number of users (fixed: 100 friends per user, 1 million posts).
  Row 2: varying the number of friends per user (fixed: 10,000 users, 1 million posts).
  Designs compared (index / access): Global / Global, User / None, and Global / None (no access control).]

Access Design Performance under Varying Networks

[Figure: two rows of three log-scale panels, each row showing update cost (processing time, s), search cost (processing time, s), and space consumption (size, MB).
  Row 1: varying the number of users (fixed: 100 friends per user, 2.5 million posts).
  Row 2: varying the number of friends per user (fixed: 100,000 users, 2.5 million posts).
  Designs compared (index / access): Global / Global, Global / Friends, Global / User, and Global / None (no access control).]

Experiment on Real Twitter Network

Access designs with global index on real network

[Figure: three log-scale panels showing update cost (s), search cost (s), and space (MB). Designs compared (index / access): Global / Global, Global / User, Global / Friends, and Global / None (no access control).]

Conclusions

  • Two-axis design space for access control in search:
      • Index axis
      • Access axis
  • Experiments with five designs:
      • Access designs reveal tradeoffs between index size, update performance, and search performance
      • Global index / Friends lists
          • fast searches (independent of the network)
          • slow updates (dependent on the network)
      • Global index / User lists or Global index / Global list
          • slow searches (dependent on the network)
          • fast updates (independent of the network)
      • Similar tradeoffs for index designs
  • Recommendation: choose between user indexes and the global index with user or friends lists, based on workload and network
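One way to see why updates depend on the network under friends lists is to count which author lists a single new post has to be appended to. The sketch below follows from the definitions of the access designs; the function name, the string labels for the designs, and the example network are hypothetical.

```python
def author_lists_to_update(access_design, author, friends):
    """Owners of the group author lists that receive the new
    <post-ID, author-ID> pair when `author` writes a post."""
    if access_design == "global":
        return {"global"}       # a single shared list
    if access_design == "user":
        return {author}         # only the author's own list
    if access_design == "friends":
        # The list of every user who has the author as a friend.
        return {u for u, fs in friends.items() if author in fs}
    raise ValueError(f"unknown access design: {access_design}")


friends = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
touched = {
    design: author_lists_to_update(design, 3, friends)
    for design in ("global", "user", "friends")
}
```

A post by user 3 touches one list under the global and user designs but three lists under the friends design, one per friend. The search side shows the reverse pattern: friends lists give the query a single exact list to intersect with, while the other designs need more work per query.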


Future Work

  • Explore design space
      • Identify best design for a particular workload and network
  • Dynamic design
      • Adapt to changes in the workload
      • Adapt to changes in the network
  • Distribute system
  • Extend to more advanced ranking functions
      • Include network structure and interactions as features
      • Do not leak private information through ranking


Thank You! Questions?