Privacy management schemes for social networking sites

Thesis submitted in the partial fulfillment

of the requirements for the degree of

Master of Science (By Research)

in

Computer Science and Engineering

By

NAGARAJA KAUSHIK GAMPA

200707017

[email protected]

Center for Security, Theory & Algorithmic Research (C-STAR), International Institute of Information Technology,

Hyderabad, India 500032.

Copyright © Nagaraja Kaushik Gampa, 2010

All Rights reserved

Dedicated to my parents

International Institute of Information Technology

Hyderabad, India

CERTIFICATE

It is certified that the work contained in this thesis, titled “Privacy management schemes for social networking sites” by Nagaraja Kaushik Gampa (200707017), submitted in partial fulfillment of the requirements for the award of the degree of Master of Science (By Research) in Computer Science and Engineering, has been carried out under my supervision and has not been submitted elsewhere for a degree.

____________________ _______________________________

Date Advisor: Dr. Kannan Srinathan

Acknowledgments

I owe my deepest gratitude to Dr. Kannan Srinathan, Assistant Professor, CSTAR,

International Institute of Information Technology, who has been my advisor during the

course. He provided me with many helpful suggestions, important advice and constant

encouragement during the course of this work.

I would also like to thank all my friends and CSTAR colleagues, who have been encouraging and supportive throughout my journey at IIIT-Hyderabad.

My special appreciation goes to my parents, Arvind Kumar Gampa and Meera Bai

Gampa, for their unconditional love which has given me the strength to try and achieve

more and to be a better person.

ABSTRACT

Security experts often say that users are the weakest link in a security system. Users misunderstand how to use security mechanisms, do not realize the need for protection, and are happy to circumvent security measures that impede their primary tasks. Attackers, on the other hand, are experts in usability: they exploit users' lack of understanding and their tendency not to comply with security protocols and policies by developing simple yet effective social engineering attacks. In this thesis we address two important problems that users face on social networking sites. The first is distinguishing friends from strangers, and the second is admitting into a community only the people who are interested in it.

Current social networking sites protect user data by making it available only to a restricted set of people, often friends. However, the concept of ‘friend’ in social networks is illusory. Adding a person to the friends list without verifying her identity can lead to serious consequences such as identity theft and privacy loss. We propose a novel verification paradigm to ensure that the person who sends you a friend request is actually your friend and not someone who is faking her identity. Our solution is based on what a person might know and can verify about the other person. We work on the premise that a friend can predict her friend's preferences better than a stranger can. The preferences include the interests of a particular person and the Big Five personality traits¹. To verify our premise, we conducted a two-stage user study. The results of the user study are quite encouraging.

The lifeline of a social networking site is its community relationship model. A community is a group of people with similar tastes, interests and lifestyle. Communities can take the form of an online forum or discussion group where the active participation of users is required. As of now, social networking sites have no provision for measuring how interested the members of a particular community are. Including uninterested users in a community negatively affects the quality of the community and its activities. Attackers disguised as normal users often try to join every community, even those they have no interest in. Such users are a constant threat to the online community as spammers, data thieves and malware injectors. We propose a novel solution that filters users based on their interest in the community. Our scheme is a challenge-response scheme that works on the premise that a user who is interested in a community will have knowledge about it and can answer questions about it. We verified our premise with a user study. The results were encouraging and clearly demarcated interested users from non-interested users.

¹ Oliver P. John and S. Srivastava. The Big-Five Trait Taxonomy: History, Measurement, and Theoretical Perspectives. In Handbook of Personality: Theory and Research (1999), pp. 102-138. University of California at Berkeley.

Table of Contents

Chapter 1: Introduction
1.1 Definition
1.2 History
1.3 Thesis Statement
1.4 Let only the right one in
1.5 Sammelan
1.6 Overview of the Thesis

Chapter 2: Let Only the Right One In: Privacy management scheme for social networks
2.1 Background
2.2 Related work
2.3 Motivation
2.4 Challenge response schemes
2.5 Naive Approach: Verify about the sender
2.6 Our Approach: Verify About the Receiver
2.6.1 User verification using Preferences
2.6.1.2 Advantages
2.7 User study
2.8 Second phase user study
2.9 Results

Chapter 3: Sammelan: Secure Communities of Shared Interests
3.1 Related Work
3.2 Motivation
3.3 Procedure
3.3.1 Traditional Approach
3.4 Challenge Response Scheme
3.5 Naive Approach 1
3.6 Naive Approach 2
3.7 Sammelan
3.8 Method
3.8.1 Moderator’s side
3.8.2 User’s side
3.8.3 System’s side
3.9 Security Analysis
3.10 User Study
3.11 Results
3.12 System Usability Test

Chapter 4: Conclusion

References

List of Figures

Figure 1: (a) face to face communication (b) Communication by sending postcards (c) Communication by telephone
Figure 2: (a) email sending (b) use of instant messenger
Figure 3: Steps following while adding friends in social network
Figure 4: Naive Approach of Verifying the Sender
Figure 5: Better approach of verifying the receiver
Figure 6: Liking for the items belonging to a) Sports category b) Interest category
Figure 7: Prediction difference by participants for friend and a stranger
Figure 8: Traditional approach of joining an online community
Figure 9: Naive approach 1 for adding a user into the community
Figure 10: Naive approach 2 for adding a user into the community
Figure 11: Our Approach for a User to join into the community
Figure 12: Moderator creating the community and entering keywords
Figure 13: User choosing the keywords corresponding to the community
Figure 14: Graph showing difference in average scores of Interested and not interested users
Figure 15: Graph showing the differences in the user's scores in knowledge test
Figure 16: Bar graph showing opinion of users on 3 point scale in System Usability Test

List of Tables

Table 1: Most liked and disliked items for each category
Table 2: Sample questionnaire for second stage user study
Table 3: Guesses of participants about their friends and strangers
Table 4: Details of sessions
Table 5: Scores of the knowledge test

Chapter 1

Introduction

Humans are an intelligent species that has invented and discovered many things. One of the most important of these is communication. Communication between two or more people is called a conversation. It is a social skill that most people possess and that is not at all difficult. Conversations are an ideal form of communication in some respects, since they allow people with different views on a topic to learn from each other. A successful conversation builds on mutual interests or on things that the speakers know. The people engaging in a conversation must therefore find a topic to which both can relate. The conversation can be on any topic.

A conversation can be carried out through different methods. As noted above, a conversation is an ideal form of communication. Communication is a process that involves the exchange of information, thoughts, ideas and emotions. A sender encodes and sends a message, which is carried over the communication channel to the receiver; the receiver decodes the message, processes the information and sends an appropriate reply over the same channel. Depending on the channels used and the style of communication, there are various types of communication. The forms through which people communicate with each other can be divided into the following.

Verbal communication:

Verbal communication is further divided into two parts: oral communication and written communication. Oral communication refers to the spoken words in the communication process. It can be face-to-face communication, a conversation over the phone, or a voice chat over the Internet. The other type of verbal communication is written communication, which can be via either snail mail or email. The effectiveness of written communication depends on the style of writing, the vocabulary used, grammar, clarity and precision of language.

Figure 1: (a) Face-to-face communication (b) Communication by sending postcards (c) Communication by telephone

Non-verbal communication:

Non-verbal communication includes the overall body language of the speaker: body posture, hand gestures and overall body movements. Facial expressions also play a major role in communication, since the expressions on a person's face say a lot about his or her mood. Gestures such as a handshake, a smile or a hug can independently convey emotions. Non-verbal communication can also take the form of pictorial representations, signboards, or even photographs, sketches and paintings.

The written form of verbal communication has been extended from postcards to electronic mail. Electronic mail, most commonly abbreviated email or e-mail, is a method of exchanging digital messages. The foundation for today's global Internet e-mail service was created in the early ARPANET, and standards for encoding of messages were proposed as early as 1973. At first, e-mails contained only text; gradually it became possible to add attachments such as multimedia files or documents. After e-mail, communication slowly shifted to peer-to-peer messaging between people, known as instant messaging. Instant messaging (IM) is a form of real-time, direct, text-based communication between two or more people using personal computers or other devices, along with shared software clients. The user's text is conveyed over a network, such as the Internet. Instant messaging is often called online chat. Importantly, online chat and instant messaging differ from technologies such as e-mail in the synchronicity that users perceive: online chat happens in real time. This online chat was first introduced by AOL Instant Messenger. Some systems permit messages to be sent to people who are not currently logged on, generally called offline messages, thus removing some of the differences between IM and e-mail. Some IM services also include voice chat and video chat. They also offer chat rooms, where people interact with different people and make new friends if they like each other. There are also chat rooms dedicated to a particular topic, where participants discuss the details of that topic and clarify each other's doubts.

Figure 2: (a) email sending (b) use of instant messenger

After the online chat services of AOL, Yahoo, Google and others, which are all-time favorites for communicating with each other quickly, came websites maintained by individuals, known as blogs. Many blogs provide commentary or news on a particular subject; others function as more personal online diaries. A typical blog combines text, images, and links to other blogs, Web pages, and other media related to its topic. The ability of readers to leave comments in an interactive format is an important part of many blogs. Most blogs are primarily textual, although some focus on art, photographs, videos, music, and audio. Microblogging is another type of blogging, featuring very short posts. Similarly, there are forums, which are mostly used for discussion. Most forums host technical discussions: a topic is posted, members comment on it, and the discussion continues.

Blogs became personal websites where people post about their personal events and day-to-day life. But what about people who do not have the technical knowledge to create a blog? So, after chat rooms and blogs came the new age of networking, known as social networking.

1.1 Definition

But what is a social networking site?

A social networking site is a web-based site where people create a public or semi-public profile within a bounded system, share their views with the people they choose, and make a list of connections with others who are present within the bounded system. Profiles are unique pages where one can "type oneself into being". Social networking sites contain not only profiles but also photos, videos, communities and, most importantly, the friends list. We can also say that a social networking site is a medium of communication between friends.

1.2 History

The first recognizable social networking site was launched in 1997. SixDegrees.com allowed users to create profiles, list their Friends and, beginning in 1998, surf the Friends lists. The features of social networking sites (SNSs) had existed before in one form or another. For example, dating sites had profiles but no friends lists. Instant messengers had friends lists or buddy lists, which were visible only to the user and not to others. Classmates.com allowed people to affiliate with their high school or college and surf the network for others who were also affiliated, but users could not create profiles or list Friends until years later. SixDegrees.com was the first to combine these features [6].

After joining a social networking site, an individual is asked to fill out forms containing a series of questions. The profile is generated from the answers to these questions, which typically include descriptors such as name, age, sex, location, interests, and an "about me" section. The person is also asked to upload a photo, if he wishes to. After joining a social network site, users are prompted to identify others in the system with whom they wish to have a relationship. The label for these relationships differs depending on the site; popular terms include "Friends", "Relatives", "Contacts" and "Fans". Most social networking sites allow users to leave messages for their friends. For example, in Orkut such a message is called a scrap, in Facebook users post a message on the wall, and in Twitter users tweet about themselves and comment on their friends' tweets. But social network sites are not limited to messaging, creating profiles and adding friends. Users can also upload photos and videos to share with their friends.

The visibility of an individual's profile varies from site to site. For example, profiles on Friendster were visible to anyone, whether or not they were a member of the SNS. LinkedIn is a social networking site that is mostly used for maintaining business or professional contacts. A LinkedIn profile can be viewed only by people who have an account on that SNS. MySpace, which was first created as an SNS within a college, slowly developed into a commercial site open to everyone. It also allows access only to users who have an account on the SNS.

The public display of connections is a crucial component of SNSs. The Friends list contains links to each Friend's profile, enabling viewers to traverse the network graph by clicking through the Friends lists. On most sites, the Friends list is visible to anyone who is permitted to view the profile, although there are exceptions. For instance, some MySpace users have hacked their profiles to hide the Friends display, and LinkedIn allows users to opt out of displaying their network. The default visibility on most social networking sites is set to everyone or to friends of friends. Since most users of these social networking sites are teenagers, they care more about popularity than security, and so they neglect the privacy settings that the sites provide. They believe that the more friends they have, the more popular they are, so they add more and more users without considering whether a user is malicious. Malicious users can be dangerous and can misuse private data. The more people you add to your friends list, the more public your profile becomes.

1.3 Thesis Statement

The major goal of this thesis is to provide effective privacy management schemes for users of social networking sites. To this end, we focus on two major problems in present-day social networking sites:

The first problem is ensuring that a user of a social networking site knows whether the person he is adding to his friends list is a friend or a stranger. Adding a stranger to the friends list can be dangerous, since our private data will then be in the hands of a person who might be malicious and misuse it. This personal data includes our name, date of birth, address, phone number, interests, etc.

The second problem is determining whether a person who wants to join an online community on a social networking site really knows about the community and is interested in it; only an interested person should be allowed to enter the community. By doing so, we can exclude people who are not interested in the community and reduce the unwanted mail (which may be spam) from uninterested people.

A brief introduction to these problems is given in Section 1.4 and Section 1.5.

1.4 Let only the right one in

Security experts often say that users are the weakest link in a security system [26]. Users misunderstand how to use security mechanisms and do not realize the need for such protection. User behavior is essentially goal driven, and security is usually a supporting task. Users are happy to circumvent security measures that impede their primary tasks. Attackers, on the other hand, are experts in usability: they exploit users' lack of understanding and their tendency not to comply with security protocols and policies by developing simple yet effective social engineering attacks. This problem is inherent in social networks. Social networking sites are highly popular, especially among teenagers. There exist many dedicated social networks for different domains. For example, LinkedIn and LiveJournal are business-related social networking sites, while sites like Flickr offer easy and public photo sharing. However, in most cases, social networking sites are meant for teenagers to stay in touch with their friends. Friendship-dedicated sites, like Facebook⁴ and MySpace⁵, are among the most popular websites, with more than 150 million users each and a growth rate of 3% per week [10]. This growth is bound to attract many malicious people. Social networking sites are a rich source of sensitive private data about millions of users. If such data gets into the hands of malicious people, there could be serious consequences such as identity theft and privacy loss.

However, the need to socialize and the benefits that social networking websites offer are so high that users often ignore the associated risks. Current social networking sites like Facebook protect user data by making it semi-public [6]. A semi-public profile is visible only to a restricted set of people, often friends. However, the definition of a friend is rather illusory in a social networking environment. Each person can have different kinds of friends. The most notable categories are direct or close friends, acquaintances and Internet friends. The third category, Internet friends, is the most troublesome. It includes friends whom we have met only on the Internet and never in person. The only form of communication available with these friends is therefore the Internet. It also means that there is no easy way to verify their true identity, so they are no different from strangers. Innocent users never realize that many of these friends could be malicious attackers trying to steal their important data, or could be sexual predators, lawyers, principals, etc. intruding on their social privacy. Adding such people (strangers) as friends without verification is therefore not advisable. However, teens (who represent the bulk of the population on social networking sites) do not share the same risk perception. Most of them think of friendship as a loyal and harmless affair. Moreover, it is often more difficult to say ‘no’ than ‘yes’ when it comes to accepting a friend request. It is also a common belief among teenagers that the more friends they have, the more popular they become [13]. Thus the default choice is to accept most friend requests. As a result, many users have thousands of friends, with most of whom they hardly speak. Users do not realize that by confirming someone as a friend, they give that person the power to secretly view all the contents of their profile, without being aware of when and what the person views [29].

We propose a novel verification paradigm to ensure that the person who sends you a friend request is actually your friend and not someone who is faking her identity. Our solution is based on what a person might know and can verify about the other person. We work on the premise that a friend can tell her friend's preferences better than a stranger can. To verify our premise, we conducted a two-stage user study. The results of the user study are quite encouraging.
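The premise above can be illustrated with a minimal sketch. It is not the scheme evaluated in this thesis: the like/dislike encoding of preferences, the function names `preference_match_score` and `looks_like_friend`, the sample items and the 0.7 acceptance threshold are all assumptions introduced purely for illustration. The sketch only shows how the guesses of a request sender could be compared with the stored preferences of the receiver.

```python
from typing import Dict


def preference_match_score(receiver_prefs: Dict[str, bool],
                           sender_guesses: Dict[str, bool]) -> float:
    """Return the fraction of the receiver's items that the sender guessed correctly.

    Each preference is encoded as True (likes) or False (dislikes).
    This encoding is a hypothetical simplification.
    """
    if not receiver_prefs:
        raise ValueError("receiver must have at least one stored preference")
    correct = sum(
        1 for item, liked in receiver_prefs.items()
        if sender_guesses.get(item) == liked  # a missing guess counts as wrong
    )
    return correct / len(receiver_prefs)


def looks_like_friend(receiver_prefs: Dict[str, bool],
                      sender_guesses: Dict[str, bool],
                      threshold: float = 0.7) -> bool:
    """Accept the sender if the match score reaches the (assumed) threshold."""
    return preference_match_score(receiver_prefs, sender_guesses) >= threshold


# Hypothetical data: the receiver's stored preferences and two sets of guesses.
receiver = {"cricket": True, "chess": False, "rock music": True, "cooking": False}
friend_guess = {"cricket": True, "chess": False, "rock music": True, "cooking": True}
stranger_guess = {"cricket": False, "chess": True, "rock music": True, "cooking": True}

print(preference_match_score(receiver, friend_guess))    # 0.75
print(preference_match_score(receiver, stranger_guess))  # 0.25
print(looks_like_friend(receiver, friend_guess))         # True
print(looks_like_friend(receiver, stranger_guess))       # False
```

In this toy example the friend guesses three of four preferences correctly and passes, while the stranger guesses one of four and is rejected. Where to place the threshold in practice would depend on how well friends and strangers actually separate, which is the question the user study addresses.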

1.5 Sammelan

Humans are social animals. Society is an integral part of human life. Humans have long solved the complicated problems they face in life with the help of fellow humans in society. Within a society, people try to create their own comfort zones, called communities, where they can share their interests and problems with other members of the community. Today the Internet has transformed itself into a powerful tool with limitless applications and benefits. It has become a virtual world in itself, with people creating lifelike profiles of themselves. Online communities have become very important in the world of the Internet. They provide the same benefits as real-world communities with much more ease, flexibility and scalability. The main difference between online communities and real-world communities is that in online communities, accountability for the deeds of community members is very low or sometimes nil. If we assume that all the members of a community are honest, then there is no need for accountability. But this assumption does not hold in real life. With the increasing popularity of online communities and people's growing dependence on them, there is an immediate need to determine whether users are interested in the community or not.

Social networking sites such as Facebook, MySpace, and Orkut are becoming popular among teenagers [13]. According to a recent study conducted by USC Annenberg's Center for the Digital Future, nearly 60 percent of internet users over 50 years of age log in to an online community every day, or even several times a day. The corresponding figure is just 47 percent for members of social networking sites who are under 20 years old [28]. Social networking sites are mainly used to interact with friends, both old and new. Facebook alone has about 400 million active users [10], and this number is increasing day by day. The uniqueness of a social networking site lies in what it gives its users. It not only allows strangers to communicate with each other but also enables them to articulate and visit their social networks. This generally results in connections between individuals who share some common interests. Most users of social networking sites are not looking to meet new people; rather, they are trying to communicate with people who are already part of a certain group [6].

Social networking sites are all about connecting with like-minded people. The sense of community makes social networking sites popular. Interaction in a community increases considerably if the community is a closed one, that is, if it has its own privacy restrictions for its members. But along with the benefits of these communities, some dangers lurk, just as in the real world [28]. For example, a person may fill his entire profile with false data. The possible dangers vary widely, from stating a false age and stealing someone's personal information to cybercrimes such as identity theft, masquerading and the use of public information for the wrong purposes.

With the advent of Web 2.0 services, the importance of communities in the lives of internet users

has increased dramatically. The full potential of communities is being explored

with online services like blogs and online writing. Social networking sites have become our

second homes. People have embraced the idea of being in an online family called community.

This new era of online communities brought great changes in the lifestyles of many. Apart from

the advantages users get out of these communities, the potential for abuse is also high.

An online community has to cater to the needs of its members to maintain itself over time [2]. The

willingness of the community to participate actively in discussions and the

commitment of its members are important factors for the success of any online community. This

property of online communities is very similar to real world communities. To stay alive in the

cyber world, online communities should provide the benefits that their users seek [8].

The members who post for the first time to an online community are more likely to contribute to

the community when other members respond to their posts. For these new members in a

community receiving a response to their posts increases the probability of posting a second time

[15].

Community Hacking [8] discusses the threats faced by online communities.

Community hacking is defined as the art of breaking into online communities and bending their

function in unintended ways. Mostly community hacking is motivated by personal gain to the

hacker. This form of abuse has to be taken seriously as it can hurt the quality of content and the

interactions of the community.

Communities are great for organizing on a personal level and for smaller scale interaction around

a cause. Communities are set up for personal interactions. Communities are directly connected to

their users, implying that the activities of a group may reflect on a member's personality. Generally, online

communities are considered an extension of the user's profile. All the posts of a user in a

community are attached to his personal profile. Creating a community is quite simple, with

only the type and the name of the group as required fields. Permission settings make it possible

for community moderators to restrict access to the group. The administrator of the group has

to set the access permissions for the group. Viewing access to a community can be open to all

or can be restricted to its members. The join permissions of the community can be set to public,

closed or secret. Administrators have the task of managing the community, sending invitations

and approving applications. Because of its inherent structure, a community offers more

control over who gets to participate than other means of mass communication. Posting

information to a community is simple. An email to the community will appear in the inbox of all the

members of the community.

The problem with online communities is that they offer the moderator little control over posts.

Even though they have granular moderation features, these are not sufficient. If a post is considered to

be spam it has to be deleted manually. Everybody needs privacy in their life and so do the users

of social networking sites. Present day social networking sites face privacy problems.

Privacy protection should not be confined to the user profile but should also extend to the groups which

the user has joined. As of today, importance is given to the profile alone and the privacy

problems in communities are ignored.

A more serious problem faced by social networking site users is that of groups created

under false pretenses [9]. Such groups are used for the financial gain of the owner of

the group. Authenticating the owner of the group is quite difficult. This can only be done

based on the moderator's knowledge about the group/community. The problem lies not only in

the genuineness of the community but also in the fact that control of the community may rest with an

inappropriate administrator who could misuse the members' information.

Attackers consider online communities as information harvesting grounds. Even though the privacy

settings of a user's profile keep naïve attackers at bay, these settings are not sufficient to

stop smart attackers. There are many issues in online communities, such as privacy and spamming,

which need to be taken care of. Once an attacker is a part of an online community, he has

access to all the members who have an interest in a particular topic. Maintaining high quality

activity by closely inspecting the posts of the community members is a tedious job. The same

result can be achieved by the simpler task of judging the users who join the community. The

classification can be based on the knowledge a user has about the community, his interest and

the amount of effort that he puts in to get into the community.

Our scheme works on the premise that a user who is interested in the community possesses some

knowledge about the community and can answer the basic questions about that community. We

verified our premise with the help of a user study. On average, we found that there is a large

difference in the scores of the interested and non-interested users of the community.

1.6 Overview of the Thesis

The remainder of the thesis is arranged as follows. This thesis is divided into two parts that

correspond respectively to the two major problems that we have discussed in Section 1.3. The

first problem is "Let Only the Right One In" and the second problem is "Sammelan: Secure

Communities of Shared Interests".

Chapter 2 discusses the first problem, "Let Only the Right One In".

The first section of Chapter 2 discusses the background of the first problem and the next

section discusses the related work that has been done to date on that problem. In Section 2.3

we explain what motivated our approach and outline our approach to the

problem. Before presenting our method, we present a naïve approach, which we then modify

for better usability, in Section 2.5 and Section 2.6. The final three sections (Section 2.7, Section 2.8,

Section 2.9) of Chapter 2 describe the user study that we conducted to validate our method

and give its detailed results.

The second half of the thesis, Chapter 3, explains the second problem, "Sammelan".

The first section, Section 3.1, discusses the related work that has been done to date on the

second problem. In Section 3.2 we explain what motivated our approach. Before

presenting our approach, we discuss two other methods and their

drawbacks, and then arrive at a method that can solve the community

membership problem in Section 3.6, Section 3.7 and Section 3.8. In Section 3.9, Section 3.10 and

Section 3.11 we present the security analysis of the method, covering the different ways in which an attacker

can attack it, and give the detailed results of the user study that we conducted to

verify the method.

Chapter 4 concludes the thesis and also mentions some of the future work that can be extended

from our work.

Chapter 2

Let Only the Right One In: Privacy management scheme for social networks

2.1 Background

We have been using computers for socializing for about a decade. It started with e-mails, then

moved to chat rooms and now it is the era of social networking sites. The primary reason behind the

success of social networking sites has been the range of socially compelling services that they

offer. The three most important of those services [12] are: Identity, Relationships and Communities.

Identity: Social networking sites allow a user to create a public or semi-public profile

inside a bounded system. The profile pages include fields for personal details, favorite

forms of media and things that users want to tell others about themselves. The biggest

advantage of profile pages is that they let users say who they are, in the manner they

want.

Relationships: Social networking sites let users make new friends, rediscover old friends

with whom they had lost touch and stay in touch with current ones. An average user

on a social network has around 120 friends [10]. The act of adding a friend is often bidirectional,

that is, it requires mutual approval. After the friend request is accepted, the user is able to

traverse the accepted friend's connections. By browsing through them the user can meet or find new

friends. In this way, the network of friends grows and is often measured in millions [10].

Communities: The third compelling factor of social networking sites is the integration of

communities into social networks. Like-minded people come together and start

communities on a related subject of interest. The aim of a community is to share

thoughts, help each other and learn about the common community theme. Communities

range from public fan clubs of favorite celebrities to active forums discussing technical

subjects like C++, to social discussion on subjects like poverty. Most communities are

kept public for anybody to join and learn about the subject. Users can achieve social

respect and importance within the bounded community through what they share and contribute.

According to Maslow's hierarchy of human needs [21], the urge for sociality is a highly motivating

force. The socially compelling services described above often lead users to neglect the associated

risks. A fully filled-out profile is a rich source of personally identifiable information. Such

sensitive data can be misused once it gets into the wrong hands.

We discuss below the possible threats of adding a stranger as a friend.

Loss of privacy: Most users do not read the privacy policy, and those who read it do

not understand it. Users often assume that if there is a privacy policy then the site is

safe [24]. Users share their embarrassing moments, mistakes they made, their love life

and other sensitive personal details on social networks. These details are visible to their

friends. However, users may not recognize that the person they have added as a friend is not

who they thought he was. He could be faking somebody's identity. He could be a

school principal or a future employer monitoring your profile to judge your character, or he

could be a prankster from your college. In this way there is a huge loss of privacy

once the data is public.

Identity theft: Most financial institutions rely on security questions as fallback

authentication, in case a user has forgotten her password. These security questions are

based on the user's personal life history, family background, etc. However, in social

networks, this kind of sensitive information is freely available [25] for an attacker to

launch a successful identity theft attack. To view this data, either the profile must

be public or the attacker must be in the restricted list of people (friends) that has access to

this information. A successful attack strategy is described in [4], where an attacker

fakes the identity of a person and infiltrates her friends' networks. Since her friends cannot

easily detect the impersonation, adding such a fake profile will leak their sensitive data

for misuse, often for identity theft. Thus, once an attacker has successfully penetrated

a network, she can access all the profiles and other data and use them for malicious

purposes.

Criminal Activities: Social networking sites are often used for criminal activities such as

sexual predation, kidnapping for ransom, etc. To launch such an

attack, a malicious person creates a fake profile for himself and sends flattering

friend requests to many teenagers in the local neighborhood. Most teenagers

reply positively to this type of request. Once the malicious person has been added to

a teenager's friends list, he tries to influence the victim through his behavior and become closer

to the victim. Once he has extracted enough personal data, such as the permanent address and contact

numbers of the victim, he might reveal his true identity. The extracted data

can further be used for severe criminal activities.

The above attacks demonstrate the severe risks associated with posting personal data on social networks.

So far, the best defense against these attacks is not to reveal any sensitive or personal

information on a social network. However, this defeats the purpose of social networks, which is the

healthy exchange of ideas across connections. We describe below related countermeasures to

protect privacy.

2.2 Related work

Social networking sites are used for connectivity among friends. This connectivity

is based on the trust that friends have in one another. However, the growth of social networks has also

attracted malicious adversaries. In recent times multiple techniques have been implemented to

protect data on social networking sites. The first solution is based on identifying the honest

nodes [35, 34]. Honest nodes are the ones with good connectivity with the rest of the social

network. However, as we described in the earlier section, it takes only one infiltration to damage the entire

network of trust. Results show that users blindly accept a request from a forged identity that is

already confirmed by their friends [4]. Another solution is based on data perturbation [31], which

modifies the user data such that it no longer represents the real individuals. Lucas et al. [19]

proposed an encryption scheme that presents the user data in an encrypted form. However, the

solution needs the distribution of encryption keys, which is an overhead. Another similar solution

secures the social networking API through proxies [11]. A related solution to the privacy

problem uses the shared knowledge between the persons [30]. However, this solution requires

active involvement from both parties to work. A person must define separate challenge

questions for every new person and for the data to be protected. Another technique that is being used

is CAPTCHA [1]. A CAPTCHA is a program that detects automated bot attacks by differentiating

between a computer and a human. A CAPTCHA based technique is proposed in [33].

This solution requires users to identify the content (a person's name) of an image. However, users'

appearances change from time to time; therefore, it might be difficult even for a legitimate

user to identify the person in the image. To summarize, we have briefly described the related attacks on

privacy and how related solutions try to prevent them. Before describing our

actual design, we explain in Section 2.3 what motivated the approach

presented in this thesis. After the motivation, the chapter presents a naïve

approach that can solve the problem and then the main method, which solves the problem better than the

naïve approach.

2.3 Motivation

We first describe how friends are formed (or linked together) in a social network. We follow the

Facebook model, which involves three basic steps as shown in Figure 3. For clarity, we name

the persons involved in the interaction Alice and Bob. The malicious user who is faking Bob's

identity is called Mallory.

Figure 3: Steps followed while adding friends in a social network

Bob first sends a friend request to user Alice. The request contains a profile summary of the sender

(Bob), which includes the name of the sender, his profile photo, a short summary and the names of mutual

friends (if any). Before accepting the request, Alice can also see and verify Bob's profile.

Alice can also talk with Bob by sending messages to which Bob can reply.

However, Alice hardly ever checks and verifies Bob's profile. Alice normally bases her judgment

on the profile summary that comes with the request. The default tendency is 'to accept' if Bob and

Alice share some mutual friends [4]. However, all of this data is easy to fake as it can be mined

from the web and public records. Thus, an innocent, careless Alice can easily be tricked into

accepting a request from a fake profile. To mitigate such attacks, the sender's identity must be

verified. A possible way of doing this is with challenge response schemes.

2.4 Challenge response schemes

Challenge response schemes are a popular means of fallback authentication in cases where users

have forgotten their passwords [16]. In a challenge response scheme, the system verifies whether

the user knows (remembers) the answers to questions, mostly about their personal life, e.g.

mother's maiden name, their pet's name, date of birth, etc. It has been believed that the answers to

these questions are not available in public records and that only the legitimate person knows the

correct answers. However, this assumption has recently been shown to be faulty [25, 23]. It has been

observed that challenge response schemes are particularly vulnerable to insider attacks,

i.e. family members, friends, ex-girlfriends and acquaintances seem to know the answers to many

such questions. Therefore, we ask the question:

"If family members and friends know the answers to many personal challenge response questions,

then why can't we use these questions to authenticate them instead of the user?"

We can thus verify the person who sends the friend request (the sender) using challenge response

schemes in the following two ways:

1) Ask the sender questions related to his/her own life.

2) Ask the sender questions related to the life of the person (the receiver) to whom the friend

request has been sent.

In the next two sections we discuss both approaches and argue why the second approach is

better.

2.5 Naive Approach: Verify about the sender

Alice can ask Bob to prove his identity by asking him some challenge questions whose answers

only Alice and Bob know. If Alice is satisfied with the answers, she will accept

Bob's request; otherwise she will decline it. The steps followed are summarized in Figure 4.

However, this solution will work only if the following two conditions are satisfied.

1) Question forming should be automated and should require minimal effort from Alice.

2) Answers to these questions should not be publicly available.

Figure 4: Naive Approach of Verifying the Sender

However, the naïve approach has problems with both conditions. First, finding appropriate

challenge questions automatically for every other user (Bob in this case) is difficult. Alice must

constantly be involved in the process of question forming and then again in the verification. This is

certainly a big overhead for an innocent user like Alice. An ideal solution should demand most of the

work from the sender (who may actually be Mallory impersonating Bob) rather than from Alice. Secondly,

since the identity of Bob is already forged (Mallory has successfully mined Bob's information

and profile details), the chance that Mallory might be able to correctly answer some of the questions

from the mined data cannot be ignored.

2.6 Our Approach: Verify About the Receiver

Alice can instead ask Bob to prove what he knows about her by asking him some challenge

questions about her life. If Alice is satisfied with the answers, she will accept Bob's

request; otherwise she will decline it. The steps followed are summarized in Figure 5.

There are two distinct advantages to this approach. First, it requires minimal effort on Alice's

side. Alice can prepare a set of challenge questions concerning herself and ask a subset of them for

any friend request that comes.

Figure 5: Better approach of verifying the receiver

Thus there is no need to prepare separate questions for each new friend request and to verify the answers

thereafter. Secondly, question forming can be automated to a certain degree. However, the approach is

effective only if the answers to these questions cannot be mined by searching public

records. We need a mechanism, or a set of questions, whose answers are not available online. We

therefore looked into preference based authentication schemes [14].

2.6.1 User verification using Preferences

We therefore design our scheme around user preferences. Each person has a unique set of likes and

dislikes for a range of items. We strongly believe that a friend can tell more about her friend's

preferences than a stranger can [3]. Our proposed scheme works in three simple steps. For

clarity, we again use the entities Alice and Bob, where Bob wants to become a friend of Alice.

2.6.1.1 Steps:

1. Building preferences: Alice builds a list of preferences for a number of different

categories. In this prototype, we chose the following categories: Sports, Movies, TV

Shows, Hobbies, Music, Video Games, Food, etc. A special category about

personality is also added using the Big Five personality dynamics [23]. The personality

traits of a person are generally known to family members and others who share a long

term relationship with that person. We ask Alice about her preferences (likes and dislikes)

for a number of different items belonging to each category. We save all these preferences

in a secure database. This step generally happens at the time Alice registers on a social

network.

2. Verification Test: When Bob tries to send Alice a friend request, we pick a random

subset of items from Alice's preferences database and ask Bob to identify Alice's

preferences for these items. Bob then tries to answer as many of those questions as he can.

3. Result: The result of Bob's verification test is shown to Alice along with his profile

history. It is then up to Alice to accept or reject Bob's request.
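
The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the prototype used in the thesis: the dictionary that maps an item to like (True) or dislike (False), the function names pick_challenge and score_answers, and the sample items are all assumptions made for the example.

```python
import random

def pick_challenge(preferences, k, rng=None):
    """Step 2: pick a random subset of k items from the receiver's stored preferences."""
    rng = rng or random.Random()
    return rng.sample(sorted(preferences), k)

def score_answers(preferences, answers):
    """Return the fraction of challenged items that the sender answered correctly."""
    if not answers:
        return 0.0
    correct = sum(1 for item, guess in answers.items()
                  if preferences.get(item) == guess)
    return correct / len(answers)

# Step 1 (hypothetical data): Alice's stored likes (True) and dislikes (False).
alice = {"cricket": True, "chess": False, "rap": True, "italian food": False}

# Step 2: Bob is challenged on two randomly chosen items and guesses "like" for both.
challenge = pick_challenge(alice, 2, random.Random(0))
bob_answers = {item: True for item in challenge}

# Step 3: the score is only shown to Alice; the accept/reject decision stays with her.
result = score_answers(alice, bob_answers)
```

The sketch deliberately returns a score rather than a decision, mirroring the design choice that the final judgment is left to the receiver.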

2.6.1.2 Advantages

We list below, the distinct advantages of the proposed design.

• User preferences are generally not available online [love], and are therefore probably safe against

data mining of public records.

• The scheme gives minimal overhead to the person who receives the friend request (Alice

in this case).

• The scheme can be easily automated once the preference database is formed.

• The scheme is simple and easy for users to understand.

To test our approach we conducted a user study, which is described starting in

Section 2.7.

2.7 User study

To test the viability of our approach, we conducted a two-phase user study. In the first phase, a

pilot study was conducted on a group of 75 student volunteers aged between 19 and

28. Of the 75 participants, 49 were male and 26 were female. Monetary incentives were

provided to avoid the cold start. The questionnaire was prepared with eight categories, namely:

Sports, Video Games, Music, Hobbies and interests, Food, Movies, TV shows and Academic

subjects of interest. Each category comprises 12 to 14 items. Participants were asked to

respond with their likes and dislikes for each item within the given set of categories. In total, taste

preferences for 134 items were reported.

The aim of the pilot study was to gather gender-wise taste preferences of the participants. In

particular, we were interested in two things:

1) Commonly liked and disliked items and

2) Correlations among the liked and disliked items.

The questionnaire was prepared specifically with college students in mind and is demographic

in nature. Therefore some parts of the questionnaire can be changed to better suit the desired users. The

results on taste preferences were then used to prepare a questionnaire for the main (second

stage) user study.

Table 1: Most liked and disliked items for each category

Category       Most Liked                    Most Disliked

Sports         Cricket, Billiards            Swimming, Boxing

Video games    Age of empires                Arcade

Interest       Books, Computers, Politics    Fashion designing, jewelry, religion

Music          Rap                           Heavy metal

Food           Fruit juices, Home food       Italian

Table 1 shows the most liked and disliked items in each category. These

items are easily predictable since they are liked or disliked by most of the participants. We

therefore eliminated them from the second stage questionnaire. For example, in the sports

category, 87.09% of the participants liked cricket, which is unsurprising since the user study took place

in India, while 85.62% of them disliked boxing, which is also a reasonable estimate for the Indian

public.
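
The elimination step can be illustrated with a short Python sketch. The thesis does not state the cutoff used to decide that an item is "easily predictable", so the 0.8 threshold below is purely an assumption, as are the function name drop_predictable and the treatment of every participant who did not dislike an item as having liked it.

```python
def drop_predictable(like_rates, threshold=0.8):
    """Keep only items on which participants do not overwhelmingly agree.

    like_rates maps an item to the fraction of participants who liked it.
    An item is dropped when at least `threshold` (an assumed cutoff) of the
    participants agree, whether they agree on liking or on disliking it.
    """
    kept = {}
    for item, rate in like_rates.items():
        majority = max(rate, 1.0 - rate)
        if majority < threshold:
            kept[item] = rate
    return kept

# Like rates taken from the figures reported in the text; the boxing rate is
# derived from its 85.62% dislike rate under the assumption stated above.
rates = {"cricket": 0.8709, "boxing": 1 - 0.8562, "chess": 0.5094}
remaining = drop_predictable(rates)  # only "chess" survives the filter
```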

Figure 6 shows a graph that has been split into two parts: the left part shows the liking for each

item in the sports category and the right part shows the liking for the items in the interest category. We

can observe that items like volleyball and basketball are liked equally by the participants.

Similarly, items like sleeping and travelling received a similar number of likes.

Figure 6: Liking for the items belonging to a) Sports category b) Interest category

To summarize, after the first stage, we were able to eliminate items that can be easily guessed.

The academic subjects of interest category was removed since it was easy to guess based on the

participants' background. We also eliminated 12 items from other categories. As a result,

42 items were removed from the first stage questionnaire, leaving 92 items.

We then combined the items that have similar liking or disliking responses. Our motivation for

doing so is to improve usability as well as security. To make this clearer, let us take as an

example the items chess and carom from the sports category. We found that 50.94% of

participants liked chess while 48.16% of participants liked carom. If we combine these two in

a single question and ask the user which one of them, or both, he likes, then the user

does not feel the burden of answering two questions. Security is also improved in the sense that

attackers cannot easily identify whether the legitimate user likes one, both or neither of these

two items (the number of possible answers per question rises from 2 to 4).
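
One way to read the security remark above is in terms of the number of possible answers an attacker has to choose from. The following Python sketch simply enumerates the like/dislike assignments for a single item and for a combined pair; the function name answer_space is an assumption made for the example, and the sketch treats every assignment as equally likely, which the thesis does not claim.

```python
from itertools import product
from math import log2

def answer_space(items):
    """Enumerate every like/dislike assignment for the given items."""
    return [dict(zip(items, choice))
            for choice in product([True, False], repeat=len(items))]

single = answer_space(["chess"])           # like or dislike
paired = answer_space(["chess", "carom"])  # one, the other, both or neither

# Under the equal-likelihood assumption, the guessing effort in bits is
# the base-2 logarithm of the number of possible answers.
bits_single = log2(len(single))
bits_paired = log2(len(paired))
```

A blind guess at a combined question therefore succeeds with probability 1/4 instead of 1/2, while the user still answers only one question.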

2.8 Second phase user study

At the end of the first stage user study, we were left with 92 items which, after combining, yielded 48

questions. We then introduced a new category, 'Personality', loosely based on the Big Five

personality factors [23] of a person, into the second phase.

This phase was conducted on a different set of 32 volunteers, of

whom 20 were male and 12 were female. These 32 volunteers were chosen in

such a way that for every participant there was a friend of the participant and a

complete stranger to the participant. The format of the user study was paper based. We asked

each user to fill in three questionnaires: one for herself, one for her friend and one for a stranger

whom she does not know personally. A sample of the questionnaire given to the

participants is shown in Table 2.

Table 2: Sample questionnaire for the second stage user study

Category    Items

Interest    Dancing Or Singing

            Travelling Or Sleeping

            Reading Or Writing

Participants were free to choose either or both items as likes. Items that are not selected are

considered as dislikes.

2.9 Results

We collected the forms from all the participants and cross checked the entries written for the

friend as well as the stranger against their original answers. The analysis was carried out in the

following manner. If the participant correctly guessed whether both items, or only one of the items,

in a row were liked, we awarded 2 points. If the participant guessed only part of the answer

correctly, i.e. the friend or stranger liked both items of a single row and the

participant guessed only one, we awarded 1 point. If neither of the items was correctly guessed,

we awarded no points. Our results show that a participant guesses correctly

about her friend 45.86% of the time, whereas she guesses correctly about a

stranger only 30.69% of the time. There is thus a distinct gap of 15.17 percentage points that distinguishes a stranger from a friend.

We can see the results in Table 3.

Table 3: Guesses of participants about their friends and strangers

Total no. of participants    Guess about Friends    Guess about Strangers

32 (20 male + 12 female)     45.86%                 30.69%
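
The scoring rule described in this section can be sketched as follows. This is a hedged illustration rather than the exact procedure of the study: the text does not spell out every partial case (for example, a guess of both items when only one is liked), so the sketch assumes that any overlap short of an exact match earns 1 point, and it assumes that the reported percentages are points earned divided by the maximum possible points. The function names score_row and percent_correct are hypothetical.

```python
def score_row(actual, guessed):
    """Score one two-item row; both arguments are sets of liked items."""
    if guessed == actual:
        return 2  # the guess matches the original answer exactly
    if guessed & actual:
        return 1  # assumed: any partially correct guess earns one point
    return 0      # nothing guessed correctly

def percent_correct(rows):
    """rows is a list of (actual, guessed) pairs; returns points earned as a percentage."""
    if not rows:
        return 0.0
    earned = sum(score_row(actual, guessed) for actual, guessed in rows)
    return 100.0 * earned / (2 * len(rows))

# Hypothetical answers for the three rows of Table 2.
rows = [
    ({"dancing", "singing"}, {"dancing", "singing"}),  # exact match: 2 points
    ({"travelling", "sleeping"}, {"sleeping"}),        # partial match: 1 point
    ({"reading"}, {"writing"}),                        # wrong guess: 0 points
]
overall = percent_correct(rows)  # 3 of 6 possible points
```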

After obtaining the overall result of how well friends can be distinguished from

strangers, our next aim was to find out in which categories the guesses for friends and

strangers differed the most. Figure 7 shows the percentage difference

between the guesses for friends and for strangers in each category.

Figure 7: Prediction difference by participants for friend and a stranger

As we can see from the graph, the personality category shows the maximum difference, 21.43%,

between a friend and a stranger. This is to be expected, since the personality of a person is better

known to their friends than to a stranger. The category with the next largest difference in guesses

is interests, with a 17.43% difference between a friend and a stranger.

In contrast, the two categories where the guesses about friends and strangers coincide are

music and movies. We can thus say that these two categories are not good

discriminators between friends and strangers.

We may conclude that our approach of distinguishing friends from strangers using like and

dislike preferences will work, given that there exists a distinct gap of 15% or more between the guesses

for friends and those for strangers.

Chapter 3

Sammelan: Secure Communities of Shared Interests

3.1 Related Work

To the best of our knowledge there is no prior work which discusses the community membership

problem. The community membership problem is to decide whether a user really belongs to the

community, or whether he may become a good member of the community. To date, research

has focused on privacy issues concerning the personal data held on social networking

sites.

Challenge questions have become an important secondary mechanism of authentication [17]. This

is a form of knowledge response scheme which is being used for fallback authentication in case

the user forgets his/her password. The assumption in this scheme is that querying for known

information will be more useful than querying for memorized information. This paper shows that

alternative authentication schemes can be used in place of conventional authentication schemes.

Following this paper we can infer that knowledge testing mechanisms can be used for

membership problems.

Cliff et al. [18] proposed a scheme to create a neighborhood for the user. In this scheme

profile elements are used as signals. The authors show empirically that certain profile

fields are better predictors of friendship/relationship. These profile fields are termed

signals. Based on these signals a neighborhood is defined for a user. This same neighborhood

may be used to define, in loose terms, a group or a community for the user. This kind of

community is personal or customized to the user. The paper does not discuss the application of

this scheme to the group/community membership problem.

Recommendation systems and social navigation systems can be of great help to the user by

narrowing down the wide variety of choices he has and suggesting the navigation path that

matches his profile. However, the same help is not available to

community/group owners.

Philip et al. [5] proposed a scheme to improve online SNS recommendation systems. In this

paper profile similarities like age, gender, profession and hobbies were used to cluster together

users having the same interests and tastes. Rating overlaps were used as the training model

for prediction. Feedback in the form of a questionnaire was taken from the user to correct the

prediction model. This work may be extended to recommend groups to the user, but the authors did

not explore this possibility.

Michael et al. [30] discuss a shared knowledge scheme that allows a user to be a part of a certain group. Their observation is that social cliques have similar regions of knowledge. The paper explores the possibility of using this observation to create ad-hoc groups or communities, restricting resources to a specific section of users. Guard questions are designed for each resource and act like a lock on that resource. All the users who know the answer form a virtual group/community with access privileges to that resource. In this method the owner has to put in the effort of designing questions for each resource. It is difficult to deploy this method on a large scale, because the effort and complexity of designing the guard questions grow with the system.

The user’s trust in other community members and the community’s information sharing norms have a negative impact on community-specific privacy. Oden and Sunil [22] discuss the effect of privacy on the user’s participation in online communities. According to the paper, the user’s privacy concern is inversely proportional to the amount of contribution he makes, and this makes the user more restrictive.

Spamming has taken new forms in recent times, and online community users are becoming its new victims. Social networking spam is directed at users of internet social networking services such as MySpace, Facebook or LinkedIn [27]. Messages with embedded links to commercial sites or to other SNS are an increasing nuisance. Spammers use the tools of the SNS, or the users of certain groups, to post their messages, and they do so by impersonating legitimate users. Since some social networking sites permit user-developed applications to be deployed on the site, spammers also use this feature to build applications which collect information from the user’s profile and spam their inboxes.

There is an alarming rise in attacks on users of social networking sites by spammers and malware writers. Reports show that the focus of cyber criminals has shifted to social networking users, in the form of spam and malware propagation [20]. There has been a rise of 70 percent in users reporting spam through their social networking sites and a rise of 69 percent in users complaining about malware attacks from social networking sites.

The likelihood of infecting a computer is much higher when social networking sites are used as the medium than with conventional methods. Spamming through social networking sites was reported to be 10 percent more effective than other methods [32]. The same article says that exploiting social networking sites is not new. Reportedly, 350,000 spam mails claiming to be from Facebook flooded inboxes in recent times. These mails had malware attachments which tried to compromise the host systems.

3.2 Motivation

Social networking sites contain a wealth of information, and attackers seriously consider online communities as information harvesting grounds. Even though the privacy settings of the user’s profile keep naïve attackers at bay, these settings are not sufficient to stop smart attackers who exploit the freedom given in an online community. Users do not consider profile information leakage from the community a serious threat. There are many issues, such as privacy and spamming in online communities, which need to be taken care of. Once the attacker is a part of the online community, he has access to all the members who are interested in a particular topic, which he can use for financial gain. Through discussions and questions the attacker can learn more about the users of the community and their interests, fine tuning his attack. These risks and threats grow linearly with the size of the community. The quality of the interactions in the community is determined by the quality of its members. Activity in an online community can be kept high, or maintained well within the interest ranges of the users, by closely inspecting the posts of the members, but this is a tedious job. The job can be made easier by shifting the point of inspection from checking the posts to judging the users who join the online community. All this discussion boils down to a single problem, the community membership problem: the task is to decide whether a user really belongs to a community or not. The problem extends to users who are willing to be a part of the community but are not yet members. The classification can be based on the knowledge the user has about the community, or on his interest and the amount of effort he puts in to get into the community. Our focus is primarily on communities which are closed. The term ‘closed communities’ is defined in the next section.

3.3 Procedure

In this section we will explain various ways to handle the community membership related

problem.

3.3.1 Traditional Approach

This section describes the general procedure to create a community on a social networking site. We follow the Facebook model in this case. Only a member of the social networking site can create a community. In Facebook a community is termed a “Group”. The user is given an option to create the group in the Groups section. He is prompted for the group name and the category to which the Group belongs. The moderator should give a small paragraph describing the Group, which is useful for new users who wish to join. For example, if a user is a fan of Michael Jackson and wants to create a Group, he can give the group name as ‘Michael Jackson Fans Club’ together with a brief description of the group. Under the Group type he has to choose ‘Entertainment’, and the sub-category can be chosen as ‘celebrity’. Further, the user can give any links that belong to the community. The owner has to create a public face (profile) for the Group by giving his personal details. A public face is a stripped down version of his actual profile. He can be the moderator himself or make any other user the moderator of the Group. In the privacy settings of the group the moderator can choose the group to be ‘open’, ‘closed’ or ‘secret’. The definitions of open, closed and secret groups are as follows:

Open: Anyone can join and invite others to join. Group info and content can be viewed

by anyone and may be indexed by search engines.

Closed: Moderator must approve requests for new members to join. Anyone can see the

group description, but only members can see the Wall, discussion board, and photos.

Secret: The group will not appear in search results or in the profiles of its members.

Membership is by invitation only, and only members can see the group information and

content.


The traditional approach for a user of a social networking site to join a community is very simple, and is explained in steps in Figure 8.

1. A user of the social networking site who wants to join the Group has to send a request to the moderator of the Group.
2. In this case, if Alice wants to join the Group she clicks “join the community”, which sends a request to the moderator of the community.
3. The moderator can either accept the request or reject it. The decision depends solely on the moderator; there is no procedure involved in deciding, the request is simply accepted or not.

Figure 8: Traditional approach of joining an online community

It is very easy for an attacker to get into the community and misuse its resources when the traditional approach is employed. In this case Alice, who is a user of the social networking site, may be a malicious person trying to exploit the community. To minimize the threats to the community, the traditional approach has to be modified. Changes are required both in the process of creating the Group and in the process of joining it. The aim of the modification is to make it difficult for the attacker to enter the Group.

The difficulty for the moderator is to decide whether a user should be accepted into or rejected from the community, and how to judge the user. If Alice sends a request, it is very difficult for the moderator of the group to judge whether he is adding the right person to the community. One way to solve this problem is to use a challenge response scheme.

3.4 Challenge Response Scheme

Challenge response schemes are used for fallback authentication in cases where users forget their passwords [16]. In a challenge response scheme, the system verifies whether the user remembers the answers to questions that were asked at the time of registration with social networking sites like Facebook or Orkut. The questions are mostly about the personal life of the user, e.g. mother’s maiden name, pet’s name, date of birth, etc. It has been believed that the answers to these questions are not available in public records and that only the legitimate person knows them. This method is the basis for our approach. If a person is interested in or belongs to a Group, then he should be able to clear the Challenge given to him. The scope of this challenge is confined to the Group.

Verification of the user with a Challenge Response scheme can be done in the following ways:

1. Ask the user questions related to the community he/she wants to join.

2. Present the user with a set of ‘keywords’, ask which of them belong to the community, and verify his answers.

We describe both approaches and discuss which one is better.

3.5 Naive Approach 1

In this approach the moderator creates a Group as in the traditional approach. After creating the Group, however, the moderator has to design a set of questions which are based on and related to the Group. If a user, in this case Alice, wants to join the Group, she has to follow the steps below. This is shown in ‘Figure 9’.

1. Alice first sends the request to “join the Group”.
2. Alice receives a set of questions that the moderator prepared at the time of creating the community.
3. The questionnaire is a random subset of the total pool of questions prepared by the moderator, for example 15 questions chosen from a set of 50.
4. Alice replies to the questions given.
5. The moderator validates the answers to the questions.
6. The moderator then replies to the user with either ACCEPT or REJECT based on the answers to the questions.
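The random selection of the questionnaire can be sketched as follows. This is a minimal illustration and not the thesis implementation: the function name `draw_questionnaire` and the placeholder question pool are assumptions, and only the sizes (15 questions out of 50) come from the text.

```python
import random

def draw_questionnaire(pool, k=15, rng=None):
    """Pick k distinct questions at random from the moderator's pool."""
    rng = rng or random.Random()
    if k > len(pool):
        raise ValueError("pool is smaller than the requested questionnaire")
    return rng.sample(pool, k)

# Hypothetical pool of 50 moderator-written questions.
pool = [f"Question {i}" for i in range(1, 51)]
questionnaire = draw_questionnaire(pool, k=15, rng=random.Random(0))
```

Sampling without replacement means that no question is repeated within a questionnaire, while two applicants are unlikely to see the same set.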


Figure 9: Naive approach 1 for adding a user into the community

One of the problems with this approach is that the moderator has to check the answers of each and every user. The complete burden of validating the answers is on the moderator. This task may seem practical and simple at first glance, but as the number of user requests increases it becomes extremely tedious.

For example, let us take a Group on ‘Mahatma Gandhi’. If Alice wants to join the Group, she would get questions such as:

Q1) What is the name of Mahatma Gandhi’s mother?

A1) ______________

Q2) Mahatma Gandhi was born on

A2) ______________

Q3) In India, Mahatma Gandhi is popularly known as

A3) ______________

Alice will reply to the questions, and the moderator will validate the answers. If the moderator feels that Alice has fared well enough in the challenge, with a good number of correct answers, and thinks that Alice is eligible to join the Group, he will send an ACCEPT message to her. Otherwise he will reply to Alice with a REJECT message.

3.6 Naive Approach 2

To overcome the problem of checking the answers by hand every time, we used the approach of checking the answers automatically. The moderator gives a set of questions and their respective answers at the time of creation of the Group. The procedure is explained in ‘Figure 10’, and the steps involved are as follows:

1. Alice first sends the request to “join the Group”.
2. Alice receives a set of questions that the moderator prepared at the time of creating the community.
3. The questionnaire is a random subset of the total pool of questions prepared by the moderator.
4. Alice replies to the questions given.
5. The system validates Alice’s responses. A response is considered correct only if it matches the moderator’s answer exactly.
6. If the number of correct replies from Alice exceeds a predefined threshold T, the system replies with an ACCEPT message; otherwise a REJECT message is sent to her.

Figure 10: Naive approach 2 for adding a user into the community


Let us take a Group, ‘Cricket’. The following questions designed by the moderator can be taken as an example:

Q1) Who has taken the highest number of wickets in Test cricket?

A1) _____________________

Q2) Who is the only batsman having the ‘highest average’ in both Test and ODI forms of cricket?

A2) _____________________

Alice will reply with the answers. Let us assume that Alice replied with A1) “Muttiah Muralitharan” and A2) “Sir Donald Bradman”.

If the number of correct replies exceeds the threshold T, Alice passes the test and is added to the Group with an ACCEPT message. However, the form of a user’s answer may differ considerably from the form of the moderator’s answer, even when the difference is only syntactic. In the above example, a reply of “Muralitharan”, “Muttiah” or “Mularidaran” should be treated as valid, but the system considers any reply other than “Muttiah Muralitharan” invalid because of its syntactic difference. The system can only test for syntactic equality, whereas what is needed is a test of semantic similarity.
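The weakness of exact matching can be seen in a small sketch. The function name `grade_exact`, the dictionary layout and the threshold value of 1 are assumptions made for illustration; the answers are the ones from the Cricket example above.

```python
def grade_exact(responses, answer_key, threshold):
    """Count responses that match the moderator's answers exactly,
    then accept only if the count exceeds the threshold T."""
    correct = sum(
        1 for qid, expected in answer_key.items()
        if responses.get(qid) == expected
    )
    return "ACCEPT" if correct > threshold else "REJECT"

# Answers as stored by the moderator.
answer_key = {"Q1": "Muttiah Muralitharan", "Q2": "Sir Donald Bradman"}

# Full name, identical to the stored string: both answers count.
full = grade_exact(
    {"Q1": "Muttiah Muralitharan", "Q2": "Sir Donald Bradman"},
    answer_key, threshold=1)

# Same person in a shorter form: rejected only because the strings differ.
short = grade_exact(
    {"Q1": "Muralitharan", "Q2": "Sir Donald Bradman"},
    answer_key, threshold=1)
```

Here `full` is accepted and `short` is rejected, although both users clearly know the answer.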

To eliminate this problem we designed the set of questions with multiple choice answers. For each question we gave 4 options, one of which is the correct choice. This method greatly reduced the problem caused by syntactic differences in the answers, but it considerably increased the effort of the moderator/Group’s owner. Not only does he have to design the questions, he also has to design equally good answer choices for them. The problem with bad answer choices is that the user may find the correct answer by elimination. The answer choices should therefore be equally plausible to a user who does not know the answer, while a user who knows the answer should be able to pick it easily.

To reduce the complexities involved in the multiple choice format we improved the method further. The improved method is discussed in the next section.

3.7 Sammelan

In our approach the social networking site user who wants to create a community is asked for the same details as in the traditional approach: name, category, subcategory and description. Instead of designing questions and answers for testing knowledge, the user has to give a set of KEYWORDS at the time of creation of the community. This is the basic difference between this approach and the others. A keyword for the community is a word which characterizes the community or is closely related to it. Keywords can vary from a single short word to a sentence, but in general a keyword is one or two words which describe the community.

For example, let us take a community “C++ queries”. This community comes under the category ‘education’ or ‘computer and internet’. The category under which a moderator places the community is entirely up to him. After placing the community under a certain category he is asked to give the keywords that describe the community in the best possible way. These keywords should look like random words to a user who does not know about C++. The keywords could be “OOPS”, “PURE”, “STROUSTRUP”, “MALLOC”, “NEW”, etc. Such keywords are likely to be recognized only by a person who knows C++.

This approach is explained in “Figure 11”. Suppose Alice wants to enter the community “C++ queries”.

1. She sends the request to join the community.
2. Alice is presented with a random set of keywords containing both keywords that belong to the community and keywords that do not.
3. About 30 keywords are shown to Alice, and out of these 30 she has to pick the correct ones.
4. Alice submits the chosen keywords and the system automatically checks them against the community’s keywords.
5. If the number of correct keywords from Alice exceeds a predefined threshold T, the system replies with an ACCEPT message; otherwise a REJECT message is sent to her.

Figure 11: Our Approach for a User to join into the community

3.8 Method

In this section we elaborate on the method used in our improved final approach. We describe a method to test the knowledge of a new user and his willingness to join the community.

Social networking sites like Facebook, MySpace, LinkedIn, Orkut etc. have a large number of communities. Some of them are private, a few are secret and the rest are public. The privacy concerns of the moderators of private and secret communities are high. The moderators of a public community publish the posts of their community openly. To maintain integrity among the members and the quality of the posts in a private community, the moderator has to take extra care over the members who join his community.

Many times users join a community just because they are attracted by its exciting name. Such users are neither interested in the activities of the community nor do they know about the community. Our basic idea is to filter out the users who are not interested in the community. This filtering is done at the time when the user tries to join the community.

The following two types of users send requests to join a community:

A user who is interested in the community.

A user who is not interested in the community.

The underlying assumption for this classification is that there is a considerable difference between the two types in their level of knowledge about the community. Our model works as a knowledge based authentication scheme. The model of the knowledge based group is as follows.

3.8.1 Moderator’s side

The moderator creates the community by giving it a name, a category and a brief description.

The moderator should create at least 25 keywords/tags that belong to the group.

The more keywords there are, the more difficult it is for an attacker to guess them.

He has to take care that the keywords are semantically disjoint.

“Figure 12” is a screenshot of the above described steps.

Figure 12: Moderator creating the community and entering keywords
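A check of the moderator’s keyword list at creation time might look like the sketch below. The names `validate_keywords` and `normalize` are hypothetical, and the check only removes duplicates that differ in case or spacing; it does not test whether keywords are semantically disjoint, which still needs the moderator’s judgment. Only the minimum of 25 keywords comes from the text.

```python
MIN_KEYWORDS = 25  # minimum number of keywords stated in the text

def normalize(keyword):
    """Collapse whitespace and case so 'Malloc ' and 'MALLOC' count once."""
    return " ".join(keyword.split()).upper()

def validate_keywords(keywords):
    """Return the de-duplicated keyword list, or raise if too few remain."""
    unique = []
    seen = set()
    for kw in keywords:
        norm = normalize(kw)
        if norm and norm not in seen:
            seen.add(norm)
            unique.append(norm)
    if len(unique) < MIN_KEYWORDS:
        raise ValueError(
            f"need at least {MIN_KEYWORDS} distinct keywords, got {len(unique)}")
    return unique
```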

3.8.2 User’s side

The user who wants to join the community is given a grid of 30 keywords.

Of these keywords only 7 belong to the community. The rest are random keywords which belong to other communities of the same category.

The user has to choose the correct keywords.

“Figure 13” is a screenshot of the above described steps.

Figure 13: User choosing the keywords corresponding to the community
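The construction of the keyword grid and the scoring of a selection can be sketched as follows. The function names `build_grid` and `score` and the data layout are assumptions for illustration; the sizes (30 keywords shown, 7 of them correct, at most 7 selectable) are taken from the text.

```python
import random

GRID_SIZE = 30       # keywords shown to the user (from the text)
CORRECT_IN_GRID = 7  # keywords that belong to the community (from the text)

def build_grid(community_keywords, decoy_keywords, rng=None):
    """Mix 7 community keywords with 23 decoys from the same category."""
    rng = rng or random.Random()
    community = set(community_keywords)
    decoys = [kw for kw in set(decoy_keywords) if kw not in community]
    correct = rng.sample(sorted(community), CORRECT_IN_GRID)
    wrong = rng.sample(sorted(decoys), GRID_SIZE - CORRECT_IN_GRID)
    grid = correct + wrong
    rng.shuffle(grid)
    return grid, set(correct)

def score(selection, correct):
    """Percentage of the 7 correct keywords that the user picked.
    Selecting more than 7 keywords is refused."""
    chosen = set(selection)
    if len(chosen) > CORRECT_IN_GRID:
        raise ValueError("too many keywords selected")
    return 100.0 * len(chosen & correct) / CORRECT_IN_GRID
```

Drawing the decoys from communities of the same category is what makes elimination harder, so `decoy_keywords` is assumed to be supplied by the site from that category.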

3.8.3 System’s side

The threshold T of correct answers required for a community depends on the degree of flexibility that the moderator desires.

We took the range between (TA-5)% and (TA+5)% as the secure zone for the threshold T, where TA is the average score of all the users in the user study.

For a user who is not interested in the group, it would be very difficult to enter the community by clearing the knowledge based test. A user who is really interested in joining has to learn the basics about the community and pass the test. Because the threshold T is taken near the average score for a particular community, it adapts to changes in the scores of the members of that community.
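The threshold band and the accept/reject decision can be sketched as below. The function names and the sample scores are hypothetical; the margin of 5 percentage points around the average TA and the rule that a score must exceed T come from the text.

```python
from statistics import mean

def threshold_band(scores, margin=5.0):
    """Return (TA - margin, TA + margin), where TA is the average score in percent."""
    ta = mean(scores)
    return ta - margin, ta + margin

def decide(user_score, threshold):
    """ACCEPT if the user's score exceeds the threshold T, else REJECT."""
    return "ACCEPT" if user_score > threshold else "REJECT"

# Hypothetical scores (percent) from earlier test takers of one community.
scores = [70.0, 80.0, 60.0, 90.0]
low, high = threshold_band(scores)  # TA = 75.0, so the band is (70.0, 80.0)
```

A moderator who wants a stricter community would pick T near `high`, and a more flexible one would pick T near `low`.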

3.9 Security Analysis

Communities are vulnerable to attacks by malicious users. Recently attackers have turned to the communities of social networking sites, as they are easy to get into and many users are not concerned about attacks within communities. The majority of social networking site users are unaware of the threats in communities.

In social networking sites, a malicious user of a community is a user who tries to exploit the resources of the community or disturbs its activities. The disturbance can take the form of spam messages, propagation of malware, interruptions and random posts in the discussions, sending false data, etc. For example, all the community members might receive a mail from the attacker saying “Hi, you have won a lottery of $1000. Please click the below link to complete the procedure.”

Attackers can be broadly classified as:

1. Passive Attacker: A passive attacker is a stealthy attacker who joins the community and steals users’ data if their profiles are kept public. He can study the trends in the community by following the kinds of discussions that are going on. A passive attacker is difficult to detect.

2. Active Attacker: An active attacker, after joining the community, disturbs it regularly by sending spam messages, posting malicious links and uploading malicious applications.

For any kind of attacker, the first requirement is membership of the community. Whenever an attacker wants to join the community, our approach requires him to choose the correct keywords. The attacker can try to compromise the knowledge test built on our approach. The different ways in which he may try to bypass the test are explained below:

The attacker may guess the correct keywords by elimination. He may not know the keywords of the community he is joining, but may know that the other randomly given keywords do not belong to it.

This attack depends on the breadth of the attacker’s knowledge. The randomly given keywords were limited to the same genre as the community being joined, which reduces the effectiveness of this attack considerably. The purpose of drawing random keywords from the same genre is to confuse an attacker who is trying to find the correct keywords by elimination.

Another technique by which the attacker can try to crack the knowledge test is to select all the keywords given to him.

Our implementation does not give the attacker a chance to choose all the keywords. All users are asked to choose no more than the required number of keywords.

Another kind of attack is one where the attacker guesses the keywords within the limit on the number of keywords that may be chosen.

This guessing attack is similar to a brute force attack. The probability that the attacker succeeds in breaking the knowledge test is very low.
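The chance of passing by blind guessing can be estimated with a short calculation. The model is an assumption, not a result from the thesis: the attacker picks 7 of the 30 keywords uniformly at random, so the number of correct picks follows a hypergeometric distribution. The pass mark of 5 correct keywords is also hypothetical.

```python
from math import comb

def p_exactly(k, grid=30, correct=7, picks=7):
    """P(exactly k correct) when picking `picks` keywords uniformly at random."""
    return comb(correct, k) * comb(grid - correct, picks - k) / comb(grid, picks)

def p_at_least(k, grid=30, correct=7, picks=7):
    """P(at least k correct) under the same blind-guessing model."""
    upper = min(correct, picks)
    return sum(p_exactly(i, grid, correct, picks) for i in range(k, upper + 1))

expected_correct = 7 * 7 / 30  # about 1.63 of 7 keywords, roughly 23%
p_pass = p_at_least(5)         # 5475 / 2035800, roughly 0.27%
```

Under these assumptions a blind guesser averages about 23% and reaches 5 or more correct keywords in fewer than 3 attempts out of 1000. An attacker who can eliminate some decoys does better than this model suggests.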

From the above discussion we can infer that bypassing the community knowledge test is difficult

and the communities based on our approach are safe from the attackers.

3.10 User Study

We conducted a user study to validate our proposed method for communities. The study had 54 volunteers, of whom 20 were female and 34 male. The educational and professional backgrounds of the participants were diverse, and their ages ranged from 19 to 31 years. Monetary incentives were provided for the pilot study to avoid a cold start. The aim of the user study was to measure statistically the strength of our hypothesis. As a part of the study we also took feedback from the users on the system’s usability.

The user study involved three sessions, which each participant completed in a single day. The gap between sessions was one hour. The details of the sessions are shown in Table 4.

Table 4: Details of the sessions

Session 1    Training and community setup
Session 2    Test
Session 3    Questionnaire

We first mailed the web based prototype to the participants, with detailed instructions on the usage of the system. In the first session, 30 randomly selected participants created communities by adding a name, a brief description and a list of keywords. Each creator chose the keywords on a personal basis. However, we filtered the keywords to ensure that all of them were valid and made sense for the particular community.

In the second session, every participant took the knowledge test for two communities: one in which he was interested and another in which he was not particularly interested.

Figure 14: Graph showing difference in average scores of Interested and not interested

users

The keywords of the community were presented to the user in the format of a quiz, so that the user would also be curious to find out how much he knows about a particular community. The user could select only 7 keywords out of the given 30. Figure 14 shows the difference between the interested and the not interested users.

3.11 Results

All the participants successfully completed the above mentioned sessions. On average, users interested in the community answered the knowledge test with 73.42% accuracy, while the average accuracy of users who were not interested was 36.74%. There is thus a clear gap of 36.68 percentage points between the scores of interested and not interested users. These results are shown in Table 5.

Table 5: Scores of the knowledge test

User interested in community?    Knowledge test score
Yes                              73.42%
No                               36.74%

[Bar chart for Figure 14: average knowledge test score on a vertical axis from 0 to 80, with one bar each for Interested and Not Interested users]

In addition, existing loyal users of the community were asked to enter more related keywords. Replying to this request was optional. The reputation of a user was judged by the quality of his posts and his contribution to the community.

Keywords given by such a user are combined with the existing keywords and added to the keyword database. In this way we also ask the user to share knowledge about the group which other members might not have.

Figure 15: Graph showing the differences in the user's scores in knowledge test

Sharing knowledge and interacting regularly keeps the group active. Every person who wants to join the group gets a random set of keywords, and the variety of these sets grows as the number of keywords in the database increases.

3.12 System Usability Test

We performed a usability study based on the system usability scale. The System Usability Scale

(SUS) is a simple, ten-item scale giving a global view of subjective assessments of usability.

According to the standards suggested by ISO 9241-11 the measures of usability should cover the

following:

Effectiveness.

Efficiency.

User Satisfaction.

[Plot for Figure 15: vertical axis from 0 to 120, horizontal axis from 1 to 52]

The System Usability Scale is a Likert scale: a statement is made and the user is asked to rate it on a 5-point scale from 1 to 5, representing his degree of agreement or disagreement, from strongly disagree to strongly agree. The scores used for the opinion of the user are given below:

1. Strongly Disagree.

2. Disagree.

3. Neutral.

4. Agree.

5. Strongly Agree

A set of 10 standard questions were asked to the users to evaluate the usability of the system in

the form of feedback. The usability score (US) of the system is calculated using the formula

below:

Where
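For reference, the standard SUS scoring procedure can be sketched as below. It fits the ten items, the 1 to 5 response scale and the 0 to 100 range described here, but that the thesis applied exactly this formula is an assumption; the function names are hypothetical.

```python
def sus_score(responses):
    """Standard SUS scoring for ten responses on a 1-5 scale.
    Odd-numbered items contribute (r - 1), even-numbered items (5 - r);
    the sum is scaled by 2.5 to give a score between 0 and 100."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("expected ten responses, each between 1 and 5")
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

def mean_sus(all_responses):
    """Average SUS score over a list of participants."""
    return sum(sus_score(r) for r in all_responses) / len(all_responses)
```

In the standard questionnaire the odd-numbered statements are positively worded and the even-numbered ones negatively worded, which is why the two groups are scored in opposite directions.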

The study was done with 54 users. We obtained a System Usability Scale score of 67.52 out of 100, which we consider good enough for first time users. Figure 16 shows the users’ opinions of the system plotted on a 3-point scale.

Figure 16: Bar graph showing opinion of users on 3 point scale in System Usability Test


Figure 16 shows the answers to some of the questions put to the users: how many agreed that Sammelan is easy to use, and how many are confident using Sammelan, meaning that they feel more secure after using it.

Chapter 4

Conclusion

The popularity of social networking sites is increasing rapidly, and so are the privacy concerns associated with them. The main aim of the thesis is to work on a few of the many privacy issues present in social networking sites. In the thesis we have addressed two main privacy problems that social networking sites currently face.

The first is eliminating strangers from the friends list; in other words, ensuring that only one’s close friends are added to one’s friends list.

The second concerns communities and how community membership is given to a user: how only interested people can be admitted to the community and others kept out, since users who are not interested would be inactive in the community.

The first problem is preventing strangers from adding themselves to the friends list, or differentiating between strangers and friends. The most important part of a social networking site is the friends we choose to add to our friends list, because the privacy of our data depends on what type of friends we choose. If we choose friends who are well known to us then our personal data is safe, but if a stranger gets into the friends list then our personal data can be misused. For this we investigated a novel idea of distinguishing friends from strangers using a challenge response scheme. In addition, we used the big five personality traits to make it more difficult for a stranger to identify the user. We have shown the consequences of adding a stranger without verification and the applicable countermeasures. The results of the user study show that our proposed approach provides a viable option for privacy management in social networks.

The second problem concerns the type of people who join communities in social networking sites. Communities play a crucial part in the activities of social networking sites. Since communities reflect the user’s personality and characterize him, security and privacy in online communities are of prime importance. To solve the community membership problem, we use a knowledge based test. We have shown that people who do not belong to the community can be filtered effectively. Our results show that the knowledge test scores of interested and non-interested users are far enough apart to let us differentiate between them. In future, one may try to modify the approach to further reduce the effort required of the moderator in creating communities.

One can extend our present work on the first problem by differentiating between types of
friends, for example school friends, college friends, colony friends, colleagues, etc.

For the second problem, a possible extension would be to automate the process of keyword
generation in the community creation process. The effort required of an attacker to bypass the
test is not quantified in our work, and it would be worthwhile to quantify it.
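One simple starting point for quantifying the attacker's effort, offered here only as a sketch, is the probability that an attacker with no knowledge of the community passes by guessing. Assuming the test consists of n independent multiple-choice questions with m equally likely options each and a pass mark of t correct answers (these assumptions and the function name `guess_pass_probability` are ours for this example, not part of the evaluated scheme), the probability is the binomial tail sum over k = t..n of C(n, k) (1/m)^k (1 - 1/m)^(n-k).

```python
from math import comb


def guess_pass_probability(n_questions, n_options, min_correct):
    """Probability that uniform random guessing yields at least min_correct hits.

    Assumes independent questions, each with n_options equally likely choices.
    """
    if n_questions < 0 or n_options < 1 or not 0 <= min_correct <= n_questions:
        raise ValueError("invalid test parameters")
    p = 1.0 / n_options  # chance of guessing one question correctly
    return sum(
        comb(n_questions, k) * p**k * (1 - p) ** (n_questions - k)
        for k in range(min_correct, n_questions + 1)
    )


# Example: 3 questions, 4 options each, at least 2 correct needed to pass.
p_pass = guess_pass_probability(3, 4, 2)
```

Under these assumptions, the expected number of attempts before a guessing attacker passes is 1 / p_pass, which gives a rough measure of effort that a moderator could raise by adding questions or options.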

References

[1] Ahn, L. von, Maurer, B., McMillen, C., Abraham, D., and Blum, M. reCAPTCHA: Human-Based
Character Recognition via Web Security Measures. Science, September 2008.

[2] Arguello, J., Butler, B. S., Joyce, E., Kraut, R., Ling, K. S., Rosé, C., and Wang, X. 2006. Talk to me:

foundations for successful individual-group interactions in online communities. In Proceedings of the

SIGCHI Conference on Human Factors in Computing Systems (Montréal, Québec, Canada, April 22 -

27, 2006). R. Grinter, T. Rodden, P. Aoki, E. Cutrell, R. Jeffries, and G. Olson, Eds. CHI '06. ACM,

New York, NY, 959-968. DOI= http://doi.acm.org/10.1145/1124772.1124916.

[3] Baron, R. A. and Byrne, D. Social Psychology. Eighth edition. Prentice-Hall India.

[4] Bilge, L., Strufe, T., Balzarotti, D., and Kirda, E. 2009. All your contacts are belong to us: automated

identity theft attacks on social networks. In Proceedings of the 18th international Conference on World

Wide Web (Madrid, Spain, April 20 - 24, 2009). WWW '09. ACM, New York, NY, 551-560. DOI=

http://doi.acm.org/10.1145/1526709.1526784.

[5] Bonhard, P., Harries, C., McCarthy, J., and Sasse, M. A. 2006. Accounting for taste: using profile

similarity to improve recommender systems. In Proceedings of the SIGCHI Conference on Human

Factors in Computing Systems (Montréal, Québec, Canada, April 22 - 27, 2006). R. Grinter, T.

Rodden, P. Aoki, E. Cutrell, R. Jeffries, and G. Olson, Eds. CHI '06. ACM, New York, NY, 1057-1066.
DOI= http://doi.acm.org/10.1145/1124772.1124930.

[6] Boyd, D. M. and Ellison, N. B. Social Network Sites: Definition, History, and Scholarship. In:
Journal of Computer-Mediated Communication, Vol. 13, No. 1, October 2007, article 11.

[7] Butler, Brian S. Membership Size, Communication Activity, and Sustainability: A Resource-Based

Model of Online Social Structures. In: Journal of Information Systems Research, Vol. 12, No. 4,

December 2001, pp. 346-362 DOI: 10.1287/isre.12.4.346.9703.

[8] Community Hacking - A New Threat to the Internet.
http://hubpages.com/hub/Community-Hacking-social-bookmarking-sites-online-communities-social-networks.

[9] Facebook's New Privacy Problem: Groups Created Under False Pretenses.

http://blog.searchenginewatch.com/081222-080553.

[10] Facebook Statistics. http://www.facebook.com/press/info.php?statistics.

[11] Felt, A., and Evans, D. Privacy Protection for Social Networking Platforms. In Proceedings of Web 2.0

Security and Privacy. Oakland, CA. 22 May 2008.

[12] Grimmelmann JT. Facebook and the social dynamics of privacy. In: Iowa Law Review 95(4), May

2009, to appear. http://ssrn.com/abstract=1262822 (last access 17 October 2008).

[13] Gross, R., Acquisti, A., and Heinz, H. J. 2005. Information revelation and privacy in online social

networks. In Proceedings of the 2005 ACM Workshop on Privacy in the Electronic Society

(Alexandria, VA, USA, November 07 - 07, 2005). WPES '05. ACM, New York, NY, 71-80. DOI=

http://doi.acm.org/10.1145/1102199.1102214.

[14] Jakobsson, M., Stolterman, E., Wetzel, S., and Yang, L. 2008. Love and authentication. In Proceeding

of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in Computing Systems (Florence,

Italy, April 05 - 10, 2008). CHI '08. ACM, New York, NY, 197-200. DOI=

http://doi.acm.org/10.1145/1357054.1357087.

[15] Joyce, E. and Kraut, R. Predicting Continued Participation in Newsgroups. Journal of
Computer-Mediated Communication, 11(3):723-747, 2006.

[16] Just, M. 2004. On the Design of Challenge Question Systems. IEEE Security and Privacy 2, 5 (Sep.

2004), 32-39. DOI= http://dx.doi.org/10.1109/MSP.2004.80.

[17] Just, M. and Aspinall, D. 2009. Personal choice and challenge questions: a security and usability

assessment. In Proceedings of the 5th Symposium on Usable Privacy and Security (Mountain View,

California, July 15 - 17, 2009). SOUPS '09. ACM, New York, NY, 1-11. DOI=

http://doi.acm.org/10.1145/1572532.1572543.

[18] Lampe, C. A., Ellison, N., and Steinfield, C. 2007. A familiar face(book): profile elements as signals in

an online social network. In Proceedings of the SIGCHI Conference on Human Factors in Computing

Systems (San Jose, California, USA, April 28 - May 03, 2007). CHI '07. ACM, New York, NY, 435-444.
DOI= http://doi.acm.org/10.1145/1240624.1240695.

[19] Lucas, M. M. and Borisov, N. 2008. FlyByNight: mitigating the privacy risks of social networking. In

Proceedings of the 7th ACM Workshop on Privacy in the Electronic Society (Alexandria, Virginia,

USA, October 27 - 27, 2008). WPES '08. ACM, New York, NY, 1-8. DOI=

http://doi.acm.org/10.1145/1456403.1456405.

[20] Malware and Spam rise 70% on social networks.

http://www.sophos.com/pressoffice/news/articles/2010/02/security-report-2010.html.

[21] Maslow, A.H. A Theory of Human Motivation, Psychological Review 50(4) (1943):370-96.

[22] Nov, O. and Wattal, S. 2009. Social computing privacy concerns: antecedents and effects. In

Proceedings of the 27th international Conference on Human Factors in Computing Systems (Boston,

MA, USA, April 04 - 09, 2009). CHI '09. ACM, New York, NY, 333-336. DOI=

http://doi.acm.org/10.1145/1518701.1518754.

[23] Oliver P. John and Srivastava, S. The Big-Five Trait Taxonomy: History, Measurement, and

Theoretical Perspectives. In Handbook of Personality: Theory and Research (1999), pp. 102-138.

University of California at Berkeley.

[24] People Don't Read Privacy Policies… But Want Them To Be Clearer.

http://www.techdirt.com/articles/20090216/1803373786.shtml.

[25] Rabkin, A. 2008. Personal knowledge questions for fallback authentication: security questions in the

era of Facebook. In Proceedings of the 4th Symposium on Usable Privacy and Security (Pittsburgh,

Pennsylvania, July 23 - 25, 2008). SOUPS '08, vol. 337. ACM, New York, NY, 13-23. DOI=

http://doi.acm.org/10.1145/1408664.1408667.

[26] Sasse, M. A., Brostoff, S., and Weirich, D. 2001. Transforming the 'Weakest Link' — a

Human/Computer Interaction Approach to Usable and Effective Security. BT Technology Journal 19,

3 (Jul. 2001), 122-131. DOI= http://dx.doi.org/10.1023/A:1011902718709.

[27] Social Networking "SPAM". http://en.wikipedia.org/wiki/Social_networking_spam.

[28] Social Networking Sites for Grown-ups.
http://www.symantec.com/norton/products/library/article.jsp?aid=social_networking_sites_for_grown_ups.

[29] The psychology of Facebook

http://www.charlatan.ca/index.php?option=com_content&task=view&id=20014&Itemid=151.

[30] Toomim, M., Zhang, X., Fogarty, J., and Landay, J. A. 2008. Access control by testing for shared

knowledge. In Proceeding of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in

Computing Systems (Florence, Italy, April 05 - 10, 2008). CHI '08. ACM, New York, NY, 193-196.

DOI= http://doi.acm.org/10.1145/1357054.1357086.

[31] Vaidya, J. and Clifton, C. 2004. Privacy-Preserving Data Mining: Why, How, and When. IEEE

Security and Privacy 2, 6 (Nov. 2004), 19-27. DOI= http://dx.doi.org/10.1109/MSP.2004.108.

[32] Why social networking spam reaps more rewards than e-mail.
http://www.allspammedup.com/2009/11/why-social-networking-spam-reaps-more-rewards-than-email/.

[33] Yardi, S., Feamster, N., and Bruckman, A. 2008. Photo-based authentication using social networks. In

Proceedings of the First Workshop on online Social Networks (Seattle, WA, USA, August 18 - 18,

2008). WOSP '08. ACM, New York, NY, 55-60. DOI= http://doi.acm.org/10.1145/1397735.1397748.

[34] Yu, H., Gibbons, P. B., Kaminsky, M., and Xiao, F. 2008. SybilLimit: A Near-Optimal Social Network

Defense against Sybil Attacks. In Proceedings of the 2008 IEEE Symposium on Security and Privacy

(May 18 - 21, 2008). SP. IEEE Computer Society, Washington, DC, 3-17. DOI=

http://dx.doi.org/10.1109/SP.2008.13.

[35] Yu, H., Kaminsky, M., Gibbons, P. B., and Flaxman, A. 2006. SybilGuard: defending against sybil

attacks via social networks. SIGCOMM Comput. Commun. Rev. 36, 4 (Aug. 2006), 267-278. DOI=

http://doi.acm.org/10.1145/1151659.1159945.