Online identity: Giving it all away?

Information Security Technical Report 15 (2010) 42-46



S.M. Furnell

Centre for Security, Communication and Network Research, University of Plymouth, Plymouth, United Kingdom

E-mail address: [email protected]

doi:10.1016/j.istr.2010.09.002

Abstract

With a wealth of personal data now residing across various locations online, individuals can find themselves at increasing risk of too much information being exposed. This in turn may increase the potential for threats such as cyber-snooping, social engineering, and identity theft based upon the gathered details. In many cases the exposure occurs as a result of what individuals directly post about themselves on social networks and blog sites, whereas in some cases it happens thanks to other people posting things beyond their control. This paper examines the potential risks and some of the routes by which information might be harvested. It then proceeds to consider some of the potential consequences, presenting examples of how people can be duped using freely available information and how willingly they appear to expose it to others. Recognising the ease of online search, and the difficulty of reining back information once it is exposed, the requirement is clearly to improve user awareness and control over their data in the first instance.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

There are now significant opportunities for identities to be reconstructed from the fragments of information that can be found across various online sites and services. Careless and ill-considered disclosure of information online can consequently end up presenting a relatively unobstructed view for snoopers, tricksters and thieves. Moreover, many of the details that we find ourselves required to disclose as part of normal activities (e.g. registration on websites) can then sit there as a potential target for later harvesting (especially if, as on some sites, we are required to configure associated privacy options correctly in order to keep details from being on general view).

This paper considers the potential risks introduced by the over-exposure and easy accessibility of personal data. The discussion begins with an overview of selected threat categories, before proceeding to consider more specific aspects of the problem including the ease of locating potentially sensitive data, the frequent vulnerability of users, and their tendency to exacerbate this vulnerability by careless sharing of their personal details. Some of the resultant implications are then examined in terms of compromising other security controls and the difficulty in regaining control of personal information once it has been made available online.

2. Are we telling too much?

One of Berners-Lee’s original visions for the web was an environment in which individuals could publish information as easily as they could read it. While this did not immediately materialise in the formative years (with far more people simply being consumers of content), more recent times have seen the arrival of what became known as Web 2.0, with blogs, social networks, wikis and other outlets for interactive information sharing. With these sites and services now present in abundance, many people have taken the opportunity to share their thoughts, ideas, and indeed their lives with the rest of the online world. A good example here is provided by the growth of social networks. The most popular, Facebook, now has in excess of 500 million users (Facebook, 2010), and according to figures from June 2010 users in both the US and the UK are now spending an average of over 6 h per month using the service (representing an increase of almost 30 min per month for the UK users and over 80 min for the US subscribers compared to a year earlier) (BBC, 2010a). Unfortunately, however, this extra time has not been spent learning how to protect the personal information being disclosed. As a result, a significant risk is that people can end up saying too much.

Here, then, is the downside. Although the opportunity offered by Web 2.0 and beyond is immensely empowering, the resulting proliferation of personal and professional information may inadvertently lend support to a variety of undesirable scenarios. For example, each of the following threats can be significantly assisted by what gets posted online by (or about) the subsequent victims:

• Snooping and cyber-stalking. Here the availability of information about individuals can present opportunities for unwanted attention. For example, as a baseline there is the chance of plain and simple nosiness, with would-be snoopers being given access to far more information than they really ought to be entitled to see. At a more sinister level, the same routes may be used to spy upon people, with details being exposed that they might prefer to protect. As a simple experiment from the nosiness perspective, readers could try a search that targets old school friends, ex-workmates or current colleagues just to see the variety and volume of information that can be returned.

• Social engineering. In this context, the availability of information can be leveraged towards the exploitation of a target user. The things that people post online (or the things that others post about them) can give numerous insights into their background and activities, which might then be used to trick them into accepting a subsequent request or instruction from an attacker (be it online, by phone or in person). In many cases, the potential to avoid being duped rather depends upon people remembering what they have made publicly available, and not being convinced of someone else’s legitimacy just because they can quote such details back to them. Those who blog and tweet their every move may still be surprised at what other people know about them as a consequence.

• Identity theft and identity fraud. If enough information is out there then it can be pieced together to provide a sufficient basis for masquerading as the legitimate user. Acquiring a home address, a date of birth and a mother’s maiden name (all of which may be available with a bit of searching) could be sufficient to put someone in a position to take over an account or start the process of applying for new credit in the victim’s name. The problem has grown significantly in recent years, and the threat has undoubtedly been amplified by the wealth of information that is up for grabs online, either directly or via scams such as phishing. For example, findings from Javelin Strategy & Research suggest that over 11 million consumers fell victim to identity fraud in 2009, compared to almost 10 million in 2008 (although it is notable that traditional methods such as physical theft still accounted for nearly twice as many known cases as online attacks) (Miceli and Kim, 2010).
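To make the 'piecing together' point in the last item more concrete, the following minimal Python sketch takes an inventory of which personal attributes appear on which online sources and checks whether their union covers the three details named above (home address, date of birth, mother's maiden name). The source names and the inventory are invented for illustration, and treating those three details as the required set is an assumption drawn from the example in the text rather than the policy of any particular provider.

```python
# Hypothetical self-audit: which identity details become discoverable once
# fragments from several sources are pieced together?

# Assumption: the three details the text names as potentially sufficient.
REQUIRED_FOR_MASQUERADE = {"home_address", "date_of_birth", "mothers_maiden_name"}


def combined_exposure(sources):
    """Return the union of attributes exposed across all sources."""
    exposed = set()
    for attributes in sources.values():
        exposed |= set(attributes)
    return exposed


def missing_for_masquerade(sources, required=REQUIRED_FOR_MASQUERADE):
    """Return the required details that are not yet discoverable."""
    return set(required) - combined_exposure(sources)


# Invented inventory, for illustration only.
example_sources = {
    "social_network_profile": ["date_of_birth", "employer"],
    "electoral_roll_listing": ["home_address"],
    "family_tree_site": ["mothers_maiden_name", "sibling_names"],
}

if __name__ == "__main__":
    gaps = missing_for_masquerade(example_sources)
    if gaps:
        print("Still undisclosed:", sorted(gaps))
    else:
        print("All required details are discoverable from the combined sources.")
```

No single source in the invented inventory is sufficient on its own; it is only the combination that closes the gap, which is the point being made about fragments of information.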

With these problems in mind, it is worth considering how information might be gathered, and what may be out there to be found.

3. Digging around

Would-be data scavengers have a number of tools at their disposal. The most basic is the search engine, which can quickly reveal a wealth of details (as well as associated images should the latter be necessary for the intended scam). In addition, there are more specialised ‘people search’ tools (e.g. Pipl and 123people) that specifically aim to deliver results about target individuals. As an example, Fig. 1 shows the results of a Pipl search for me (using details that you might easily gather, or at least assume, from the author and affiliation details of this paper, namely ‘Steven Furnell’ living in ‘Plymouth, GB’). Much of the returned information does indeed relate to me (with the list going far beyond that shown in the screenshot), albeit with most of it relating to my professional activities and the only notable personal link returned being to an old entry on the electoral roll. Nonetheless, Pipl proved able to draw together quite a relevant summary from other sources that someone would otherwise have needed to harvest manually.

Of course, as an academic with a variety of publications and other external facing activities, the findings from a self-search are not likely to be representative of what would be found for others. With this in mind, I thought I would test the effectiveness using the name of my best friend from school, whom I have not seen for some 20 years. From a quick Pipl search using just his name (a very common first name and not a particularly uncommon surname) and the country, I was led directly to a link showing his entire family tree, publicly available online. This included dates of birth, dates of marriage, maiden names and sibling names: in short, a wealth of details that could help to inform identity theft or other forms of masquerade attacks (especially with a variety of these details being used in scenarios such as identity verification and password recovery questions, as discussed later).

The family tree example highlights the fact that users do not necessarily end up exposing themselves, but can easily find that other people have done it for them. There are numerous other examples that can illustrate this problem. For instance, we can consider the various people who elect to protect their privacy by entering a false date of birth on social networking sites, only to find their efforts undone when a slew of close friends then decide to post best wishes on their actual birthday. Similarly, people who have avoided putting photos of themselves online can find this undermined when friends and acquaintances still tag them in photos that they have uploaded. Meanwhile, people can also find themselves at risk from bloggers, and from personal experience I can recount instances of my travels and the birth of my son (complete with photo) having been reported on other people’s blogs without any prior checks being made with me.

In addition to opportunities to go after specific individuals, there can also be opportunities to harvest details about multiple users from specific sites. As an example, the news that the aforementioned Facebook service had reached 500 million active users was soon followed by news that details of 100 million of them had been collected and posted online in a single downloadable list (compiled by a security consultant using an automated script in order to highlight the privacy risk of the social networking site) (Emery, 2010). Although the collated details were all from publicly visible profile data, commentators observed that many users expose too much data in this context because of their difficulty with privacy settings. Indeed, Facebook had already faced earlier controversy thanks to changes it made to its privacy options (presenting users with highly granular controls that many were unable to understand or configure correctly), with the consequence that many users ended up inadvertently exposing their profile data to a wider audience than they intended (BBC, 2010b). While the published list did not directly include more obviously personal information such as email, phone and postal addresses, it did include the name, ID and URL of every searchable user’s profile; enough then to provide a summary of all the potentially exploitable profiles in one place.

Fig. 1. The results of a Pipl search using the author’s name and location.

4. Vulnerable targets

As previously indicated, one of the significant risks resulting from the abundance of available information is the reconnaissance opportunity that it offers to social engineers. Here the gathered details are then used in an attempt to convince or persuade the victim of the legitimacy of the person contacting them (i.e. cultivating the assumption that they must be genuine because they seem so well-informed). With this in mind, it is clear that by letting attackers know something about them, individuals become potentially vulnerable to crafted and targeted social engineering attempts. For example, information taken from a web page describing their personal interests may be exploited to promote trust based upon what Cialdini (2000) refers to as ‘liking and similarity’ (i.e. where the attacker exploits the fact that targets are more likely to respond to someone they like, or perceive to be similar to themselves; aspects that could clearly be mimicked if related information about the target’s interests and the like can be found online).

Part of the problem is that people posting their details online only tend to think about the reasons they want to share the information and the type of people they want to share it with. Significant eventualities may remain unforeseen in advance, as they simply do not recognise the potential for misuse if the same data were to sit in different hands. It is worth noting that this can often present a risk to the individual and the organisation in equal measure. The latter can most often occur when someone is posting information about their employment, which then (for example) conveys details that might allow them to be targeted from that perspective. In addition, they might also divulge details that the organisation would prefer not to have shared (e.g. details of workplace problems, or other internal matters). So, when looking at what someone has blogged or put on their social network page about their work, does it look like they have applied the litmus test of ensuring that they do not say anything more about themselves, their role and activities within the organisation than could be found on their employer’s own website? In many cases the answer will be ‘no’, and this consequently represents one area in which people can usefully be made more aware of data value and exploitability.
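The litmus test described above amounts to a simple set comparison, which the following Python sketch illustrates. It is only a rough illustration: the function names and the example details are hypothetical, and in practice deciding what counts as a 'detail' in a free-text post would need human judgement rather than exact string matching.

```python
def normalise(details):
    """Lower-case and trim each detail so comparisons ignore trivial differences."""
    return {detail.strip().lower() for detail in details}


def exceeds_employer_site(post_details, employer_site_details):
    """Return the details in a draft post that the employer has not published."""
    return sorted(normalise(post_details) - normalise(employer_site_details))


# Invented example data, for illustration only.
employer_site = ["Job title", "Department", "Office location"]
draft_post = ["job title", "Department", "internal project codename", "shift pattern"]

if __name__ == "__main__":
    extra = exceeds_employer_site(draft_post, employer_site)
    print("Details going beyond the employer's own website:", extra)
```

An empty result would suggest the post passes the test; anything returned is a candidate for removal before posting.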

Prior research has suggested that users are extremely susceptible to exploits that are able to leverage even a small amount of knowledge about their employer and/or workplace. For example, a study conducted by researchers at the University of Plymouth targeted staff within a participating organisation and used freely available information about the organisation to fabricate a message claiming to be from their IT department. This requested their cooperation to download a claimed software update, and therefore served to test whether the users concerned were vulnerable to accepting and running malware that may well have been distributed using the same approach. Having sent the message to over 150 people, 23% were witnessed to have attempted the download within the first three and a half hours (Furnell and Papadaki, 2008). The fact that the message to staff was deliberately worded and presented in a manner that ought to have raised suspicion (and yet snared a much greater number than apparently reported it to the IT department) lends further weight to the assertion that many users will require very little convincing in order to be converted from potential to actual victims.

5. Sharing without caring

Some people not only put too much information online, they also do things that increase their own risk of exposing it. For example, many users on social networking sites will admit to having accepted friend requests from relative (or total) strangers, and having shared their full profile data with them in the process. The practice is perhaps most pronounced amongst young people, where the objective is often not to maintain or re-establish contact with friends, but rather to collect more online acquaintances than their peers. Thus, issuing or accepting friend requests with strangers is in no way unusual and helps to enhance their apparent popularity rating. Indeed, a poll conducted amongst a group of 16 to 17-year-old pupils at an information security workshop revealed that 59% had invited someone they had never met to become their friend on a social network, and 47% had accepted a similar request received from an online stranger (Furnell, 2008). In addition, 28% had used the site to find things about other people (e.g. personal information or details of their activities), illustrating that if the opportunity for snooping is there then many people will take it.

In order to get a more practical measure of people’s tendency to give their data away, a further study from the University of Plymouth set out to determine the ease with which such data could be gathered (Phippen et al., 2009). The approach involved issuing friend requests to users on a social networking site in order to determine how many would accept such a request from a stranger and thereby share their own personal data. In order to provide a basis for inviting others, fake social network profiles were created for two characters, one male and one female, including a range of false personal information. While details such as phone number and address were checked to ensure that they were not real, the name of a real school was used as this enabled the social networking site to automatically suggest other potential friends who had attended the same place. With the fake profiles set to private (i.e. so that only people who were friends with them could see any significant details beyond the name and photo), invitations were then sent to 100 people per profile (50 to males and 50 to females). In order to improve the chances of soliciting interest, provocative pictures were assigned to both profiles (specifically, one depicting a bare chest and six-pack for the male and a large cleavage for the female). Neither photo included any headshot, and so all that the invitees would have seen would have been invitations from someone seeming to have attended the same school as them.

After a month, 57 of the requests issued on behalf of the female character had been accepted (48 of these acceptances occurring within the first 24 h), compared with 15 for the male character. Notably, there were no reminders following the original invitation, and both achieved a fairly well balanced mix of friends of each gender. As a consequence of the accepted invitations, the new friends were all sharing potentially harmful personal details with complete strangers. The majority were divulging birthday, relationship details, photos, activities and interests, and email contacts, with small proportions also sharing physical addresses, telephone numbers and employment details. It is also worth noting that in addition to the accepted invitations, the female character also received 29 friend requests of her own, suggesting the apparent ease with which users can be taken in by a good social engineering hook.
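For readers who prefer the figures as rates, the short Python sketch below recomputes them from the counts reported above (100 invitations per profile, 57 and 15 acceptances, 48 early acceptances). The function and variable names are illustrative only, and the calculation adds nothing beyond simple arithmetic on the published counts.

```python
def acceptance_rate(accepted, invited):
    """Fraction of invitations that were accepted."""
    if invited <= 0:
        raise ValueError("invited must be positive")
    return accepted / invited


INVITED_PER_PROFILE = 100  # 50 males and 50 females per profile, as reported

female_rate = acceptance_rate(57, INVITED_PER_PROFILE)
male_rate = acceptance_rate(15, INVITED_PER_PROFILE)

# Share of the female profile's acceptances that arrived within the first 24 h.
early_share = 48 / 57

if __name__ == "__main__":
    print(f"Female profile acceptance rate: {female_rate:.0%}")
    print(f"Male profile acceptance rate:   {male_rate:.0%}")
    print(f"Ratio (female to male):         {female_rate / male_rate:.1f}")
    print(f"Accepted within 24 h (female):  {early_share:.0%}")
```

On these counts the female profile was accepted nearly four times as often as the male one, with the large majority of its acceptances arriving within the first day.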

6. Answers to personal questions

Looking at the above results, some users would still see little problem, in the sense that direct security details such as account codes and passwords are not being disclosed. Indeed, it is sometimes difficult to appreciate why sharing a few tidbits of personal information can be all that problematic. The difficulty, of course, is the different contexts in which such information now gets used. For example, and as mentioned earlier, personal details are often employed as the basis for questions used in website password recovery and reset procedures. The risk here ultimately comes down to a combination of how well the security questions have been chosen and how much can be found out about the user in order to answer them. In many cases such challenges ask the user for one or more items of personal information, and so there is clearly a risk if the related answers can be researched from sources such as social networks, blogs, and genealogy services.

Choosing the questions that should form the basis of the challenges has always merited special consideration, in order to avoid basing them around information that would be known or easily discoverable by someone else, as well as to ensure that they require answers that the user would be likely to remember. As such, factual questions are typically preferable to those based around factors such as the user’s preferences or word associations (both of which may change over time, with the consequent risk that the user may forget how they originally answered when registering their responses) (Haga and Zviran, 1991). However, the factual responses are the ones that might be more easily researched from the user’s personal web pages etc., especially if they are prone to disclosing too much about themselves online.

With this in mind, we can compare the two sets of questions in Fig. 2. The first set comes from eBay, and it is clear that a number of them might be researchable from online sources. By contrast, the other set, taken from Windows Live, would appear to be less susceptible to compromise in this context. The user’s supposedly secret responses can be especially vulnerable if questions are framed around current information or interests, as these are more likely to be the types of details that users are prone to divulging in their social networks and blog pages. So, from the perspective of potential compromise, there is a tangible difference between asking questions such as ‘What is your pet’s name?’ and ‘Name of first pet’, with the former having a significantly greater likelihood of being discovered through users’ social networking postings than the latter (unless the user concerned is posting extensively about their life history etc.).

Fig. 2. Secret question options from (a) eBay and (b) Windows Live.
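The distinctions drawn in this section (factual versus preference-based questions, and current versus historical information) can be expressed as a crude screening heuristic. The Python sketch below is an illustration under stated assumptions rather than an established scoring scheme: the `SecretQuestion` type, the two flags and the 0 to 2 score are all hypothetical, and the flags assigned to each example question reflect one reading of the discussion above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecretQuestion:
    text: str
    factual: bool   # True for facts, False for preferences or word associations
    current: bool   # True if it concerns present circumstances or interests


def exposure_score(question):
    """Crude 0-2 score: higher means the answer is more likely to be found online."""
    return int(question.factual) + int(question.current)


def likely_memorable(question):
    """Factual answers are assumed stable; preferences may change over time."""
    return question.factual


# The two example questions contrasted in the text.
current_pet = SecretQuestion("What is your pet's name?", factual=True, current=True)
first_pet = SecretQuestion("Name of first pet", factual=True, current=False)

if __name__ == "__main__":
    for q in (current_pet, first_pet):
        print(f"{q.text!r}: exposure={exposure_score(q)}, "
              f"memorable={likely_memorable(q)}")
```

Both questions count as memorable under this heuristic, but the one about the current pet scores higher for exposure, mirroring the contrast between the two wordings.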

7. Conclusions

If sensitive data has found its way onto the web then the chances are that it is going to stay there, even if the user that originally posted it changes their mind about making it available. Efforts to retrospectively restore privacy can be frustrated by the persistence and sheer accessibility of online information; indeed the very factors that are so advantageous in other search contexts only serve to amplify the risk if something has gone awry. So, while information can often be removed from a site, it is far less likely to be removed from the web, and with search engines trawling cyberspace as a whole there is a good chance of it being located again.

With the above in mind, the solution is to try to control what gets out there in the first place. Control in this context should not mean restricting or filtering what users can do, but rather making them more aware of the onus upon them to police and regulate their own activities. Of course, the experiences to date clearly illustrate that many will not be in a position to do this without support. There is certainly a need to boost user awareness of data value when considering what and where they are willing to share. In addition, the technology needs to make it as easy for them to protect information as it does to publish it. This means privacy controls that are clearly and (where possible) consistently presented, so that users have a chance to learn good practice and then put it to use across the range of sites they use. Thus, as with many other aspects of security, it is the combination of technology and awareness that holds the key to addressing the problem.

References

BBC. The ups and downs of social networks. BBC News Online; 2010a. 22 July 2010.

BBC. Facebook privacy settings to be made simpler. BBC News Online; 2010b. 26 May 2010.

Cialdini RB. Influence: science and practice. 4th ed. Allyn & Bacon; 2000.

Emery D. Details of 100 m Facebook users collected and published. BBC News Online; 2010. 29 July 2010.

Facebook. Statistics - Press room, http://www.facebook.com/press/info.php?statistics; 2010 (accessed 27.07.10).

Furnell S. End-user security culture: a lesson that will never be learnt? Computer Fraud & Security; 2008:6-9. April 2008.

Furnell S, Papadaki M. Testing our defences or defending our tests: the obstacles to performing security assessment. Computer Fraud & Security; 2008:8-12. May 2008.

Haga WJ, Zviran M. Question-and-answer passwords: an empirical evaluation. Information Systems 1991;16(3):335-43.

Miceli D, Kim R. 2010 identity fraud survey report: consumer version. Javelin Strategy & Research; 2010. February 2010.

Phippen A, Davey R, Furnell S. Should we do it just because we can? Methodological and ethical implications for information revelation in online social networks. Methodological Innovations Online 2009;4(3):41-55.
