
Purposive Sampling on Twitter: A Case Study

Christopher Sibona, University of Colorado Denver, The Business School ([email protected])

Steven Walczak, University of Colorado Denver, The Business School ([email protected])

Abstract

Recruiting populations for research is problematic, and utilizing online social network tools may facilitate the recruitment process. This research examines recruitment through Twitter's @reply mechanism and compares the results to other survey recruitment methods. Four methods were used to recruit survey takers to a survey about social networking sites: Twitter recruitment, Facebook recruitment for a pre-test, self-selected survey takers, and a retweet by an influential Twitter user. A total of 7,327 recruitment tweets were sent to Twitter users; 2,865 users started the survey and 1,544 users completed it, yielding an overall completion rate of 21.3 percent. The research presents the techniques used to make recruitment through Twitter successful. These results indicate that recruitment through online social network sites like Twitter is a viable recruitment method and may be helpful for understanding emerging Internet-based phenomena.

1. Introduction

Surveys are a useful and efficient tool for learning about people's opinions and behaviors [7]. Surveying techniques have changed over the years in how they are conducted and in the modes that are available to researchers. Surveys can be conducted by phone, by mail, in personal interviews, and over the Internet. Survey recruitment and distribution techniques have changed over the last two decades to follow technology and application advancements. This case study outlines the approach of a large survey (N = 1,544) where the recruitment was conducted on Twitter. The approach adapts Dillman et al. [7] to the constraints of Twitter, where a survey recruitment message is limited to 140 characters. A comparison is made between four different methods of recruitment to the survey: Twitter-based recruitment, Facebook-based recruitment conducted for a pre-test, a tweet about the survey from an influential Twitter user (9,000 followers), and self-selected survey takers who took the survey with no intervention from the recruiters.

Recruitment through Twitter may be appropriate for many research questions as a way to gain insight into the behavior of users. There are many modes for recruiting survey participants, and Twitter may be a viable alternative to other methods. The case study discusses the advantages and disadvantages of using this approach and the lessons learned, and it details how recruitment was conducted for a study about unfriending on Facebook. Recruitment through Twitter will not replace probability-based sampling techniques, but it may be used in an exploratory manner to understand emerging phenomena.

2. Literature Review

2.1. Survey methodology

Survey methodology identifies the principles of design, collection, processing and analysis of surveys that are linked to the cost and quality of survey estimates [12]. Technology impacts the way surveys are conducted as new modes become available. Survey modes today include the phone, mail, personal interviews, and the Internet. Dillman et al. [7] discuss several changes to the survey process from the 1960s to the present day. Interviewers are less involved today than in the past, and that has lowered the sense of trust that survey respondents have in a survey because of the possibility that it is fake or potentially harmful. Less personal attention is given to survey respondents today because a common mode of reaching respondents is e-mail. Potential survey respondents have more control over access to their presence; technologies such as caller ID, call blocking, e-mail filters and spam filters make it increasingly difficult to contact respondents. Respondents can also exercise more control over the interaction: more disclosures are made to respondents, surveys have shifted to more technology-mediated modes (phone, mail, Internet vs. interview), and respondents can stop the survey process by hanging up the phone, not responding to mail or simply closing a web browser. Response rates to in-person interviews prior to the 1960s were often 70% to 90% and have decreased over time [7].



Members of a sample can be selected using probability or non-probability techniques [5]. Probability selection is based on random selection, where each member has a nonzero chance of being chosen. Random samples eliminate the element of personal choice in selection and therefore remove subjective selection bias [23]. Random sampling can strengthen the belief that the sample is representative of the study population because there is an absence of selection bias. Non-probability selections are a set of techniques where some element of choice in the process is made by the researcher.

Sampling techniques have impacts on a variety of attributes beyond the representativeness of the sample, including cost, time and accessibility considerations [5, 23]. Purposive sampling (e.g. judgment and quota) is an approach where members conform to certain criteria for selection. In judgment sampling a researcher may only want to survey those who meet certain criteria. Judgment sampling is also appropriate in early stages of research where selection is made based on screening criteria. Quota sampling is used to improve the representativeness of a sample by ensuring that the relevant characteristics are present in sufficient quantities [5, 23]. If the sample has the same distribution as the known population, then it is likely to be representative of the study population. Quota sampling is often used in opinion research because it often has reduced costs and requires less time than probability sampling [5, 23]. Researchers must often make decisions from a set of imperfect options about how to conduct their research based on the pros and cons of the options [12]. Researchers should choose the sampling technique they employ based on the subject population, the fit with their research constraints (time, money, etc.), the stage (confirmatory or exploratory) at which their research is being conducted, and the intended use of the results.

2.2. Data Collection Modes

Traditional approaches to surveying subjects include mailing paper questionnaires, calling subjects on the telephone and sending interviewers to a person's home, office or gathering place to administer the survey face-to-face [12]. Computer-based surveys have increased in usage and have evolved over time from sending computer discs in the mail to e-mail and web-based methods [7, 12, 6]. The various modes have different degrees of interviewer involvement, interaction with the respondent, privacy, media richness, social presence and technology use [12]. Andrews et al. [2] examined methods of posting to online communities to survey hard-to-involve Internet users.

One reason for the increasing use of technology for survey data collection is that past research into the differences between personal interviews and telephone interviews for large, randomly selected survey samples has shown little difference in results [12]. The choices involve trade-offs and compromises between the various qualities, and researchers need to determine which qualities they want and choose an appropriate method for their purposes [12].

2.3. Techniques to Increase Response Rates

Researchers have attempted to understand the general psychology of survey respondents to determine why some people will take a survey. Subjects can be motivated to complete a survey by both extrinsic and intrinsic factors. Market-based surveys sometimes focus on economic incentives for completed surveys; research into incentives has shown that small token cash incentives given in advance are often more likely to induce a response than more substantial payments after completion [7]. Dillman et al. [7] use social exchange theory as a framework to increase the likelihood of a response. In social exchange theory, a person's voluntary actions (costs) are motivated by the return these actions are expected to bring from others (rewards) [3]. A social exchange requires trust that the other party will provide the reward in the future, although not necessarily directly to the person.

Dillman et al. [7] provide many techniques that researchers can use to increase the response rate to a survey. They believe that researchers should provide information about the survey and how the results will be used. Asking for help or advice from the survey respondent can increase participation; it may encourage the person to respond because there is a social tendency to help those in need. Showing positive regard toward the respondent can encourage participation, e.g. signaling that the person's opinion is important. Support for a group's values can increase the response rate, e.g. respondents may respond more positively to student surveys if they support education. A well-designed, interesting survey can increase response rates, and surveys developed about salient topics can induce respondents to complete them. Providing social validation that others have responded to the survey encourages people to participate, e.g. stating "join the hundreds who have already given their opinions" in the request. Establishing trust is important in the social exchange so respondents can believe there is some benefit to completing the survey. One way to increase trust is to show sponsorship from a well-regarded institution, e.g. government agencies and universities.


Researchers can apply social exchange theory in multiple direct ways to encourage survey respondents to complete their survey.

2.4. Social Networks

boyd and Ellison [4] defined social network sites based on three system capabilities. The systems "allow individuals to (1) construct a public or semi-public profile within a bounded system, (2) articulate a list of other users with whom they share a connection, and (3) view and traverse their list of connections and those made by others within the system" [4, p. 211]. After users join a site they are asked to identify others in the network with whom they have an existing relationship. The links that are generated between individuals become visible to others in the local subspace. There are general social network sites (Facebook) and others that are content focused: the social network site LinkedIn is a network of professional contacts, and YouTube connects people who are interested in video content [4].

Twitter is a microblogging social network site started in October 2006 [15]. It is designed around the Short Message Service (SMS) and allows users to post messages limited to 140 characters [17, 15]. The terms follow and follower are used when a user subscribes to another person's tweets (i.e. A follows B when A subscribes to tweets from B, and A is then a follower of B) [15]. The relationship generally does not require agreement in the dyad: under the default privacy settings, B does not need to grant permission for A to follow B [19]. A minority of relationships on Twitter are reciprocal (22.1%), where reciprocity means that A follows B and B follows A [19]. Twitter users adopt the default privacy settings 99% of the time, which makes Twitter more open than many social networks such as Facebook [18].

Researchers have categorized Twitter users using a variety of methods and have developed labels for their classifications. Two research groups, Krishnamurthy et al. [17] and Java et al. [15], used similar labels to characterize three types of Twitter users. Krishnamurthy et al. [17] labeled these types broadcasters, acquaintances, and miscreants or evangelists; Java et al. [15] used similar categories with slightly different terms: information sources, friends and information seekers. Broadcasters/information sources have many more followers than they themselves follow. Acquaintances/friends tend to exhibit reciprocity in the relationship; they respond to the tweets of those they follow. Miscreants or evangelists/information seekers follow many people, some with the hope that they will be followed back.

Naaman et al. [22] classified users into two large categories: those who talk about themselves (meformers, 80% of Twitter users) and those who inform others (informers, 20% of Twitter users). Meformers talked about themselves in 48% of their tweets, and informers provided some level of information in 53% of their tweets.

Twitter research covers a variety of areas. Research interests include classification of Twitter users [17, 15, 22], conversation and collaboration characteristics [13, 14, 22], social network structure [17, 14, 19], the nature of trending topics and the diffusion of information [19], user ranking algorithms [19], politics [11] and privacy concerns [18].

3. Case Study on Twitter Recruitment Process

3.1. Recruitment selection

The general process for recruitment to the survey on Twitter began with a search for tweets that used specific words; the results were then screened for suitability (e.g. language and context), and an @reply was sent to request that the person take the survey. The research goal was to determine why Facebook users choose to unfriend others, so tweets were screened using three related search terms: unfriend, unfriending and defriend. Three terms were used because there is no single standard term for unfriending and there are geographical differences in the preferred term; for example, in the United States unfriend is more common, whereas in Britain defriend is more common. Both terms have the same meaning. The New Oxford American Dictionary1 defines the word as:

unfriend – verb – To remove someone as a 'friend' on a social networking site such as Facebook.

Once the set of relevant tweets was collected, they were screened for additional attributes. The survey was written in English, so recruitment requests were sent only for tweets written in English. Many languages borrow the terms unfriend, unfriending and defriend, and Twitter users who wrote in other languages were not invited to the survey. Additional screening was completed at the survey site itself to determine whether the person was 18 years of age or older and whether they had a Facebook account.
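Although the screening in this study was performed manually through the Twitter interface, the search-and-screen step can be illustrated programmatically. The following is a minimal sketch using the tweepy library's v2 client, which postdates the study (Twitter's API has changed substantially since 2010); the bearer token is a placeholder, and the final human review is still assumed, as in the original process.

# Minimal sketch of the search-and-screen step; assumes a modern
# tweepy (4.x) client, which did not exist when the study was run.
import tweepy

client = tweepy.Client(bearer_token="YOUR_BEARER_TOKEN")  # hypothetical credential

# Search the three related terms; lang:en keeps English tweets and
# -is:retweet drops retweets, mirroring the study's exclusions.
query = "(unfriend OR unfriending OR defriend) lang:en -is:retweet"
response = client.search_recent_tweets(query=query, max_results=100,
                                       tweet_fields=["author_id", "lang"])

candidates = []
for tweet in response.data or []:
    # Automated filters only narrow the pool; each candidate would
    # still receive a human suitability check before any @reply.
    candidates.append((tweet.id, tweet.author_id, tweet.text))

print(f"{len(candidates)} candidate tweets queued for manual screening")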

Tweets were also screened to determine whether a person was talking about unfriending a person, not a TV show, a U.S. state, a politician, a celebrity, etc.

1. http://blog.oup.com/2009/11/unfriend/


Tweets would often indicate that the Twitter user was not discussing unfriending a person, stating, for example, "I am going to unfriend American Idol," or "I am going to unfriend the state of Arizona." Twitter users who made jokes about unfriending were less likely to be recruited, depending on the context of the tweet. Generally speaking, recruitment erred toward inclusion, but it appeared unhelpful to recruit those who were not talking about a specific person. During the survey a tweet that was retweeted hundreds of times stated, "Asking me to friend your dog is the same as asking me to unfriend you." These retweets were not included in the sample because they did not appear personal enough: they represented a general sentiment about unfriending and the nature of friend requests rather than the unfriending of a specific person.
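The person-versus-entity screen was a human judgment call, but a crude first pass can be sketched. The block below is only an illustrative approximation (the blacklist entries are invented for the example); it flags tweets whose unfriending target looks like a show, place or brand so a reviewer can discard them quickly.

# Illustrative heuristic for the person-vs-entity screen; the study
# itself relied on human judgment, not an automated blacklist.
NON_PERSON_TARGETS = {  # hypothetical examples of non-person targets
    "american idol", "arizona", "facebook", "twitter",
}

def likely_about_a_person(tweet_text: str) -> bool:
    """Return False when the unfriending target looks like a TV show,
    place, brand, etc., rather than a specific person."""
    lowered = tweet_text.lower()
    return not any(target in lowered for target in NON_PERSON_TARGETS)

print(likely_about_a_person("I am going to unfriend American Idol"))  # False
print(likely_about_a_person("Just unfriended my old roommate"))       # True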

A tweet was sent to the person's Twitter account using the @reply mechanism after the tweet was screened for inclusion. An @reply allows a message to be sent from one user to a specific user (or set of users) through Twitter, similar to the way e-mail sends messages to recipients. Twitter users can see who specifically sent them a message and the contents of the message through the Twitter interface and a variety of other services that work with Twitter data (e.g. TweetDeck). See Section 3.6 for details of the message sent to the user. Twitter has another method to send a person a tweet, through the Message interface (previously called Direct Message); that message is private and can only be seen by the recipient. Messages can only be sent to people who are followers of the sender; i.e. A can send B a message only if B follows A. Survey respondents did not have a previous relationship with the researcher and were not followers of the research Twitter account, so the Message facility could not be used. All recruitment tweets were sent through the @reply mechanism.
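In API terms, an @reply is simply a public tweet that begins with the recipient's handle and references the tweet being replied to. A hedged sketch, again using the modern tweepy client rather than anything available in 2010 (the OAuth credentials are placeholders):

# Sketch of sending an @reply; the user-context keys are placeholders.
import tweepy

client = tweepy.Client(
    consumer_key="...", consumer_secret="...",
    access_token="...", access_token_secret="...",
)

def send_reply(screen_name: str, tweet_id: int, message: str) -> None:
    # The leading @mention plus in_reply_to_tweet_id makes this a
    # public @reply, visible to the recipient like the study's
    # recruitment tweets (not a private Direct Message).
    client.create_tweet(text=f"@{screen_name} {message}",
                        in_reply_to_tweet_id=tweet_id)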

The research did not attempt to identify Twitter accounts with large numbers of followers and ask those users to tweet about the survey or retweet the survey link sent to them. There was no attempt to screen accounts based on the number of followers, the number of accounts the person was following, tweets sent, or measures of social influence (e.g. Klout.com). Some people who received the recruitment tweet retweeted it of their own volition. Occasionally a person would ask whether it was acceptable to retweet the recruitment tweet or write their own tweet about the survey; in those cases the person was told that they could retweet the link if they felt it was appropriate. This was relatively rare.

Fewer than 1% of those contacted retweeted the survey recruitment sent to them: 63 people out of 7,327 recruitment tweets. Another thirty-nine people sent their own tweet (without using the retweet mechanism provided by Twitter) telling their followers that a survey was being conducted about the topic and posting the link to the survey.

It is difficult to know the exact effect that others had on recruiting people to the survey. One influential person (Klout.com score of 67; Klout scores range from 1 to 100) with approximately 9,000 Twitter followers, whose focus is social media, retweeted the survey link, and the survey site was monitored. Approximately 35 people started the survey based on the retweet and 19 completed it (about 54.3%), close to the completion rate achieved through the @reply mechanism (54.2%). Overall, however, only a very small share of this person's Twitter followers came to the survey: approximately 0.4% by the best approximation. The survey site was monitored for two days to see whether there was an uptick in the number of people coming to the survey and any effect on the completion rate, but the peak of incoming surveys occurred within five hours of the influential person's tweet. In other words, there was a short-term increase in survey users who completed the survey (19), but it appears unlikely that a single person's tweet will produce a long-term increase in survey recruitment.

3.2. Selecting Twitter for Recruitment

Twitter was used to recruit survey participants for several reasons: Twitter has a large user population in which the majority of users have publicly accessible messages; Twitter users were a good fit with the research topic (social network sites); it is a simple process to contact a person on Twitter through the @reply mechanism; and tweets can be screened for recruitment purposes. Of those who came to the survey from the recruitment tweet, only 1.3% said that they did not have a Facebook account, so the community was actively engaged in the social network site under investigation. Twitter acts as a realtime tool where conversations between users can occur fluidly. The recruitment tweet was always sent using the @reply mechanism that Twitter supports, so the user could see that the recruitment tweet was made in response to a specific message and could identify that message if needed. When users sent a question via Twitter about the recruitment, the researcher could promptly reply. This allowed users to perceive that the recruitment was not mass spam but that the researchers had targeted individuals based on their specific tweets.



The researcher screened tweets prior to the start of the survey to determine whether Twitter would be suitable for recruitment, and the Twitter community discussed the topic of interest, unfriending, often enough to provide a viable sample. After screening, an average of 48 recruitment tweets were sent per day to Twitter users; of these, on average, 19 recipients started the survey and 10 completed it. A minority of the tweets sent per day were non-recruitment in nature, answering questions about the survey or the research; on average, 18 such tweets were sent per day. The recruitment tweet was sent as a single tweet of 140 characters and provided enough information for the Twitter user to take the survey. It was not considered unduly burdensome by the researcher or the Institutional Review Board (IRB) to send a single request to a Twitter user to take a survey. While the total number of recruitment tweets might be perceived as spam based on the frequency of similar messages to users, the account was never flagged by Twitter as a spam account. Replies were made to those who had questions about the survey and the recruitment. Many people over the course of the study asked whether there were real researchers behind the account, to ascertain the veracity of the survey; a quick reply would be made to indicate that the recruitment was sincere and the research genuine. All tweets were screened and replied to by a human, and no automated messages were sent for recruitment.

3.3. Twitter Profile

Twitter profiles have several components that allow Twitter users to identify the account. A Twitter profile can show the following, depending on whether the Twitter user filled in the information: the Twitter account name, the name of the account owner, a profile picture, geographic location, a brief description of the person or account, a URL link, an articulated list of followers, accounts the user is following, lists that the account appears on, and tweets sent. A snapshot of the profile on the last day of recruitment is shown in Figure 1. The account for this study was created specifically for recruiting to the survey, so there was not an extensive list of followers or accounts being followed. The recruitment account followed one user, to indicate that the account was genuine, and no accounts were following the recruitment account at the start of the survey. Other Twitter accounts were later added to the follow list so those users could message the recruitment account privately; recruits would ask for more direct and private access to the investigator, and it was granted at the recruit's request.

The account name, UnfriendStudy, was chosen to be intention revealing, so that users might perceive the purpose of the account from its name alone. The account listed the real name of the researcher, along with the university program in which he was enrolled and the university affiliation (Information Systems PhD student, University name) in the description field. The geographic location was also given to add authenticity (city, state). The URL link in the profile was a link to the survey.

The picture associated with the Twitter account was selected based on the input of three judges. The picture is one of the few pieces of the profile that appears next to the tweet message (unlike the list of followers, location, etc.). Three candid pictures were shown to the judges, who were asked to pick the picture from which they would be most likely to take an academic survey: one that was friendly, academic and appeared sincere. All three judges chose the same picture, and that was the picture used for the account.

3.4. Purposive Sampling & Timing

The sampling process was intended to locate people who had experience with the particular event under investigation. Miller and Vega [21] found that just over half (56%) of Facebook users in a probability sample of the general population had unfriended someone. This research was conducted with a purposive sampling recruitment technique to find people who had unfriended others on Facebook. Of the subjects who came to the survey, 89% said they could identify a specific person whom they had unfriended, and 64% said they could remember a specific person who had unfriended them. By searching and screening tweets on Twitter, the recruitment identified a much higher percentage (89% vs. 56%) of users who had unfriended someone and reduced the time spent recruiting people who could not provide meaningful input for the research. The recruitment was designed to find those who had unfriended or been unfriended, not general Facebook users, so that the researchers would not waste survey respondents' time and would have responses that were meaningful to the research.

It is also helpful to recruit people to the survey who have had a recent experience with the matter, for two reasons: (1) those who experienced an event more recently may be able to provide more accurate answers, and (2) those who recently experienced an event may be more willing to take a survey about the topic because they may still be thinking about it. Experiences need to be reported soon after they have happened in order to be remembered [7].


Figure 1. Twitter Account Profile

This survey was self-administered, which may present barriers to motivation and completion [7]. Without proper motivation, survey respondents may ignore instructions, read questions carelessly, provide incomplete answers or simply abandon the survey. When the survey was pretested under observation, some recruits had a difficult time remembering a specific person whom they had unfriended because the event was not recent, even though they could remember whether they had unfriended someone. There were also survey respondents in the pretest who were uninterested in the topic because they did not hold any particular feelings about it. The recruits who came through Twitter were much more engaged in the topic of the survey; they asked questions about the survey and the findings, and many offered their "thanks" and "good luck" to the researcher. The survey was 15 pages long and took, on average, approximately 15 minutes to complete; finding people who are interested in the topic helps increase completion rates for longer surveys.

3.5. Frequency of recruitment

Tweets were screened for recruitment purposes typically every two to three hours between 7:00 AM and 11:00 PM. Human involvement was chosen because the recruitment screening process was complex and there were particular timing issues involved. The process itself was relatively straightforward: search for key words; screen for language and appropriateness; send an @reply to the tweet; then respond to any @replies sent to the recruitment Twitter account. Since the screening was done manually, the researcher monitored the Twitter account so that incoming messages could be answered promptly. This allowed more interactivity with the survey recruits and made it possible to answer any questions a survey respondent might have about the survey.

Answering the @replies sent to the research Twitter account after recruitment tweets went out meant that individual tweets appeared at the top of the Twitter account. Having unique tweets at the top of the account meant that any survey recruit who came to the research profile to see who was associated with the account would see that individualized messages were sent by a person and not by an automated system.
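The manual cadence (a screening pass every two to three hours between 7:00 AM and 11:00 PM) maps naturally onto a simple polling loop. The sketch below only illustrates the schedule; screen_and_reply() is a hypothetical stand-in for the human-driven search, screen and @reply pass described above.

# Sketch of the screening cadence; screen_and_reply() is a stand-in
# for the human search/screen/@reply pass, not an automated recruiter.
import time
from datetime import datetime

def screen_and_reply() -> None:
    ...  # search key words, screen candidates, send @replies, answer questions

while True:
    if 7 <= datetime.now().hour < 23:   # active window: 7:00 AM to 11:00 PM
        screen_and_reply()
    time.sleep(2.5 * 60 * 60)           # pause roughly two to three hours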

3.6. Recruitment

Recruitment Tweet: Tweets were screened for selection to the survey, and then an @reply was sent in response to the tweet. Below is a sample tweet, picked at random, that exemplifies a typical tweet seen by the researcher:

Screened tweet from Twitter user X sent to Twitter user Y:

@Y You can always defriend on Facebook, no? You should always have the option of correcting your mistakes. :P

Recruitment tweet from the research Twitter account UnfriendStudy to Twitter user X:

@X I saw your tweet about unfriending. I have a survey: http://www.surveymonkey.com/s/xxxxxxxx Your input very important. PhD stdnt

The recruitment tweet was designed to follow the methodology of Dillman et al. [7] as much as possible within the constraints of Twitter. Dillman et al. [7] state that e-mails for survey recruitment should include the following: a university sponsorship logo, header, etc.; an informative subject heading; the current date; an appeal for help; a statement of why the survey respondent was selected; the usefulness of the survey; directions on how to access the survey; a clickable link; an individualized ID (for tracking); a statement of confidentiality and voluntary input; contact information for the survey creator; an expression of thanks; and an indication of the survey respondent's importance.

Figure 2. Anatomy of a Recruitment Tweet

The recruitment tweet is limited to 140 characters, so it could include only a portion of the above information. Dillman et al. instruct survey researchers to keep e-mail recruitment contacts short and to the point to increase the likelihood that the request is completely read; this was easy to do on Twitter, given its limitations. See Figure 2 for details.
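Because every component in Figure 2 had to fit within 140 characters, composing the message is essentially a constrained templating problem. The sketch below assembles a recruitment tweet from the components described above and enforces the length limit; the helper name and example URL are illustrative, not the study's actual tooling.

# Sketch of assembling the 140-character recruitment tweet from the
# components in Figure 2; names and the URL are illustrative.
TWEET_LIMIT = 140

def build_recruitment_tweet(screen_name: str, survey_url: str) -> str:
    parts = [
        f"@{screen_name}",                      # personalization: who was selected
        "I saw your tweet about unfriending.",  # why the respondent was selected
        f"I have a survey: {survey_url}",       # clickable link to the survey
        "Your input very important.",           # respondent's importance
        "PhD stdnt",                            # identity of the researcher
    ]
    tweet = " ".join(parts)
    if len(tweet) > TWEET_LIMIT:
        raise ValueError(f"{len(tweet)} chars exceeds the {TWEET_LIMIT} limit")
    return tweet

print(build_recruitment_tweet("X", "http://www.surveymonkey.com/s/xxxxxxxx"))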

If the user clicked on the link, the user was taken to the survey site, which included a cover letter describing the survey in detail. The survey asked two additional screening questions to determine eligibility: it was only open to Facebook users who were 18 years old or over. No monetary incentives were provided for completing the survey. Those who asked for the results of the research were told that the Twitter account would post a link to the results if they were accepted for publication. Twitter users could click on the profile information of the recruitment tweet and view the Twitter profile page to determine the veracity of the account and survey (see Section 3.3). Some users asked how they could determine whether the URL link was safe, and how they could verify that independently. The research account did not have existing relationships with those who were recruited to the survey, so independent verification can be helpful. The Twitter user would receive a reply informing them that they could perform an Internet search on "survey" and that SurveyMonkey.com is one of the top links shown. Twitter users were also told in the reply that the research was, in fact, genuine.

Below is a sample interaction generated during the survey process in which a recruitment tweet was sent and a rebuttal reply was received. Ultimately, this conversation successfully recruited the person to the survey (although this is unverified because the survey was anonymous).

@j I saw your tweet about unfriending. I have a survey: http://www.surveymonkey.com/s/unfriend-t Your input very important. PhD student

@UnfriendStudy I get paid to do surveys! :P

@j Yeah, I have no resources to pay you to take a survey. I do value your opinion.

@UnfriendStudy U are a good sales man! U just won my vote, mate! Brilliant!

@j What can I say, I have constraints, many people take the survey, I'm relying on people's goodwill toward researchers.

@UnfriendStudy just completed it. It was very insightful!

@j Thanks for taking the survey. I truly appreciate your help. I rely on people like you to take it to get insight about the issue.

Not all interactions with Twitter users went well; approximately nine people who were sent recruitment tweets perceived the recruitment as spam and inappropriate, and sent a reply stating their opinion. Individual replies were made to each objection in defense of the technique; however, the researcher was unable to convince these nine people that the recruitment technique was appropriate. The people who felt inconvenienced by the recruitment tweet were a small minority: nine out of 7,327 recruitment tweets sent (less than 0.15% of those recruited) voiced objections about the survey recruitment. Each received an apology tweet of the form, "I am sorry if I inconvenienced you in any way. Please accept my sincerest apologies." The Twitter users may not have truly accepted the researcher's apologies and may have doubted the researcher's sincerity, but they did not object to the apology tweet.

Results: Four different recruitment techniques were used to recruit subjects for the study: Facebook recruitment during the survey pre-test, Twitter recruitment for the main research, a retweet by an influential Twitter user, and Internet users who found the survey through no direct recruitment intervention. A summary of the results can be found in Table 1. The study used a convenience sample for the pre-test by posting to five different Facebook profiles and asking friends to participate in the research. The estimated reach of the Facebook recruitment is 1,305 friends, based on the number of friends the profiles had at the time the recruitment posts were made; it is not known how many Facebook friends actually saw the posts.


The Twitter recruitment was the main method of recruitment to the survey and was conducted between April 17 and September 15, 2010, a total of 151 days. During that period 9,901 tweets were sent from the Twitter account, of which 7,327 were related to recruitment (73.1%). The remaining 2,574 tweets were focused on answering questions about the survey, responding to tweets of thanks, addressing questions about the nature of the research (was it real), etc. Many Twitter users asked whether the research was real and did not want to take a survey they believed would not generate real results; a response would be made stating that a real PhD student in information systems was conducting the research. The retweet was sent independently by an influential person with approximately 9,000 followers, and a small portion of those followers took the survey; those who did come, however, had about the same breakoff rate as those who were individually recruited by the researcher.

Self-selected survey takers were users who took the survey on their own, with no direct contact between the researchers and the participants. Twitter recruitment ended on September 15, 2010; a press release announcing the results was posted by the university on October 5, 2010, and the study subsequently garnered media attention. Web searches on "unfriending" or the researcher's name could lead a person to the survey. It is not entirely clear how self-selected responders came to the survey, but 330 people took the survey after September 15th. These self-selected survey takers had a very high breakoff rate: 93% did not complete the survey.
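The completion and breakoff figures in Table 1 are simple ratios of the reported start and completion counts. A small sketch recomputing the per-method rates (the counts are taken from the text above; the printed values match Table 1 up to rounding):

# Recompute the completion-by-starters and breakoff rates of Table 1
# from the start/completion counts reported in the text.
counts = {  # method: (surveys started, surveys completed)
    "Twitter": (2865, 1544),
    "Facebook pre-test": (135, 91),
    "Self-selected": (330, 23),
    "RT by influential person": (35, 19),
}

for method, (started, completed) in counts.items():
    completion = completed / started   # share of starters who finished
    breakoff = 1 - completion          # share of starters who abandoned
    print(f"{method}: completion {completion:.1%}, breakoff {breakoff:.1%}")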

The research compared the known distribution of the Facebook population to the population in the completed surveys so that adjustments could be made in the subsequent analysis [23, 12] (Table 2). The survey distribution skewed to a younger population than the general Facebook population and had more female respondents than the known distribution. The analysis used MANCOVA with age and gender as covariates to remove the effects of these variables, so that age and gender do not affect the dependent variable under analysis. If other statistical techniques were required (e.g. multiple regression, structural equation modeling, etc.), covariates can likewise be used to make adjustments when the known distribution differs from the sample population. The U.S. vs. non-U.S. population results are significantly different from the known population of Facebook; the non-U.S. respondents were recruited from tweets written in English, so the distribution will differ from the Facebook population in general.
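For researchers reproducing this kind of covariate adjustment, the MANCOVA step can be expressed with statsmodels. The sketch below is a generic illustration rather than the study's actual model; the data file and column names (dv1, dv2, factor, age, gender) are hypothetical.

# Hedged sketch of a MANCOVA: age and gender enter the model as
# covariates alongside the factor of interest; columns are hypothetical.
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.read_csv("survey_responses.csv")  # hypothetical cleaned survey data

mv = MANOVA.from_formula("dv1 + dv2 ~ C(factor) + age + C(gender)", data=df)
print(mv.mv_test())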

Table 1. COMPARISON OF FOUR RECRUITMENT METHODS

Metric                               Twitter   Facebook pre-test   Self-selected   RT by influential person
Recruitment                          7,327     1,305 (reach)       330             9,000 (reach)
Surveys started                      2,865     135                 330             35
Surveys completed                    1,544     91                  23              19
Completion rate                      21.3%     7.0%                7.0%            0.2%
Start rate                           39.6%     10.3%               100.0%          0.4%
Completion by those who started      53.9%     67.4%               7.0%            54.3%
Breakoff rate                        46.1%     32.6%               93.0%           45.7%
# days                               151       85                  193             1
Recruitment tweets sent per day      47.9      N/A                 N/A             N/A
Non-recruitment tweets sent per day  17.6      N/A                 N/A             N/A
Surveys started per day              19.0      1.6                 1.7             35
Surveys completed per day            10.2      1.1                 0.1             19

Table 2. COMPARISON OF KNOWN DISTRIBUTIONS TO SAMPLE

Collected by the study vs. external comparison group:

Category              N        Valid %    US Facebook Users %
Age
  13-17               --       --         10
  18-25               343      22.2       29
  26-34               663      42.9       23
  35-44               397      25.7       18
  45-54               113      7.3        13
  >55                 28       1.8        7
Gender
  M                   493      31.9       44.5
  F                   1,051    68.1       55.5

Category              N        Valid %    U.S. vs. Non-U.S. Facebook Users %
Live in the U.S.A.
  Yes                 1,075    69.6       30
  No                  469      30.4       70

• Age data from Facebook.com, July 2010. Data presented by Inside Facebook: http://www.insidefacebook.com/2010/07/06/facebooks-june-2010-us-traffic-by-age-and-sex-users-aged-18/-44-take-a-break-2/

• Gender data from Facebook.com, January 2010, of Facebook users 18+. Data presented by Inside Facebook: http://www.insidefacebook.com/2010/01/04/december-data-on-facebook%E2%80%99s-us-growth-by-age-and-gender-beyond-100-million/

• Country data from Facebook: http://www.facebook.com/press/info.php?statistics


4. Limitations

Recruiting survey respondents through Twitter was highly effective for the research conducted but has several limitations. The survey was conducted with purposive sampling, which is a non-probability sampling technique. The purpose of the recruitment was to find those who had unfriended someone, or been unfriended by someone, on the social network site Facebook. There was little subjectivity in the sample: the researchers did not personally know the Twitter users, and users were recruited based on objective screening criteria, not other profile measures. Probability-based sampling techniques mean that every member of the population has a nonzero chance of being recruited to the sample; that is not the case with purposive sampling [5].

There can be coverage errors, the non-observational gap between the target population and the sampling frame, when recruiting from Twitter for populations outside of Twitter [12, 7]. The sample found that 97.1% of the people recruited from Twitter had a Facebook account, but it is unknown how many Facebook users do not have a Twitter account, so estimates of coverage are difficult to ascertain. The survey also found that 88.7% of people recruited had unfriended someone on Facebook, whereas probability sampling by the Pew Internet Research group found that 56% of social network users had unfriended someone [20]. The survey recruitment was based on a tweet about unfriending, so it is expected that the two statistics (88.7% vs. 56%) differ. The analysis did not attempt to generalize the survey results about the percentage of Facebook users who unfriend to the general population; the analysis was conducted only on survey respondents who did unfriend, so it did not rely on this general-population statistic. When the sampling frame misses the target population partially or entirely, the researcher can (1) redefine the target population to fit the frame better, or (2) admit the possibility of coverage error in statistics describing the original population [12]. In the survey conducted, the target population is Twitter users, and the sampling frame is Twitter users who tweeted about unfriend, unfriending, or defriend and met the screening criteria (language, etc.). There will be coverage issues in the sample; however, due to the exploratory nature of the survey and a careful description of the population, this type of compromise may be justified.

5. Conclusions

Recruitment through social network sites can be helpful in understanding emerging technology and social norms. Non-probability samples may be very helpful for understanding, in an exploratory manner, a phenomenon that is happening at the moment [5].

Researchers can follow up their initial study with better (and, likely, more expensive and time-consuming) probability samples. Researchers in emerging fields often attempt to understand phenomena as they are introduced. Studies often use undergraduate students and ad-hoc methods for survey recruitment. Two prominent and influential survey studies about Facebook largely used university undergraduates to conduct their research [9, 1]. Ellison et al. [9] clearly noted in their limitations that the sample population in their study about social capital and Facebook is not representative of the entire community. Acquisti and Gross [1] largely used undergraduate students, reached through a list of students interested in participating in experimental studies and fliers posted on campus, for their study on imagined communities and social networking sites. Dwyer et al. [8] used ad-hoc methods based on Facebook posts, public forum posts, etc. for their study on trust and privacy on two social networking sites. There is no publicly available list of Facebook users (no public national registry), so compromises are often made to conduct research with known limitations.

Using Twitter for recruitment allowed the research to be conducted with a wide range of ages and diverse geography, at low cost and in a timely manner. Facebook releases some information about the demographics of its user population, so a comparison can be made between the survey population (through self-report measures) and the known user population. Certain statistical techniques may then be used to adjust the results with covariates, or quota sampling may be used to ensure representativeness. Survey respondents can come from a wider range of geography when Twitter is used. Survey respondents were allowed to take the survey anonymously and may feel compelled to provide more truthful answers [16, 10]. Those who are surveyed soon after an event occurred may be able to provide more accurate answers than others because they have a better recollection of the event [7]. Many people on Twitter post about their day-to-day experiences, and, on average, 48 Twitter users fit the recruitment criteria per day.

The case study provides methods to reduce non-response by decreasing failure to deliver (timely responses to tweets), increasing willingness to participate based on social exchange theory, and screening for appropriateness based on language and other features. This research adapts existing methods of survey recruitment, such as mail and e-mail, to new media like Twitter. Prior research has shown few differences between methods like in-person survey recruitment and telephone methods; recruitment through Twitter may yield similar results.


One advantage of Twitter recruitment over more ad-hoc convenience methods is that the method may be duplicated by others and longitudinal data may be obtained. Using individual Facebook accounts to publish surveys may make the work difficult to duplicate, because the sample population is likely to be different unless access to the same accounts is granted.

There are areas of caution for implementing Twitter recruitment. The sample population and the target population should be reasonably close for the survey results to be believable. If too many surveys are conducted on Twitter, the population is likely to perceive that they are over-surveyed and may not participate in future research [7]. The research topic is likely to increase or decrease interest based on its salience. As always, survey design and question design are important factors in non-response bias and must be carefully considered for the population [7, 12].

Recruiting survey participants on Twitter is another method of gaining insight into the behavior of users. There are many modes in which to conduct survey research (phone, e-mail, mail, etc.), and recruitment through Twitter may be a viable alternative to these more established approaches. There are advantages to recruiting on Twitter, such as finding participants who have experienced the particular event under investigation. Over time there will be better ways to determine how well the sample fits the population under study, just as there has been a transformation from in-person interviews to mailed surveys, to telephone surveys, to the e-mail-based surveys of today.

References

[1] Acquisti, A. and Gross, R. (2006). Imagined communities: Awareness, information sharing, and privacy on the Facebook. In Lecture Notes in Computer Science, number 4258, pages 36–58. Springer-Verlag.

[2] Andrews, D., Nonnecke, B., and Preece, J. (2003). Electronic survey methodology: A case study in reaching hard-to-involve Internet users. International Journal of Human-Computer Interaction, 16(2):185–210.

[3] Blau, P. M. (1964). Exchange and Power in Social Life. John Wiley & Sons, New York.

[4] boyd, D. M. and Ellison, N. B. (2007). Social network sites: Definition, history and scholarship. Journal of Computer-Mediated Communication, 13(1):1.

[5] Cooper, D. R. and Schindler, P. S. (2008). Business Research Methods. McGraw-Hill/Irwin, 10th edition.

[6] Couper, M. P. (2005). Technology trends in survey data collection. Social Science Computer Review, 23(4):486–501.

[7] Dillman, D. A., Smyth, J. D., and Christian, L. M. (2008). Internet, Mail, and Mixed-Mode Surveys: The Tailored Design Method. Wiley, 3rd edition.

[8] Dwyer, C., Hiltz, S. R., and Passerini, K. (2007). Trust and privacy concern within social networking sites: A comparison of Facebook and MySpace. In Proceedings of the Thirteenth Americas Conference on Information Systems.

[9] Ellison, N. B., Steinfield, C., and Lampe, C. (2007). The benefits of Facebook "friends:" social capital and college students' use of online social network sites. Journal of Computer-Mediated Communication, 12:1143–1168.

[10] Evans, D. C., Garcia, D. J., Garcia, D. M., and Baron, R. S. (2003). In the privacy of their own homes: Using the internet to assess racial bias. Personality and Social Psychology Bulletin, 29(2):273–284.

[11] Grossman, L. (2009). Iran's protests: Why Twitter is the medium of the movement. Time, 1:1–4.

[12] Groves, R. M., Fowler, F. J., Jr., Couper, M. P., Lepkowski, J. M., Singer, E., and Tourangeau, R. (2004). Survey Methodology. Wiley Series in Survey Methodology. Wiley-Interscience, Hoboken, NJ.

[13] Honeycutt, C. and Herring, S. (2008). Beyond microblogging: Conversation and collaboration via Twitter. Hawaii International Conference on System Sciences, 42:1–10.

[14] Huberman, B. A., Romero, D. M., and Wu, F. (2009). Social networks that matter: Twitter under the microscope. First Monday, 14(1):1–9.

[15] Java, A., Finin, T., Song, X., and Tseng, B. (2007). Why we twitter: Understanding microblogging usage and communities. In 9th WebKDD and 1st SNA-KDD 2007 Workshop on Web Mining and Social Network Analysis.

[16] Kiesler, S. and Sproull, L. S. (1986). Response effects in the electronic survey. Public Opinion Quarterly, 50(3):402–413.

[17] Krishnamurthy, B., Gill, P., and Arlitt, M. (2008). A few chirps about Twitter. Workshop on Online Social Networks, 1:19–24.

[18] Krishnamurthy, B. and Wills, C. E. (2008). Characterizing privacy in online social networks. Workshop on Online Social Networks, 1:37–42.

[19] Kwak, H., Lee, C., Park, H., and Moon, S. (2010). What is Twitter, a social network or a news media? In 19th International Conference on World Wide Web, pages 591–600. ACM.

[20] Madden, M. and Smith, A. (2010). Reputation management and social media: How people monitor their identity and search for others online. Technical report, Pew Research Center, Washington, D.C.

[21] Miller, C. C. and Vega, T. (2010). After building an audience, Twitter turns to ads. The New York Times, online.

[22] Naaman, M., Boase, J., and Lai, C.-H. (2010). Is it really about me? Message content in social awareness streams. CSCW 2010, pages 189–192.

[23] Smith, T. M. F. (1983). On the validity of inferences from non-random samples. Journal of the Royal Statistical Society, 146(4):394–403.
