Social Media Neighborhood Guides A PhD Thesis Proposal by Dan Tasse Committee: Jason I. Hong (Chair) Jodi Forlizzi Niki Kittur Judd Antin

Social Media Neighborhood Guides - Dan Tasse · 2020. 5. 1. · Finding 1: People use heuristics when searching ... Are the neighborhood comparisons “right”? ... people information


Abstract

Modern tourists visiting new cities are not content to simply stay in a hotel downtown and see famous sights. They want to get out into the neighborhoods of the city that they are visiting and understand more of the city’s culture and everyday life. However, current guides remain focused on statistics and points, so tourists are unable to understand and find neighborhoods they would enjoy. I propose to build neighborhood guides based on social media posts to help people understand neighborhoods. These guides will have two parts: first, they will allow comparison between neighborhoods in a new city and neighborhoods that travelers already know; second, they will add context so travelers can understand why the neighborhoods are similar. These will enable people to understand how different neighborhoods feel, and contribute to our understanding of the city as a whole. Their effectiveness will be evaluated through quantitative studies of the comparisons and qualitative studies of the site as a whole.

This thesis will provide three research contributions. First, it will provide evidence that social media can help us understand cities better than simple demographics. Second, it will show how well social media reflects neighborhoods, and what aspects are best represented. Finally, it will contribute to our knowledge of tourist information search by the development of a five-dimensional model.


Table of Contents

Abstract
Chapter 1: Introduction
Chapter 2: Background/Related Work
    Changes in Urban Tourism
    Computational Tourist Experience Recommendations
    Summarizing Geographical Social Media Posts
Chapter 3: Completed Work
    Analyzing Tweets To Find Where Tweeters Live
        Data Collection
        Methods for Finding Home
        Results
    Using Tweets to Characterize Locations
        Creating the Twitter Neighborhood TF-IDF Map
        Results
    Understanding Travelers' Needs
        Method
        Finding 1: People use heuristics when searching, if possible
        Finding 2: If no heuristics are available, people attempt to satisfy five dimensions
        Finding 3: Current search tools do not adequately investigate those five dimensions
Chapter 4: Proposed Work
    Neighborhood Comparison Algorithm
        Safety and Room for Everyone: US Census Demographics and Crime Statistics
        Aesthetics: Flickr autotags
        Serendipity: Walkscore and Transitscore
        Ideal Everyday: Third Places from Yelp
        Authenticity: Tweets
    Context for Neighborhood Comparisons
    Evaluation
        Are the neighborhood comparisons “right”?
        Is this guide useful? Does it reflect the city accurately?
    Contributions
Chapter 5: Schedule
Acknowledgements
References


Chapter 1: Introduction

Many of the readers of this thesis proposal will soon be heading to Cologne, Germany for the ICWSM 2016 conference. Let’s say you are one of them. You may have some extra time before or after the conference to relax a bit and enjoy Cologne. You will probably stop in the famous Cologne Cathedral, see the Hohenzollern Bridge, and maybe even visit the Museum of Chocolate if you have a sweet tooth. But what about the more everyday Cologne, the “non-touristy” side, the part that is great for the people who live there? Surely Germany’s fourth largest city has more to offer than a grand cathedral and a few other tourist spots. You may be interested in staying in an AirBnB too, to save some money and meet some locals… but where? What neighborhood is close enough to the conference but also intriguing and friendly enough to stay in?

This is a specific case of a general problem: we need new ways to understand cities and neighborhoods. As more people move to cities throughout the 21st century, quickly understanding how places feel will become more and more important. People moving will need to know what neighborhood they would feel at home in, business owners will need to know where to expand and market, and city planners will need to know how to allocate services and zone districts.

Travelers have a unique set of information needs, though, because they are new to a place and do not have time to build up local knowledge from experience. More than ever before, too, they want this local knowledge; they want to experience “everyday life” in a city and “do what the locals do.” Unlike the sun-and-sand tourists of two generations ago or the cultural-site-visiting tourists of last generation, today’s tourists want to curate and create their own experience. And more than ever, platforms like AirBnB and Couchsurfing help them do so by staying in local neighborhoods instead of central tourist districts.

Tools that are available to address these information needs all fall short. Traditional guidebooks from Fodor’s, Frommer’s, and Lonely Planet give people information about those central tourist districts and sights to see. Yelp and Foursquare give people information about the businesses, the bars and restaurants and locksmiths, in an area, but travelers can’t understand how the whole neighborhood feels just from that. Cities gather statistics – and indeed, are releasing open data more than ever before – but numbers also fail to convey a neighborhood’s culture. Finally, occasionally travelers can learn local vernacular descriptions, but these are often shallow. For example, “Lawrenceville is the cool neighborhood” or “South Side is the party neighborhood.”

However, thanks to public geotagged social media posts, we have enough localized information to help inform these travelers. Travelers want to stay in places that satisfy certain cultural and aesthetic criteria that are better reflected in Tweets and photos than in statistics and lists of tourist sites.


I propose to build a web-based neighborhood guide for travelers, based on social media posts, that will help travelers find neighborhoods they will enjoy staying in and spending time in. I will do this by comparing neighborhoods in the new city to neighborhoods in a city they already know, using people’s existing understanding of neighborhoods to scaffold their understanding of a new city. The algorithm for comparing neighborhoods will be based on five dimensions derived from existing research and from formative interviews. The neighborhood guide will also, importantly, contain ways to understand why two neighborhoods are similar: extra context in the form of photos, text excerpts, or relevant statistics.

Using this guide, tourists will be able to find places to stay and places to spend time more easily. This will make it easier and more fun to travel to big cities, but also more fun to travel to mid-sized or small cities that currently do not get as much tourist attention. With international travel destinations like Paris and Venice losing character due to a deluge of tourists and smaller but worthwhile cities like Cleveland and Atlanta needing ways to attract investment, this could benefit the entire tourism industry. Furthermore, this guide could be useful beyond travelers, for example to planners, especially for neighborhoods that are growing or developing and want to be like other more popular neighborhoods.

Point-based guides and statistics can only go so far to help us understand crucial aspects of a city’s culture, and travelers nowadays want to know that culture more and more. This tool will help people deepen their understanding of cities, and help researchers learn about cities as well.


Figure 1: A conceptual overview of the work in this thesis proposal


Chapter 2: Background/Related Work

Related work falls into three main categories. Recent work in urban tourism provides motivation for new tools to navigate neighborhoods. At the same time, work in computational recommendation of tourist experiences suggests one possible avenue to build tools for travelers. However, I will explain why that trajectory is unsatisfactory, and describe work in summarizing geotagged social media that offers more promise in helping us build useful tools.

Changes in Urban Tourism

While urban tourism was not a focus of early tourism research, it has recently become a growing field [11]. Travel in previous decades had meant traveling to beaches, beautiful natural sites, or resort towns, but in recent years urban tourism is the fastest growing segment of the tourism market [4]. The character of urban tourism is changing as well as the volume: new urban tourists want to “experience and feel a part of everyday life.” [27] Furthermore, they seek to have an active hand in co-creating the experiences, rather than passively paying for and absorbing an experience [2]. Lists of sights to see and experiences to buy no longer suffice.

In addition, when modern tourists travel to a city, they are often looking for an authentic experience of that city, rather than a manufactured diversion. The search for authenticity in tourism has a long history dating back at least to the 1970s [19], but recent developments have aided this search in new ways, particularly with regard to lodging. Because hotels historically clustered in a few areas of cities, like downtown and near airports, they cannot show travelers all the sides of a city they may want to see, so travelers are turning to alternatives. The peer-to-peer lodging rental site AirBnB, for example, has become a popular, and more “authentic”, way for travelers to rent rooms in residential parts of town [43,49]. Similarly, Couchsurfing allows users to stay with locals for free (often on their spare couch, hence the name) [49]. As urban tourists change from “mass tourists” to “cultural” and “creative” tourists [41], “mass” lodging no longer suffices either.

New urban tourists want to stay in interesting residential neighborhoods and spend time “wandering about”, “taking in the city”, and “getting among the people” [2]. To do this, they need guides to areas, not specific venues. Urban tourism, unlike other forms of tourism like “sun and sand” tourism, depends on the serendipity and spontaneity that results from getting to know neighborhoods, and on the individual's ability to co-create their experience. Current tools help people discover points, not overall pictures of parts of the cities.

Computational Tourist Experience Recommendations


While travelers have been changing, plenty of work has gone into addressing exactly the problem of recommending things for tourists to do. Work in this vein includes recommendations of restaurants [14], shops [44], travel routes [20,32], attractions and points of interest [12], and destinations [13]. These all use social media and user-generated content such as user locations, so continuing in this vein seems like a logical choice. In addition, sites like Yelp and Foursquare have dozens of user reviews, so aggregating reviews and recommending the most highly-rated spots seems like a natural solution.

However, this approach has three shortcomings. First, people need to know why they are recommended each place. It would be rare for tourists to set out on a trip solely because an algorithm recommended it. Second, these systems solve problems that are already solved by Yelp and Foursquare: finding a restaurant or a point of interest by consulting one of these guides is easy. Finally, these works neglect the changes in urban tourism discussed above. A recommendation algorithm will likely push more people to the top destinations, which then become overcrowded and no longer as enjoyable. Instead, we need guides to let people explore places on their own time and create their own connections to them.

Summarizing Geographical Social Media Posts

While the recommendation of tourist places work has been going on, a separate set of researchers has been investigating public geotagged social media posts: public photographs and text posts with latitude and longitude tags attached. Photo-sharing sites, particularly Flickr, have been well studied, due to the volume and richness of their posts. Some of this research has been driven by practical concerns, like the need to show photos on a map. Toyama et al [46] developed techniques including thumbnails, point markers, and isopleths to show how many photos existed on a map at a place before settling on a binning approach they call “media dots.” However, these displays only show the number of photos, not their contents, so a series of other projects worked on summarizing photo content as well as density.

Some of this research works on finding a subset of photos that is representative of a larger set. Jaffe et al [15] addressed the problem of summarizing photo content by finding a subset of photos that would accurately summarize a larger photo set. They did this by clustering all of the photos and then ranking the clusters based on five criteria: tag distinguishability, photographer-distinguishability, density, image qualities, and arbitrary relevance factor (such as a search query). Kennedy et al [17] further developed the ability to find the “most representative” image from a set of photos using computer vision features such as SIFT. Crandall et al [6] did the same: finding the top N “interesting” places in each city and a “canonical” photo from each.

Besides investigating photo contents, researchers have investigated ways to summarize the textual tags that users add to their photos. Ahern et al [1] and Jaffe et al [15] describe the World Explorer/Tag Maps project, which summarized a series of photo tags into “representative tags” for a region. Kennedy and Rattenbury expanded this to describe semantics of places and events [17,38], while Kafsi et al further expanded it to understand which tags are locally relevant, which are city-level, and which are country-level [16].

Summarizing textual content, like tweets, is somewhat easier because there is less total information, so one can use a simple method like a word cloud (at least as a supplementary tool) to get a sense of a large corpus of words [29]. More intelligent methods have been used for tweets, for tasks like event detection [19]. Importantly for neighborhoods, though, Hao et al approach high-level neighborhood modeling in another interesting manner, creating Location-Topic Models based on what users write in travelogues [13].

These algorithms, therefore, are now part of our toolbox: we have ways to summarize photo content, photo tags, and plain-text microblog posts. However, higher-level abstractions can be useful too. Sometimes someone has a lot of data of one type and wants a simple summary of that data, but often more abstract representations are more useful because we can understand them better. The Location-Topic Model is one of these higher-level tools; two more that are worth discussing are neighborhood boundary finding and neighborhood comparison.

The flow of people throughout neighborhoods is often not reflected in the official neighborhood divisions, but recent work has been able to find boundaries based on human behavior such as Foursquare checkins [7,51] or tweets [47]. This can reveal aspects of neighborhood life that are otherwise hidden, such as a neighborhood that contains two mostly-separate social sub-neighborhoods. Finally, neighborhood comparison [22] offers a way for people to understand neighborhoods in a new city based on neighborhoods that they already know. This can help people talk about imprecise or unnamed characteristics of neighborhoods – they may not know what they like about their home neighborhood, but they know that they want to find some place like it. This will be a key part of the neighborhood guide I propose to develop in Chapter 4.

This chapter presented three types of work:

• Motivation for a new kind of travel guide, because traveler desires are changing and current guides are not serving their needs.

• One approach that, while useful and innovative, will not satisfy this new generation of travelers.

• Tools that summarize social media data, which provide both evidence that social media can be useful to describe places, and some tools that we can reuse for future work.


Chapter 3: Completed Work

To build better neighborhood guides based on social media, I began by investigating social media and what it can tell us about the people posting it. I looked at where Twitter users live and found that, for prolific geotagging users, we can locate the homes of about 80% of them to within 1 km, which means that we can use their tweets to tell us about their neighborhoods. I then built a preliminary neighborhood guide based on tweets, which revealed a few interesting findings about Pittsburgh’s neighborhoods. Finally, I ran qualitative interviews with 24 participants to understand what people are actually looking for in these guides. I will describe these three projects in this chapter.

Analyzing Tweets To Find Where Tweeters Live

Our first study involved an investigation into geotagged tweets and how well we could find the homes of the tweeters¹. This is a crucial first step in making use of social media data; without this context, it is not clear whether a geotagged tweet comes from someone who is very familiar with the place or someone who just visited once. Previous work has focused on localizing individual tweets [26,37] and finding the homes of social media users [25,36], but no work has specifically focused on finding tweeters’ homes based on their geotagged tweets. We did this by gathering tweets, asking users for their home locations, and then testing various algorithms to see how accurately each one found users’ home locations.

Data Collection

To build a ground truth dataset, we began by collecting 3.3 million geotagged tweets via Twitter’s public streaming API. This API allows a developer to listen for new tweets that match a geographic parameter in near real time, so we chose to stream all tweets geotagged within 0.2 degrees latitude and longitude from the center of Pittsburgh. The rectangle we selected had corners at (40.241667, -80.2) and (40.641667, -79.8), and we collected tweets from January 2014 to January 2015. Following other work [30], we can assume that if our sample is less than about 1% of all tweets, we collected the vast majority of geotagged tweets in the region.

Near the end of that year, we used our dataset of streamed geotagged tweets to compile a list of the 4119 most prolific tweeters for analysis, in order to ensure that our participants had enough geotagged tweets to analyze. We recruited these prolific tweeters to take a survey by tweeting a link to them. Our survey asked seven questions: their age, gender, home address, length of time they had lived there, work address, standard commute mode, and any other places they spend a lot of time.

¹ The work described in this section will be appearing in the proceedings of ICWSM 2016, titled “Our House, In The Middle Of Our Tweets.”


Respondents were paid with a $5 Amazon.com gift card via email. We received 195 responses. For each of our 195 users, we used the non-streaming Twitter API to gather that user’s previous 3,200 tweets (the maximum number allowed by Twitter). We added any geotagged tweets that occurred outside of Pittsburgh to our dataset. The data collection and survey process were approved by our university’s Institutional Review Board.

Our final dataset consisted of 146,852 geotagged tweets from 195 users, who had a median of 533 geotagged tweets (mean=753, min=15, 1st quartile=271, 3rd quartile=1050, max=3639). These represented a subset of all of their tweets; the median percent geotagged was 41.1% (mean=46.2%, min=2.3%, 1st quartile=25.1%, 3rd quartile=61.6%, max=100%).

One notable surprise in our dataset was that we had many young participants (mean=26.9, median=22). This may be because Twitter is most popular with younger users [8] or because younger users felt more comfortable revealing their personal information on our survey. Many of these young 18-22 year old participants were students who had multiple “homes”: they lived at their family home (often outside of Pittsburgh) during the summer and at their campus home (in Pittsburgh) during the school year. Because the school year lasts 8 months or more, we asked them on the survey for their campus home, but many of them still put their family home. As a result, we manually edited 19 students’ “home” addresses to be their campus addresses, based on inspection of their tweets showing places where they talked about being “home” near a university.

Methods for Finding Home

In this section, we present a systematic evaluation of several algorithms for finding users’ homes. In this paper, by “finding users’ homes”, we mean predicting a latitude-longitude point that is as close as possible to the geocoded address that they provided. We do not do reverse geocoding to find a street address.

Baseline (Mode of Geotagged Tweets)

As a trivial baseline, we binned tweets by rounding each tweet to the nearest 0.01 degree of latitude and longitude, then predicted that the bin with the most tweets (i.e. the mode) was the user’s home location.
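The baseline can be illustrated with a short sketch. It assumes each user's geotagged tweets are available as (latitude, longitude) pairs; the function name and input format are hypothetical and are not taken from the study's code.

```python
from collections import Counter

def mode_home(points, digits=2):
    """Return the most-tweeted bin as a (lat, lon) pair, or None if empty.

    points: iterable of (lat, lon) tuples (hypothetical input format).
    digits=2 rounds to the nearest 0.01 degree, as described in the text.
    """
    bins = Counter((round(lat, digits), round(lon, digits)) for lat, lon in points)
    if not bins:
        return None
    # most_common(1) returns a one-element list of (bin, count).
    best_bin, _count = bins.most_common(1)[0]
    return best_bin
```

For example, `mode_home([(40.4412, -79.9431), (40.4438, -79.9449), (40.4571, -79.9102)])` places the first two tweets in the same bin and predicts that bin.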

Last Destination, Weighted Median, Largest Cluster

Krumm [18] found people’s homes based on GPS traces of their cars. We re-implemented three of his methods:

• Last Destination, where we take the median of the latitude and longitude of all points that are the last coordinate pair of the day (where a day ends at 3:00 AM)

• Weighted Median, where each point is weighted by the time until the next point

• Largest Cluster, using the scikit-learn [35] implementation of agglomerative clustering on all tweet locations
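The first two methods can be sketched as follows. This is an illustration under stated assumptions, not the exact code used in the study: tweets are (local timestamp, latitude, longitude) tuples, medians are taken independently for latitude and longitude, and the final tweet is dropped from Weighted Median because it has no successor to define its weight.

```python
from datetime import timedelta
from statistics import median

def last_destination(tweets, day_end_hour=3):
    """Median of the last tweet location of each day (a day ends at 3:00 AM)."""
    last_of_day = {}
    for ts, lat, lon in sorted(tweets):
        # Shifting by the cutoff makes a 1:30 AM tweet count toward the previous day.
        day = (ts - timedelta(hours=day_end_hour)).date()
        last_of_day[day] = (lat, lon)  # later tweets overwrite earlier ones
    if not last_of_day:
        return None
    lats = [lat for lat, _ in last_of_day.values()]
    lons = [lon for _, lon in last_of_day.values()]
    return median(lats), median(lons)

def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    total = sum(weights)
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= total / 2:
            return value
    return None  # only reached for empty input

def weighted_median_home(tweets):
    """Weight each tweet by the seconds elapsed until the user's next tweet."""
    ordered = sorted(tweets)
    weights = [(nxt[0] - cur[0]).total_seconds()
               for cur, nxt in zip(ordered, ordered[1:])]
    lats = [lat for _, lat, _ in ordered[:-1]]
    lons = [lon for _, _, lon in ordered[:-1]]
    return weighted_median(lats, weights), weighted_median(lons, weights)
```

The intuition behind the weighting is that a long gap after a tweet suggests the user stayed at that location, for instance overnight at home.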


Grid Search

We binned tweets as in the Mode algorithm, but did so recursively, as in [5]. First we rounded tweets to the nearest whole number degree and discarded all tweets outside the most common bin. We repeated this rounding to the nearest 0.1 degree, the nearest 0.01 degree, the nearest 0.001 degree, and the nearest 0.0001 degree, predicting the final bin as the user’s home.
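A minimal sketch of this recursive binning, assuming the same (latitude, longitude) input as the Mode baseline; the function and parameter names are illustrative rather than those of the original implementation:

```python
from collections import Counter

def grid_search_home(points, digit_levels=(0, 1, 2, 3, 4)):
    """Recursively keep the most common bin at 1, 0.1, ..., 0.0001 degrees.

    points: iterable of (lat, lon) tuples (hypothetical input format).
    Returns the final bin as a (lat, lon) pair, or None for empty input.
    """
    remaining = list(points)
    best_bin = None
    for digits in digit_levels:
        if not remaining:
            return None
        bins = Counter((round(lat, digits), round(lon, digits))
                       for lat, lon in remaining)
        best_bin = bins.most_common(1)[0][0]
        # Discard every tweet outside the winning bin before refining further.
        remaining = [(lat, lon) for lat, lon in remaining
                     if (round(lat, digits), round(lon, digits)) == best_bin]
    return best_bin
```

Because each level discards tweets outside the winning bin, a trip to another city is eliminated at the coarsest level and cannot affect the finer levels.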

Multi-level DBSCAN

To cluster points in a more principled way, we used the DBSCAN algorithm [10], as implemented in the scikit-learn library [35], to cluster tweets into clusters of different sizes. We set the Eps parameter (maximum distance between two samples in the same neighborhood) to be 0.2 degrees (latitude/longitude) for “city-level” clusters, 0.005 degrees for “neighborhood-level” clusters, and 0.0005 degrees for “building-level” clusters². For each user, we chose the city-level cluster with the most tweets, then chose the neighborhood-level cluster with the most tweets, then the building-level cluster with the most tweets. We guessed that the centroid of the building-level cluster was the user’s home location.
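The three-level procedure can be sketched as below. The study used scikit-learn's DBSCAN; to keep this example self-contained, it swaps in a small pure-Python DBSCAN with Euclidean distance measured in degrees. The min_samples value and the fallback when no cluster is found are assumptions, since the text does not report them.

```python
from collections import Counter

def dbscan(points, eps, min_samples):
    """Minimal DBSCAN; returns one label per point, with -1 meaning noise."""
    def neighbors(i):
        lat, lon = points[i]
        return [j for j, (qlat, qlon) in enumerate(points)
                if (lat - qlat) ** 2 + (lon - qlon) ** 2 <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)  # includes the point itself
        if len(seeds) < min_samples:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = neighbors(j)
            if len(more) >= min_samples:  # only core points expand the cluster
                queue.extend(more)
    return labels

def largest_cluster(points, eps, min_samples):
    labels = dbscan(points, eps, min_samples)
    counts = Counter(label for label in labels if label != -1)
    if not counts:
        return points  # assumption: keep all points if nothing clusters
    top = counts.most_common(1)[0][0]
    return [p for p, label in zip(points, labels) if label == top]

def multilevel_dbscan_home(points, eps_levels=(0.2, 0.005, 0.0005), min_samples=2):
    """City, neighborhood, then building level; returns the final centroid."""
    current = list(points)
    if not current:
        return None
    for eps in eps_levels:
        current = largest_cluster(current, eps, min_samples)
    lat = sum(p[0] for p in current) / len(current)
    lon = sum(p[1] for p in current) / len(current)
    return lat, lon
```

Unlike grid search, the clusters here are not tied to fixed bin edges, so a home that straddles a rounding boundary is not split across two bins.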

Grid Search Without Cross-posts

Given the similar accuracy of grid search and DBSCAN, we returned to grid search with a revised dataset. We realized that 10.4% of our Twitter dataset (15,261 of 146,852 tweets) were cross-posts from social apps. These apps include (in descending order of frequency) Foursquare/Swarm, Instagram, Untappd, Path, Camera on iOS, Spotify, MLB.com At the Ballpark, Frontback, Wordpress.com, Klout, LivingSocial, Sportacular, and MySpace. In each of these social apps, tweeting was a byproduct of another action (as opposed to Twitter clients such as Tweetdeck and Tweetcaster). Furthermore, most of these are intended to be used outside the home. Therefore, they cannot help (and indeed would hurt) any home-finding algorithm. We removed them from the dataset and performed grid search and DBSCAN again.

We then reasoned that nighttime tweets (from 8:00 PM to 6:00 AM) would be more predictive of home location than daytime tweets, so we removed daytime tweets and ran our algorithms again. This removed 77,122 of our tweets, leaving us with 54,469 tweets. We found the highest accuracy when both cross-posts and daytime tweets were removed.
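The two filters can be expressed as a short sketch. The dictionary keys, the plain-string source names, and the treatment of exactly 6:00 AM as daytime are assumptions made for illustration; only a subset of the apps listed above is included.

```python
from datetime import datetime

# Subset of the cross-posting apps named in the text. Matching on a plain
# "source" string is an assumption about how the client name is stored.
CROSS_POST_SOURCES = {"Foursquare", "Instagram", "Untappd", "Path"}

def is_nighttime(local_time):
    """True from 8:00 PM up to, but not including, 6:00 AM local time."""
    return local_time.hour >= 20 or local_time.hour < 6

def filter_for_home_finding(tweets):
    """Drop cross-posts and daytime tweets.

    tweets: list of dicts with hypothetical keys "source" and "local_time".
    """
    return [
        t for t in tweets
        if t["source"] not in CROSS_POST_SOURCES and is_nighttime(t["local_time"])
    ]
```

The counts reported above are consistent with applying the two filters in sequence: 146,852 - 15,261 - 77,122 = 54,469.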

Results

Results are shown in Table 1.

Algorithm            Cross-posts  Night  Median  % of users    % of users   % of users
                     removed      only   error   within 100 m  within 1 km  within 5 km
Mode                                     553 m   1.5           63.1         79.0
Grid Search                              57 m    54.4          73.3         86.7
Grid Search          ✓                   54 m    56.2          76.8         88.1
Grid Search                       ✓      51 m    56.2          77.3         87.6
Grid Search          ✓            ✓      49 m*   56.9*         79.0*        88.2*
Multi-level DBSCAN                       75 m    52.8          72.3         87.2
Multi-level DBSCAN   ✓            ✓      75 m    52.3          74.4         87.2
Last Destination                         350 m   40.5          66.7         85.6
Last Destination     ✓            ✓      520 m   33.3          64.1         82.6
Weighted Median      ✓            ✓      400 m   40.5          65.6         79.0
Largest Cluster      ✓            ✓      362 m   33.8          69.7         87.1

Table 1: Results for each algorithm trying to predict each user's home. Best results are marked with an asterisk. Results for Weighted Median and Largest Cluster without cross-posts and daytime posts removed were significantly worse, so we do not present them here.

² Of course, “distance” does not make sense in terms of degrees longitude, because the length of a degree of longitude varies based on the latitude. However, because most of the points we considered were at similar latitude, we accepted this inaccuracy in order to test the method.
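The columns of Table 1 can be computed from predicted and survey-reported home coordinates roughly as follows. Using the haversine great-circle distance is an assumption here, since the text does not say how distances were measured, and the function names are illustrative.

```python
import math
from statistics import median

EARTH_RADIUS_M = 6371000.0  # mean Earth radius

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude-longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def summarize_errors(predicted, actual, thresholds_m=(100, 1000, 5000)):
    """Return (median error in meters, {threshold: % of users within it}).

    predicted, actual: equal-length lists of (lat, lon) pairs, one per user.
    """
    errors = [haversine_m(*p, *a) for p, a in zip(predicted, actual)]
    within = {t: 100.0 * sum(e <= t for e in errors) / len(errors)
              for t in thresholds_m}
    return median(errors), within
```

Measuring error in meters also sidesteps the issue raised in footnote 2, because the haversine formula accounts for the shrinking length of a degree of longitude away from the equator.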

These results show that, if you take away cross-posts and daytime posts, simple grid search shows where people live. Furthermore, these users do not need to have many tweets in order to be easily localizable, as shown by Table 2.

Last N tweets  Median error  % of users    % of users   % of users
                             within 100 m  within 1 km  within 5 km
1              245 m         44.6          61.7         74.1
5              84 m          51.3          66.3         76.2
10             62 m          58.0          75.1         81.9
100            65 m          56.0          74.6         86.0
1000           51 m          57.0          79.3         88.6

Table 2: Results using grid search on the most recent N non-crosspost non-daytime tweets for each user, for various values of N. Using more tweets allows better prediction, but prediction is remarkably good with as few as 10 tweets.

In summary, in this work we showed that it is relatively easy to find people’s homes based on their geotagged tweets. We also improved accuracy substantially over the baseline: grid search on filtered tweets reduced the median error from 553 m to 49 m.

Doing this analysis also helped me clarify the promise of geotagged social media. Finding these people’s homes makes for useful analysis, but only for the 1% of tweeters who geotag their posts. Therefore, it is unlikely to be possible to locate any given person. However, by finding these people’s homes, we were able to find a lot of people who live in a certain area, so we could use their tweets to figure out what (some percentage of) locals are saying in an area. This could really help us characterize different places, which led us to the Twitter Neighborhood TF-IDF Map, which we will explain in the next section.

Using Tweets to Characterize Locations

Based on these insights from our attempts at home finding, Jennifer Chou (an undergrad that I mentored) and I created our first attempt at a neighborhood guide, the Twitter Neighborhood TF-IDF Map (Figure 2).

Figure 2: Most frequently tweeted words in each Pittsburgh neighborhood

This map shows which terms are used more often in one neighborhood than in others. For example, the Pittsburgh Pirates baseball team’s hashtag #pirates is tweeted in many neighborhoods in Pittsburgh, but it is most often used near the baseball stadium. Analyzing tweets gives us a unique window into these neighborhoods. While governments collect demographic data, that only tells quantitative facts: deep but narrow. Analyzing tweets can give a qualitative picture of a neighborhood: not quite what people in that neighborhood think or care about, but at least what those people say. Twitter users are admittedly a small sample of people in an area, but our work here suggests that those people can give us useful insights about their neighborhoods.

Creating the Twitter Neighborhood TF-IDF Map

To create this map, we used the same set of tweets that we had gathered for the previous project: 3.3 million geotagged tweets in the Pittsburgh area. We then assigned each tweet a neighborhood based on its location, using neighborhood boundaries from the Western Pennsylvania Regional Data Center³. We then applied a variant of TF-IDF to each word to find which words are the most indicative of each neighborhood. For our purposes here, TF, or term frequency, represents the number of times that word appears in tweets in that neighborhood; DF, or document frequency, represents the number of times that word appears in other neighborhoods. To find the TF-IDF score for each word in each neighborhood, we divide its term frequency by its document frequency. We then removed all words that were tweeted by fewer than 5 people, to reduce spam. (Other corrections, such as the TF-IDF-UF score used in [1], also seem promising.)
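A minimal sketch of this scoring, assuming tweets have already been assigned to neighborhoods and are available as (user id, neighborhood, text) tuples. Whitespace tokenization and the +1 in the denominator (to avoid dividing by zero when a word appears in only one neighborhood) are assumptions; the text does not say how either was handled.

```python
from collections import Counter, defaultdict

def neighborhood_scores(tweets, min_users=5):
    """Score each word in each neighborhood as TF / DF.

    tweets: iterable of (user_id, neighborhood, text) tuples (hypothetical format).
    TF = occurrences of the word in this neighborhood.
    DF = occurrences of the word in all other neighborhoods.
    """
    tf = defaultdict(Counter)   # neighborhood -> word -> count
    total = Counter()           # word -> count across all neighborhoods
    users = defaultdict(set)    # word -> distinct users who tweeted it
    for user, hood, text in tweets:
        for word in text.lower().split():
            tf[hood][word] += 1
            total[word] += 1
            users[word].add(user)

    scores = {}
    for hood, counts in tf.items():
        scores[hood] = {
            # total - count is the DF; +1 avoids division by zero (assumption).
            word: count / (total[word] - count + 1)
            for word, count in counts.items()
            if len(users[word]) >= min_users  # drop words from fewer than 5 people
        }
    return scores
```

Under this scoring, a hashtag such as #pirates that appears everywhere still scores highest in the neighborhood where it is used far more often than elsewhere.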

Results

Findings for this project were anecdotal, but suggested that this was a promising direction to continue. In looking at the map, we found obvious results, like where each stadium was, but we also found less obvious local references. For example, we learned that the 10a shuttle on the University of Pittsburgh campus was a frequent topic of conversation and jokes (Figure 3).

Figure 3: Tweets referencing the 10a bus

³ https://data.wprdc.org/dataset/pittsburgh-neighborhoods770b7


Ininformaltalkswithfriendsandotherstudents,wefoundlotsofenthusiasmfortoolslikethis,especiallyforcitiesthattheydonotknowaswellasPittsburgh.However,wealsofoundtheresultsabitlacking.Someofthefindingswereobvious,someweresimplynamesoflocations,andmanyfailedtogiveusersagoodsenseofwhattheplacesfeltlike.Peoplethatwetalkedtowantedricherinsightsintotheneighborhoods,notjustafewkeywordsorlocations.Tounderstandwhattheymeantby“feel”and“richerinsights”,Iembarkedonaqualitativestudytounderstandhowpeoplegottoknowneighborhoodsandcitieswhentheymovedandtraveled.

Understanding Travelers' Needs Tounderstandhowpeoplemakesenseofcitiesandneighborhoodsnow,Iconductedinterviewswithrecentmoversandtravelers.IknewIcouldusesocialmediatogivepeoplesomesenseofaneighborhood,butIwantedtomakesureIwascreatingsomethingusefulthatfilledaparticularneed.Asaresult,Ifocusedonthefollowingresearchquestions:

• What do people want to know about neighborhoods when they’re moving?
• What do people want to know about neighborhoods when they’re traveling?
• What do people wish travelers and movers knew about their neighborhood?
• What parts of public social media will be most useful?

Method

I recruited 17 participants in Pittsburgh who had all recently traveled or moved by posting our study on Reddit, Craigslist, and Facebook. We asked them to describe their search process and their experience finding a neighborhood to stay or live in. We then showed them printed pages about the neighborhoods they moved and traveled to and from: popular Twitter words from the TF-IDF map, Flickr photos obtained by searching the neighborhood names on Twitter, the top 10 most popular venues on Yelp, and market research and statistics from ESRI’s Tapestry guide⁴. We asked them to create two one-page guides (one for the neighborhood they moved/traveled to, and one for the neighborhood they moved/traveled from) by cutting and taping these materials, and writing or drawing in anything that was missing. This was meant as an elicitation exercise to get them thinking about these neighborhoods in more depth. CMU’s Institutional Review Board approved this study.

After these 17 interviews, of which 7 involved recent travelers and 10 involved recent movers, I realized one recurring issue: social media is much better poised to help travelers than movers. Movers care about many factors in addition to the neighborhood: the house or apartment itself, the cost of rent or a mortgage, the proximity to an existing job, and the local schools. Some of their concerns still echo the travelers’ concerns, so I retained their data, but I reoriented the project to focus on travelers.

I recruited seven more recent travelers in San Francisco, bringing the total to 24 participants. For this second group, I did the same interview, but focused more on factors that seemed relevant in the first one: safety, liveliness, diversity, and aesthetics. I also only recruited travelers for the second group. I did not bring the printouts or ask participants to create flyers like I did for the first group, because the complication did not provide much more value or insight. I will refer to the original 17 interviewees as A1-A17, and the next seven as B1-B7.

These participants were young: all in their 20s and 30s except for two. Eight were students, while the rest were mostly professionals. Interviews were conducted in cafés or other public places near them for convenience and to get them thinking about their neighborhoods. B5 and B6, a dating couple, interviewed together; all the rest were done separately. Because the interviews occurred in public places, I could not record the interviews, but I took plentiful notes to capture important points as well as possible. After finishing each batch of interviews, I analyzed the data iteratively, using an open coding approach inspired by grounded theory to allow insights to emerge from the data.

These interviews revealed a lot about this group’s travel and moving motivations, what they hope to learn about neighborhoods, and where they decide to stay, as well as a few interesting tensions that arise when they make those decisions.

⁴ http://www.esri.com/data/esri_data/ziptapestry

Finding 1: People use heuristics when searching, if possible

First, a number of conditions may cause travelers to do very little research before choosing where to stay. If someone already has a place to stay, they will likely take that. B2 described this as a “bird in the hand” situation, and said it occurred a lot when Couchsurfing: finding a local who’s willing to host him for free can be difficult, so he will usually accept, regardless of circumstances.

If a traveler has social or other constraints, such as friends or family to visit or an event to attend, they usually consider tourism secondary and stay somewhere nice near that constraint. B5 and B6 described going to the X Games, an extreme sports event, in Aspen, Colorado: they spent most of their time watching events, so they simply wanted to stay near the games. Similarly, B4 described visiting Scottsdale on personal business (he declined to describe it further), which led to him staying in the Fashion Square district. He found it rather unpleasant, and had trouble getting around, but he needed to be near there.

Finally, budget constraints would often short-circuit the lodging search. B5 and B6 described another trip, when they went to Seattle but wanted to pick the cheapest lodging possible. This ended up being the Green Tortoise Hostel downtown, and since they had stayed in another Green Tortoise elsewhere, they decided it would work. B3 also described a road trip where he simply looked up a place to stay while on the road each day, only wanting something simple, clean, and cheap.

Finding 2: If no heuristics are available, people attempt to satisfy five dimensions

Most of the participants described trips where they did not use any of these heuristics, and instead wanted to satisfy five different dimensions: Safety and Room for Everyone, Aesthetic Appeal, Opportunity for Serendipity, The Ideal Everyday, and Authenticity. I will describe them in turn.

Dimension 1: Safety and Room for Everyone

Everyone wanted to be safe. The meaning of safety varied slightly depending on location; usually it included crime, but A1, A15, A17, and B4 all mentioned fear of bedbugs when traveling to New York. However, when asked if safety was always an upside, most participants said it was not. A6 described spending one night in Churchill Gardens, a posh part of London, but then moving on to somewhat simpler Clerkenwell. Often the safest spaces are also the most expensive, and because they are so expensive, only a homogeneous set of wealthy people can live there.

Everyone who spoke of diversity considered it a virtue. They described enjoying markets (B1 and B6) and train stations (B1), as they are places where lots of different people meet. When asked why, they often mentioned gentrification. I took care not to introduce the term myself, but seven participants brought it up independently. A loss of diversity makes a place less fun (B4) but also brings about changes that make their existence in a place uncomfortable (A3, A10, A15, B1), because they’re not sure if their existence there displaces other people. This principle was best articulated by A1 and B1, who both used the phrase “room for everyone”: they want to see a place that contains people of diverse ages, races, and income levels.

Dimension 2: Aesthetic Appeal

Aesthetic appeal in many forms is one of the main incentives for people to travel, and one of the main influences on the overall feeling of a trip. By “aesthetic appeal”, I am referring to anything about the senses: participants mentioned visual, auditory, and gustatory appeal particularly, and occasionally smell. Some preferences were universal, such as enjoyment of nature and avoidance of loud places while sleeping. Others were personal: A10 described her neighborhood as a burgeoning urban agricultural area, while B4 described the city of Pittsburgh as a “concrete jungle.” Many participants described suburbs as “boring”, but A7 described one suburb as his “perspective of what country living should be.”

Dimension 3: Opportunity for Serendipity

Lodging seekers used a heuristic if they had one place to travel to (they would simply stay near that place), but travelers without a direct goal still valued convenience. What does “convenience” mean when one doesn’t have a goal? It depends on the person and the city, but participants talked about “being mobile” (B2), “being in the middle of stuff” (B4), “having stuff around” (A15), or being “where everything is” (A9). Even so, this is not very descriptive.

The final two interviews, though, helped elucidate this point. B6 talked about visiting New Orleans and stumbling across parades put together by local Native American groups, which she unexpectedly enjoyed. B7 described staying in the neighborhood of Itzimna, Merida, Mexico, which was a short walk from the tourist center of downtown. She enjoyed the walk downtown because it enabled her to discover more than if she were actually staying downtown. Both of these cases support the claim that “convenience” is more than just quick travel time to sights; it’s about opportunities to discover these unknown gems. These opportunities appear more in a dense urban environment full of local businesses and walkable neighborhoods.

Walkability deserves extra attention here as a key enabler of serendipity. Traveling by car involves difficulties such as driving in a new city, covering unfamiliar terrain, and parking, as B5 mentioned. However, being stuck in a car-centric environment without a car is unpleasant, as in B4’s trip to Scottsdale. Therefore, logistically, traveling is easier when one can just walk or use public transportation. In addition, exploring is easier when a stop into a store or café involves just walking in, not noticing it, finding parking, and walking in. When one can explore more, one can encounter more delightful, fortuitous experiences.

Dimension 4: The Ideal Everyday

One recurring theme was described as “taking in the city life” (B1), seeing “what people actually do here” (A9), “kind of get[ting] a feel” of the city (A6), and even “play[ing] the game of, what if we lived here” (A17). This echoes a trend towards travelers using the everyday as a way to create their experience, for the travel experience to be less about what they are consuming and more about what they are becoming [27]. Most participants (except A16) were not traveling in order to find a place to move to, but they still enjoyed pretending to do so.

When pressed, though, interviewees did not actually want their travel experiences to be about the real “everyday.” Everyday life involves work, chores, and errands that most people do not enjoy, wherever they are. For example, asked if she would be interested to see everyday life in the Financial District of San Francisco, B1 replied no, the Financial District isn’t the kind of “everyday” she’s looking for (though clearly it is an integral part of many people’s everyday lives). Instead, participants wanted to experience an “ideal everyday,” which involved two recurring subthemes: relaxation and third places.

Relaxation is self-explanatory: travelers, usually on leisure trips, preferred a slow-paced day with few responsibilities to a quick, busy day. A1 appreciated a relaxing or “chill” environment, as did B1, who elaborated that, as a busy professor, she often doesn’t get a chance to do the “everyday” things that are part of this ideal day. She gave an example of buying a birthday card for a friend: she plans to do this on an upcoming trip to visit friends, just because that will be the only time she has to do it.

Third places, such as bars, cafes, and bookstores as described in [33], are also a key part of this “ideal everyday.” Many participants described local venues they loved: a coffee shop and a taqueria (A13), cafes where he’s seen friends he knows sitting outside (A14), cafes and dive bars (B4). B1 went as far as to suggest that she would travel to a place based on where the best coffee shops were. Because third places tend to be neutral, accessible, status-leveling places, travelers appreciate them. Stepping into another place’s everyday life involves adjustments, and these third places give travelers a way to recharge.

Dimension 5: Authenticity

One final recurring theme involved travelers’ desires for an “authentic”, “non-touristy” place. Clearly, “touristy” places have some disadvantages: they are expensive (B6 gave the example of paying £39 to see the Crown Jewels in London) and often people act differently there (B7 described feeling like she “had a dollar sign on her forehead” in the tourist beaches of Cancún). But those inconveniences don’t explain the intensity of the desire to be “not a tourist” (or even “the anti-tourist”, as A9 described himself). Furthermore, some people appreciated touristy places, for practical reasons: B7 noted that not speaking Spanish limited her experience in Mexico, and A6 described how she would search for a place that’s not the #1 tourist destination but also not completely local, due to language issues.

To understand this touristiness tension, it is useful to review previous work about authenticity in tourist places. Early work located all spaces on a 6-stage scale from front-stage (purely for show) to backstage (fully authentic) and predicted that all tourists would seek authenticity [24]. Later work added more nuance, describing the “authenticity” of an experience in nine subtypes depending on how authentic the place was, how authentic the people were, and whether the visitor put importance on the authenticity of the people or the place, both, or neither [34]. Furthermore, the authenticity of an experience may be best explained as existential authenticity, or the personal resonance with that experience. Existential authenticity has two forms: intra-personal (discovering and being true to oneself) and inter-personal (having a real connection to others) [48].

Different people may enjoy a trip to the Van Gogh Museum in Amsterdam for many reasons. They may appreciate seeing the original Sunflowers (objective authenticity) or seeing the official, definitive collection of Van Gogh’s art (constructive authenticity). They may enjoy a stirring resonance with Van Gogh’s masterful brush strokes, or the ability to discuss these paintings with their fellow tourists (existential authenticity, intra- and inter-personal respectively). They might get useful information from the docents in an official, front-stage capacity, or they might get a docent to reveal little-known backstage stories about working at the museum. Finally, afterwards, they may stay in the enclavic tourist “bubble” of the Museumplein outside, or they may head to a more heterogeneous neighborhood, as described in [9]. Each of these experiences may be regarded by one person as “authentic” and by another as “touristy.”

This aspect of tourism, more even than the other four, is therefore impossible to definitively characterize or measure, but by a consideration of the people involved and the different types of authenticity, we can hope to provide guidance to get people to experiences that will resonate authentically for them.

Finding 3: Current search tools do not adequately investigate those five dimensions

Given that these five neighborhood characteristics (safety and room for everyone, aesthetics, serendipity, the ideal everyday, and authenticity) matter in different ways for different searchers, how do travelers search for neighborhoods now? The primary search method used was to ask friends and family. If people visited friends, like B2 in Albuquerque and Portland, they could do this directly; otherwise, like A9, they would ask friends beforehand which neighborhoods were interesting and fun. Online research was also widely used, often as simple as searching Google for “things to do in (city)” or “London off the beaten path” (B6). B7 lamented, though, that this kind of searching can turn the usually-fun process of traveling into work.

Because searching was so labor intensive, some people who did not have any pre-existing heuristics (as described in Finding 1) tried to create their own heuristics. A11 would search for the “queerest neighborhood” in a given city, as she did when she visited Zurich. This was not in order to find particular sites there (Zurich’s queerest neighborhood featured one main gay bar and one main sex shop, neither of which she visited), but just because she found that she would often like the kind of people she met there. Similarly, B1 searched for the best coffee shops, not because she would spend most of her time there, but because she usually likes neighborhoods that have good coffee shops. B2 would read books about a place, like Gregory David Roberts’s novel Shantaram before visiting Mumbai, or Maya Angelou before visiting San Francisco, in order to recognize places they mentioned.

When given the printed material about these places, participants agreed that they could be useful, but there were many caveats. Statistics would be helpful, but would need context, especially for unfamiliar numbers like density. Yelp and other point-oriented tools are helpful, but do not directly solve users’ problems. Tweets, as in the selected words from the Twitter Neighborhoods TF-IDF Map, were usually disregarded. Finally, photos were tricky: some thought that they perfectly reflected their neighborhood, like A11. But some thought the opposite: A13 said that if he had seen the photos of his neighborhood, he might not have moved there, though he likes it now. A11 also mentioned that sometimes photos represent a neighborhood coincidentally: an octopus sculpture in her neighborhood was one example, but if it had been picked in a nearby neighborhood, it would have poorly reflected it.


As a result, we see an opportunity for a higher level of abstraction. The abstraction that participants liked the most was comparison to neighborhoods in cities that they know. This is similar to work that has been done both in research [22] and in popular culture [39]. Because people already know what neighborhoods in their own city are like, this can give them an easy way to understand neighborhoods in a new city. I will elaborate on how we will do this more in the next chapter.


Chapter 4: Proposed Work

From related research, we know that people are traveling in new ways and want to experience different things when they travel. From some related work and some of my prior work, I know that public geotagged social media posts can be an accurate and useful window into the culture of a neighborhood. From introductory interviews, I have identified details about the dimensions people want to explore when they travel: Safety and Room, Aesthetics, Serendipity, Ideal Everyday, and Authenticity. They don’t need to “maximize” these dimensions, because these dimensions are complicated and individual, but should be able to browse them.

To help them accomplish this, I plan to build a web-based neighborhood guide. This will involve two parts: neighborhood comparison and context. Users will first be prompted to provide a city they know, a city they are traveling to, and a neighborhood to use as a basis for comparison.

Figure 4: Mockup of the proposed neighborhood guide

I will use neighborhood comparison as the central metaphor because, in our interviews, I found that it was the most compelling metaphor to guide people’s neighborhood searches. People usually already know about neighborhoods in their own city, so I can use that knowledge to scaffold their process of learning about a new city. However, providing a comparison (“The Williamsburg of Pittsburgh is Lawrenceville”) is not enough. Research suggests that unintelligible systems can cause lack of trust and acceptance [23], and my interviewees echoed that concern: “I’d like [neighborhood comparisons], but I don’t know if I could rely on it” (B3). Therefore, I must develop an algorithm that can easily be broken down, so the site can explain why Lawrenceville is the Williamsburg of Pittsburgh.

In the rest of this chapter, I will explain the neighborhood comparison algorithm I will implement, the ways I will add context to explain the algorithm’s findings, and the evaluations I plan to run.

Neighborhood Comparison Algorithm

In introductory interviews, I found five main dimensions that people used in order to understand neighborhoods, so I will base the neighborhood comparison algorithm on those five dimensions. Each dimension will yield a feature vector. We can compare two feature vectors using a measure such as cosine similarity to find a similarity score between 0 and 1, and adding the five scores will give us a similarity score for any pair of neighborhoods. In this section, I will describe how we will find the feature vectors for each dimension.
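As a minimal sketch of the per-dimension comparison, the following Python computes cosine similarity for each dimension of two neighborhoods. The dictionary layout (dimension name mapped to feature vector) and the function names are assumptions made for illustration. Note that cosine similarity stays within 0 and 1 only when the vectors are non-negative, as counts and percentages are; learned embeddings can contain negative values, in which case the score could fall below 0 and would need rescaling.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 when either vector is all zeros (a convention chosen here
    to avoid dividing by zero; it is an assumption, not from the proposal).
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def dimension_scores(hood_a, hood_b):
    """Similarity per dimension.

    hood_a, hood_b: dicts mapping a dimension name to its feature vector
    (hypothetical layout). Both must contain the same dimensions.
    """
    return {dim: cosine_similarity(hood_a[dim], hood_b[dim]) for dim in hood_a}
```

Summing the values of the returned dictionary gives the combined score described above; dividing that sum by the number of dimensions gives an arithmetic mean, which ranks neighborhood pairs identically.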

Safety and Room for Everyone: US Census Demographics and Crime Statistics

This dimension is relatively straightforward because travelers in our original study preferred both safety and diversity. From the US Census, I plan to extract the percent of local residents who fit into each decade age group, income bracket, and racial group. I will also find crime statistics, in terms of crimes per person per year and violent crimes per person per year.
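One way to assemble this vector is sketched below. The input format (raw head counts per category plus crime totals), the category names, and the decision to simply concatenate population shares with the two crime rates are all assumptions; the proposal does not specify how the components are scaled relative to each other.

```python
def shares(counts):
    """Convert raw category counts to fractions, ordered by category name."""
    total = sum(counts.values())
    return [counts[key] / total if total else 0.0 for key in sorted(counts)]


def safety_and_room_vector(age_counts, income_counts, race_counts,
                           crimes, violent_crimes, population, years=1.0):
    """Build the Safety and Room for Everyone feature vector.

    *_counts: dicts mapping a category label to a number of residents
    (hypothetical format). crimes and violent_crimes are totals over
    `years` years; population is assumed to be greater than zero.
    """
    person_years = population * years
    return (
        shares(age_counts)
        + shares(income_counts)
        + shares(race_counts)
        + [crimes / person_years, violent_crimes / person_years]
    )
```

Because crime rates per person are much smaller numbers than population shares, a real implementation would probably need to rescale them before applying cosine similarity, or the crime components would have almost no influence on the score.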

Aesthetics: Flickr autotags

Gathering data on aesthetic characteristics of neighborhoods is a more complicated endeavor, but for this we can turn to Flickr. Flickr photos have computer vision-based “autotags” attached to them, which identify the objects seen in the image (such as “people” or “sunset”). As a result, we can use the publicly available YFCC100M dataset [45], which contains about 49 million geotagged photos, to find photos in each neighborhood, then identify how many times each autotag shows up in a given neighborhood. This will give us a 1720-element feature vector, as Flickr currently recognizes 1720 distinct autotags.

Preliminary analysis suggests that these tags will show some difference in character between different neighborhoods. For example, in San Francisco, the 10 most common autotags in the Financial District are:

architecture, people, building, face, blackandwhite, vehicle, monochrome, building complex, road, text

while the 10 most common autotags in the Outer Sunset (a residential/beach neighborhood) are:

nature, people, landscape, face, shore, seaside, sky, road, water, coast
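The counting step can be sketched as follows. This assumes the photos have already been assigned to a neighborhood (the point-in-polygon lookup against neighborhood boundaries is not shown) and that each photo is represented as a list of autotag strings; both the representation and the function names are hypothetical.

```python
from collections import Counter


def count_autotags(photo_autotags):
    """Count how often each autotag occurs across a neighborhood's photos.

    photo_autotags: iterable of tag lists, one list per photo (assumed format).
    """
    return Counter(tag for tags in photo_autotags for tag in tags)


def autotag_vector(photo_autotags, vocabulary):
    """Fixed-length count vector, one element per tag in `vocabulary`.

    `vocabulary` is the ordered list of all recognized autotags, so every
    neighborhood's vector has the same length and element order.
    """
    counts = count_autotags(photo_autotags)
    return [counts[tag] for tag in vocabulary]


def top_autotags(photo_autotags, n=10):
    """The n most common autotags, most frequent first."""
    return [tag for tag, _ in count_autotags(photo_autotags).most_common(n)]
```

Raw counts are sufficient when the vectors are compared with cosine similarity, since that measure ignores overall vector length; a neighborhood with ten times as many photos but the same tag proportions would score as identical.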

Serendipity: Walkscore and Transitscore

As explained in the previous chapter, the opportunity for serendipity in a place depends in large part on how easy it is to get around by walking. Therefore, the serendipity feature vector will consist of two components: the previously developed walkability and transit-friendliness scores from Walkscore⁵. (Walkscore also provides a bikeability score, but as most travelers do not have bicycles, we expect this would be less helpful.)

Ideal Everyday: Third Places from Yelp

Because many travelers seem to want to experience the “Ideal everyday”, including a relaxed pace and plenty of third places like cafés and bars, I plan to use Yelp reviews of these third places to capture how people describe a place. Users’ star ratings are not particularly descriptive (nor are they trustworthy, as A1, A3, A9, and A10 independently mentioned), but the reviews contain rich descriptions. These descriptions will give a user an idea whether to expect upscale cocktail bars or greasy spoon diners, which will give them some sense of what the Ideal Everyday in this neighborhood is like. However, these reviews are unstructured text. I see two promising options to turn this unstructured text into feature vectors:

1. Build frequency counts of all the words in all of these reviews throughout the city, then compare to a standard word frequency distribution in a news or Wikipedia corpus, in order to identify the most frequent words in Yelp reviews, compared to the language as a whole. Then use a bag-of-words approach for each neighborhood’s venues to get frequency counts of each word in each neighborhood.

2. Use doc2vec as implemented in the Gensim library [40], which is based on the Paragraph Vector algorithm [21]. Paragraph Vector may be able to outperform bag-of-words models on tasks like this because it maintains some of the structure of the related sentence.
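The first option can be sketched with the standard library alone. In this illustration, `reference_freq` is a hypothetical dictionary mapping each word to its relative frequency in a general corpus (for example, one built from news or Wikipedia text); the `floor` value for words missing from that corpus, the whitespace tokenizer, and the vocabulary size are likewise assumptions. The second option depends on Gensim and is not sketched here.

```python
from collections import Counter


def tokenize(text):
    """Naive lowercase whitespace tokenizer (an assumption for brevity)."""
    return text.lower().split()


def distinctive_words(city_reviews, reference_freq, top_n=1000, floor=1e-7):
    """Words that are most over-represented in reviews vs. general language.

    city_reviews: iterable of review strings for the whole city.
    reference_freq: word -> relative frequency in a reference corpus.
    floor: frequency assumed for words absent from the reference corpus.
    """
    counts = Counter(w for text in city_reviews for w in tokenize(text))
    total = sum(counts.values())
    ratio = {
        word: (count / total) / max(reference_freq.get(word, 0.0), floor)
        for word, count in counts.items()
    }
    return sorted(ratio, key=ratio.get, reverse=True)[:top_n]


def bag_of_words_vector(neighborhood_reviews, vocabulary):
    """Count of each vocabulary word in one neighborhood's reviews."""
    counts = Counter(w for text in neighborhood_reviews for w in tokenize(text))
    return [counts[word] for word in vocabulary]
```

The vocabulary returned by `distinctive_words` is computed once per city, and `bag_of_words_vector` is then called once per neighborhood so that all neighborhoods share the same vector layout.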

I plan to implement both of these and use whichever yields better results. A third option, if these options prove problematic, is simply to create a vector of types of third places. For example, Bloomfield has 2 cocktail bars, 4 dive bars, 2 pubs, 1 gay bar, 0 sports bars, 5 coffee shops, and 1 tea room. This would likely not be as rich an information source to work with (because classifications are always incomplete and lacking nuance) but it would be a simpler option in case options 1 and 2 fail.

⁵ https://www.walkscore.com/

Authenticity: Tweets

As described in Chapter 3: Completed Work, authenticity is something that travelers want, but that is hard to define. Given that anyone may have a different definition of “touristy” or “authentic”, perhaps the most value we can create here is by reflecting what has been said publicly on Twitter. As such, the tweets in the neighborhood become documents, and we can turn them into feature vectors in the same way as the Third Places reviews above. Because authenticity is so individual, giving people a sense of what people in the neighborhood are saying is the best way we can give them a sense of whether they would resonate with that neighborhood. There’s no way to learn and predict the “most authentic” neighborhood, because that designation is subjective enough to be meaningless. Someone wanting an “authentic” old-fashioned Pittsburgh experience might consider Shadyside an inauthentic yuppie neighborhood and visit the Strip District instead, while someone else might consider the Strip District an inauthentic tourist attraction and visit Shadyside to see what the “real” people in Pittsburgh do.


Again, for each dimension, once we have a feature vector, we can compute its similarity to other neighborhoods’ feature vectors for that dimension. Given that we will only have on the order of 50-100 neighborhoods for any given pair of cities, we can compute these similarity values exhaustively; we do not need to use any more efficient nearest-neighbor algorithm. The computation of similarity between two example neighborhoods is illustrated in Figure 5. Note that, while I will start with a simple arithmetic mean of the five similarity values, I will adjust this algorithm based on user feedback, as I will explain in the next section.
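The exhaustive ranking step might look like the sketch below. It takes per-dimension similarity values that have already been computed for every candidate neighborhood and combines them with an optional weight per dimension. The weights are a hypothetical stand-in for whatever adjustment user feedback eventually suggests; with no weights supplied, the result is the simple arithmetic mean described above.

```python
def rank_candidates(per_dimension_scores, weights=None):
    """Rank candidate neighborhoods by combined similarity, best first.

    per_dimension_scores: candidate name -> {dimension: similarity in [0, 1]}
        (assumed to be precomputed against one source neighborhood).
    weights: optional {dimension: weight}; defaults to equal weights,
        which reduces to the arithmetic mean. At least one weight
        must be positive.
    """
    ranked = []
    for name, scores in per_dimension_scores.items():
        w = weights if weights is not None else {dim: 1.0 for dim in scores}
        total_weight = sum(w[dim] for dim in scores)
        overall = sum(w[dim] * scores[dim] for dim in scores) / total_weight
        ranked.append((name, overall))
    # A full sort is cheap at this scale (tens of neighborhoods per city).
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked
```

With roughly 50 to 100 candidates per city, this loop runs in well under a millisecond, which is consistent with the point that no approximate nearest-neighbor index is needed.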

Figure 5: Illustration of similarity computation between two neighborhoods

This algorithm extends prior work in neighborhood comparison [22], but I want to emphasize the difference in the approach. While [22] used only Foursquare checkins to compare the venues in different neighborhoods, I am using far more types of social media data to develop a much richer comparison. Also, this prior work used neighborhood characterizations like “the student neighborhood” or “the fancy shopping district” while asking people to create a labeled dataset; I will not use any such a priori labels. This will enable us to characterize neighborhoods that do not have a simple description, and include all of the nuance that comes from sources beyond lists of the venues in a place.

Context for Neighborhood Comparisons


It is important to be able to explain why neighborhoods are similar, so context will be an integral part of this application. As in Figure 4, I plan to show, with each similarity prediction, an indicator of why those two neighborhoods are similar. Given that we have five distinct dimensions, which all predict a similarity value between 0 and 1, it’s easy to tell which dimension contributed most to the similarity rating. In the example in Figure 5, the vectors from the Yelp reviews of third places were the most similar, so it would be easy to report “Bloomfield and the Mission have a similarity rating of 0.51, mostly because the bars and cafés are the most similar.”

However, we can give even more context than that, tailored to each dimension. For the demographics, we can show graphs of why the demographics of the neighborhoods are similar. If the Flickr photo autotags are the most similar, we could show which tags caused this similarity, and show example photos with those tags. We can use a photo summary such as those in [1, 15] to best show representative photos. If the Walk and Transit scores are the most similar, we can show the Walkscore and TransitScore maps through an embedded Walkscore map. If the Yelp reviews or Tweets are the most similar, we can surface which key phrases or words cause that similarity.
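The one-sentence explanation quoted above can be generated directly from the per-dimension scores, as in this sketch. The dimension keys, the human-readable labels, and the function name are hypothetical, and the overall rating is taken to be the arithmetic mean of the dimension scores.

```python
def explain_similarity(name_a, name_b, scores, labels):
    """Build a short explanation naming the most similar dimension.

    scores: {dimension: similarity in [0, 1]} for one neighborhood pair.
    labels: {dimension: plural noun phrase shown to the user}, e.g.
        "bars and cafés" for the third-places dimension (assumed wording).
    """
    overall = sum(scores.values()) / len(scores)
    top_dimension = max(scores, key=scores.get)
    return (
        f"{name_a} and {name_b} have a similarity rating of {overall:.2f}, "
        f"mostly because the {labels[top_dimension]} are the most similar."
    )
```

For example, with invented scores (not the values from Figure 5) of 0.40, 0.50, 0.45, 0.70, and 0.50, where 0.70 belongs to the third-places dimension, the function would report a rating of 0.51 and name the bars and cafés as the main reason.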

Evaluation

This website aims to make people’s trips better, so the gold standard study to evaluate its usefulness would be to have people make a trip without using the site, then make a trip with it, and evaluate their enjoyment of each trip. A study this large would be outside the scope of this thesis, but there are subsets of the application that can be evaluated and improved.

Are the neighborhood comparisons “right”?

Prior work [22] has approached neighborhood comparison as a classification problem, built a dataset of “ground truth” neighborhood comparisons from a user study, and measured prediction accuracy. However, unlike many prediction tasks, it is hard to say what the “right” answer for a neighborhood comparison is. Is Lawrenceville really the Williamsburg of Pittsburgh? If someone argues that East Liberty is instead, there is really no way to prove either viewpoint right or wrong. Therefore, I will focus not on being “right,” but on being close. To evaluate this, I will run an online user study, in which I recruit people who know two different cities, and serve them neighborhood comparison predictions in one of three ways:

• At random (baseline)
• Using demographics and counts of venues only
• Using the five-part algorithm described in this chapter, which includes demographics but also social media posts


We will then ask them to evaluate if each comparison is plausible (see Figure 6). Assuming that our predictions exceed the baselines, this will provide evidence that social media aids in our understanding of cities and neighborhoods, at least when viewed through the traveler’s lens.

Figure 6: User interface mockup for the neighborhood comparison accuracy evaluation task

I will recruit participants online, from Craigslist, Reddit, and other forms of social media. I will recruit these people one city at a time, focusing on Pittsburgh and San Francisco, so that I can easily describe the request. (“Help us compare Pittsburgh’s neighborhoods to other cities’” is easier to understand than “Help us compare any two cities.”) I hope to recruit 50 people per condition, so 150 total. I will restrict recruitment to people who have lived in Pittsburgh or San Francisco and another city for at least 6 months.

Is this guide useful? Does it reflect the city accurately?

This will be more difficult to evaluate, but as it is more important, I want to at least try. I will run two user studies with people who are about to go on a trip, simply trying the application out. I will investigate both what parts of it they find most useful and how else they gather information. This will help further develop the five-dimensional model: verify that the dimensions I’ve chosen are important, understand if there are more dimensions, and learn more about why they find those dimensions important. For one of these studies, I will recruit participants from among MHCI students in March, traveling to Austin for the SXSW conference, because there will be a lot of them, so we can get diverse data about one particular comparison of cities. I will also recruit people as they become available throughout the year.

Contributions


In developing these guides, I will make a working website that helps travelers find neighborhoods they will enjoy. The statement that I am setting out to prove can best be stated as follows:

Using user-generated social media, we can automatically generate guides that will help people understand neighborhoods in relation to neighborhoods they know, and therefore will help them have the travel experience they want.

This work will lead to the following research contributions:

• A model of tourist information search, focusing on five primary characteristics that tourists deem valuable today, based on formative interviews and qualitative insights from user studies.

• The iterative design and implementation of an automatically generated web-based neighborhood guide, which uses social media to provide comparisons between neighborhoods in different cities and to provide context for these comparisons.

• A deeper understanding of how social media can represent neighborhoods, based on the development and iterative feedback on this guide. This will likely include learning which forms of social media are most important to travelers and how best to summarize them.


Chapter 5: Schedule

Early to mid May 2016: conference presentations at CHI and ICWSM, write CSCW paper based on introductory interviews.

Late May-June 2016: investigate algorithms for word embedding in vector spaces to determine which method to use. Develop scrapers to download Yelp reviews and Flickr autotags.

July-August 2016: wedding and honeymoon to China.

Late August-September 2016: create a preliminary/skeleton website, in order to be able to use it as a testbed.

October 2016: include data from two cities in the site. Recommend neighborhoods using a preliminary/skeleton algorithm.

November 2016: run an initial qualitative user study.

December 2016: continue development on site.

January 2017: run the quantitative user study comparing three different comparison algorithms (random, demographics only, and demographics plus social media).

February 2017: further continue site development.

March 2017: run second qualitative user study with MHCI students going to SXSW.

April-May 2017: write thesis, defend in May.

Acknowledgements

Jennifer Chou helped with the construction of the Twitter Neighborhoods TF-IDF Map webapp and developing our TF-IDF algorithm. Alex Sciuto helped with analyzing data and paper writing for the paper “Our House, In The Middle Of Our Tweets.”

References

1. Ahern, S., Naaman, M., Nair, R., and Yang, J. H.-I. World Explorer: Visualizing Aggregate Data from Unstructured Text in Geo-Referenced Collections. Proceedings of the 2007 conference on Digital libraries - JCDL ’07, (2007), 1.

2. Ashworth, G., & Page, S. J. (2011). Urban tourism research: Recent progress and current paradoxes. Tourism Management, 32(1), 1–15. http://doi.org/10.1016/j.tourman.2010.02.002

3. Bock, K. (2015). The changing nature of city tourism and its possible implications for the future of cities. European Journal of Futures Research, 3(1), 20. http://doi.org/10.1007/s40309-015-0078-5

4. Buck, M., Ruetz, D., & Freitag, R. (2014). ITB World Travel Trends Report.


5. Cheng,Z.,Caverlee,J.,Lee,K.,&Sui,D.(2011).ExploringMillionsofFootprintsinLocationSharingServices.ICWSM.Retrievedfromhttp://www.aaai.org/ocs/index.php/ICWSM/ICWSM11/paper/download/2783/3292

6. Crandall,D.,Backstrom,L.,Huttenlocher,D.,andKleinberg,J.MappingtheWorld’sPhotos.Proceedingsofthe18thInternationalConferenceonWorldWideWeb,Madrid,(2009),761–770.

7. Cranshaw,J.,Schwartz,R.,Hong,J.I.,andSadeh,N.TheLivehoodsProject:UtilizingSocialMediatoUnderstandtheDynamicsofaCity.ICWSM,(2012).

8. Duggan,M.,Ellison,N.B.,Lampe,C.,Lenhart,A.,&Madden,M.(2015).Socialmediaupdate2014.PewResearchCenter,(January),18.http://doi.org/10.1111/j.1083-6101.2007.00393.x

9. Edensor,T.(2001).Performingtourism,stagingtourism.TouristStudies,1(1),59–81.http://doi.org/10.1177/146879760100100104

10. Ester,M.,Kriegel,H.-P.,Sander,J.,&Xu,X.(1996).ADensity-BasedAlgorithmforDiscoveringClustersinLargeSpatialDatabaseswithNoise.InKDD(Vol.2,pp.635–654).http://doi.org/10.1.1.71.1980

11. Füller,H.,&Michel,B.(2014).“StopBeingaTourist!”NewDynamicsofUrbanTourisminBerlin-Kreuzberg.InternationalJournalofUrbanandRegionalResearch,38(4),1304–1318.http://doi.org/10.1111/1468-2427.12124

12. Gao,Y.,Tang,J.,Hong,R.,Dai,Q.,Chua,T.-S.,&Jain,R.(2010).W2Go:atravelguidancesystembyautomaticlandmarkranking.ProceedingsoftheInternationalConferenceonMultimedia-MM’10,123.http://doi.org/10.1145/1873951.1873970

13. Hao,Q.,Cai,R.,Wang,C.,etal.EquipTouristswithKnowledgeMinedfromTravelogues.Proc.ofthe19thInternationalWorldWideWebConference,(2010),1–10.

14. Horozov,T.,Narasimhan,N.,&Vasudevan,V.(2006).UsinglocationforpersonalizedPOIrecommendationsinmobileenvironments.Proceedings-2006InternationalSymposiumonApplicationsandtheInternet,SAINT2006,2006,124–129.http://doi.org/10.1109/SAINT.2006.55

15. Jaffe,A.,Naaman,M.,Tassa,T.,andDavis,M.Generatingsummariesandvisualizationforlargecollectionsofgeo-referencedphotographs.Proceedingsofthe8thACMinternationalworkshoponMultimediainformationretrieval-MIR’06,(2006),89.

16. Kafsi,M.,Cramer,H.,Thomee,B.,andShamma,D.a.DescribingandUnderstandingNeighborhoodCharacteristicsthroughOnlineSocialMedia.WWW,(2015).

17. Kennedy,L.,Naaman,M.,Ahern,S.,Nair,R.,andRattenbury,T.HowFlickrHelpsusMakeSenseoftheWorld:ContextandContentinCommunity-ContributedMediaCollectionsCategoriesandSubjectDescriptors.ACMMultimedia,(2007).

18. Krumm,J.(2007).InferenceAttacksonLocationTracks.PervasiveComputing,10(Pervasive),127–143.http://doi.org/10.1007/978-3-540-72037-9_8

Page 33: Social Media Neighborhood Guides - Dan Tasse · 2020. 5. 1. · Finding 1: People use heuristics when searching ... Are the neighborhood comparisons “right”? ... people information

33

19. Krumm,J.andHorvitz,E.Eyewitness:IdentifyingLocalEventsviaSpace-TimeSignalsinTwitterFeeds.Proceedingsofthe23ndACMSIGSPATIALInternationalConferenceonAdvancesinGeographicInformationSystems,(2015).

20. Kurashima,T.,Iwata,T.,Irie,G.,&Fujimura,K.(2010).Travelrouterecommendationusinggeotagsinphotosharingsites.Proc.19thACMInternationalConferenceonInformationandKnowledgeManagement,579–588.http://doi.org/10.1145/1871437.1871513

21. Le,Q.,&Mikolov,T.(2014).DistributedRepresentationsofSentencesandDocuments.InternationalConferenceonMachineLearning-ICML2014,32,1188–1196.Retrievedfromhttp://arxiv.org/abs/1405.4053

22. LeFalher,G.,Gionis,A.,&Mathioudakis,M.(2015).WhereistheSohoofRome?Measuresandalgorithmsforfindingsimilarneighborhoodsincities.InICWSM.

23. Lim,B.Y.,Dey,A.K.,&Avrahami,D.(2009).Whyandwhynotexplanationsimprovetheintelligibilityofcontext-awareintelligentsystems.Proceedingsofthe27thInternationalConferenceonHumanFactorsinComputingSystems-CHI09,2119–2129.http://doi.org/10.1145/1518701.1519023

24. MacCannell,D.(1977).StagedAuthenticity:arrangementsofSocialSpaceinTouristSettings.AmericanJournalofSociology,682(3),678–682.

25. Mahmud,J.,Nichols,J.,&Drews,C.(2013).HomeLocationIdentificationofTwitterUsers.ACMTransactionsonIntelligentSystemsandTechnology.Retrievedfromhttp://tist.acm.org/papers/TIST-2012-11-0192.R1.pdf

26. Mahmud,J.,Nichols,J.,&Drews,C.(2012).WhereIsThisTweetFrom?InferringHomeLocationsofTwitterUsers.InProceedingsoftheSixthInternationalAAAIConferenceonWeblogsandSocialMedia(pp.511–514).

27. Maitland,R.(2010).Everydaylifeasacreativeexperienceincities.InternationalJournalofCultureTourismandHospitalityResearch,4(3),176–185.http://doi.org/10.1108/17506181011067574

28. Maitland,R.(2013).BackstageBehaviourintheGlobalCity:TouristsandtheSearchforthe“RealLondon.”Procedia-SocialandBehavioralSciences,105(0),12–19.http://doi.org/10.1016/j.sbspro.2013.11.002

29. McNaught,C.andLam,P.Usingwordleasasupplementaryresearchtool.QualitativeReport15,3(2010),630–643.

30. Morstatter,F.,Pfeffer,J.,Liu,H.,&Carley,K.(2013).Isthesamplegoodenough?ComparingdatafromTwitter’sstreamingAPIwithTwitter'sfirehose.ProceedingsofICWSM,400–408.Retrievedfromhttp://www.aaai.org/ocs/index.php/ICWSM/ICWSM13/paper/viewPDFInterstitial/6071/6379

31. Mummidi,L.N.,&Krumm,J.(2008).Discoveringpointsofinterestfromusers’mapannotations.GeoJournal,72(3-4),215–227.http://doi.org/10.1007/s10708-008-9181-5

32. Okuyama,K.,&Yanai,K.(2013).AtravelplanningsystembasedontraveltrajectoriesextractedfromalargenumberofgeotaggedphotosontheWeb.TheEraofInteractiveMedia,657–670.http://doi.org/10.1007/978-1-4614-3501-3_54

Page 34: Social Media Neighborhood Guides - Dan Tasse · 2020. 5. 1. · Finding 1: People use heuristics when searching ... Are the neighborhood comparisons “right”? ... people information

34

33. Oldenburg,R.(1989)TheGreatGoodPlace:Cafes,CoffeeShops,Bookstores,Bars,HairSalons,andOtherHangoutsattheHeartofaCommunity.DaCapoPress,Boston.

34. Pearce,P.L.,&Moscardo,G.M.(1986).TheConceptofAuthenticityinTouristExperiences.JournalofSociology,22(1),121–132.http://doi.org/10.1177/144078338602200107

35. Pedregosa,F.,Varoquaux,G.,Gramfort,A.,Michel,V.,Thirion,B.,Grisel,O.,etal.(2011).Scikit-learn:MachinelearninginPython.JournalofMachineLearningResearch,12,2825–2830.Retrievedfromhttp://dl.acm.org/citation.cfm?id=2078195

36. Pontes,T.,Vasconcelos,M.,Almeida,J.,Kumaraguru,P.,&Almeida,V.(2012).WeKnowWhereYouLive:PrivacyCharacterizationofFoursquareBehavior.Proceedingsofthe2012ACMConferenceonUbiquitousComputing-UbiComp’12,898.http://doi.org/10.1145/2370216.2370419

37. Priedhorsky,R.,Culotta,A.,&Valle,S.Y.Del.(2014).InferringtheOriginLocationsofTweetswithQuantitativeConfidence.InCSCW(pp.1523–1536).

38. Rattenbury,T.,Good,N.,andNaaman,M.Towardsautomaticextractionofeventandplacesemanticsfromflickrtags.Proceedingsofthe30thannualinternationalACMSIGIRconferenceonResearchanddevelopmentininformationretrieval-SIGIR’07,(2007),103.

39. Read,M.(2014).ThisIstheWilliamsburgofYourCity:AMapofHipAmerica.RetrievedJanuary1,2016,fromgawker.com/this-is-the-williamsburg-of-your-city-a-map-of-hip-ame-1460243062

40. Řehůřek,R.,&Sojka,P.(2010).SoftwareFrameworkforTopicModellingwithLargeCorpora.ProceedingsoftheLREC2010WorkshoponNewChallengesforNLPFrameworks,45–50.

41. Richards,G.(2010).TourismDevelopmentTrajectories-FromCulturetoCreativity?EncontrosCientíficos-Tourism&ManagementStudies,(6),9–15.http://doi.org/10.4324/9780203933695

42. Shneiderman,B.(1996).Theeyeshaveit:ataskbydatatypetaxonomyforinformationvisualizations.InProceedings1996IEEESymposiumonVisualLanguages(pp.336–343).http://doi.org/10.1109/VL.1996.545307

43. Stors,N.,&Kagermeier,A.(2015).MotivesforusingAirbnbinmetropolitantourism–whydopeoplesleepinthebedofastranger?RegionsMagazine,299(1),17–19.http://doi.org/10.1080/13673882.2015.11500081

44. Takeuchi,Y.,&Sugimoto,M.(2006).CityVoyager:AnOutdoorRecommendationSystemBasedonUserLocationHistory.UbiquitousIntelligenceandComputing,4159(Figure1),625–636.http://doi.org/10.1007/11833529_64

45. Thomee,B.,Shamma,D.a.,Friedland,G.,Elizalde,B.,Ni,K.,Poland,D.,…Li,L.(2015).TheNewDataandNewChallengesinMultimedia.arXivPreprintarXiv:1503.01817,1–7.http://doi.org/10.1145/2812802

46. Toyama,K.,Logan,R.,&Roseway,A.(2003).Geographiclocationtagsondigitalimages.ProceedingsoftheEleventhACMInternationalConferenceonMultimedia-MULTIMEDIA’03,(November),156.http://doi.org/10.1145/957044.957046

Page 35: Social Media Neighborhood Guides - Dan Tasse · 2020. 5. 1. · Finding 1: People use heuristics when searching ... Are the neighborhood comparisons “right”? ... people information

35

47. Wakamiya,S.,Lee,R.,andSumiya,K.Crowd-sourcedCartography:MeasuringSocio-cognitiveDistanceforUrbanAreasbasedonCrowd’sMovement.Proceedingsofthe2012ACMConferenceonUbiquitousComputing,(2012),935–942.

48. Wang,N.(1999).Rethinkingauthenticityintourismexperience.AnnalsofTourismResearch,26(2),349–370.http://doi.org/10.1016/S0160-7383(98)00103-0

49. Yannopoulou,N.,Moufahim,M.,&Bian,X.(2013).User-GeneratedBrandsandSocialMedia:CouchsurfingandAirbnb.ContemporaryManagementResearch,9(1),85–90.http://doi.org/10.7903/cmr.11116

50. YelpWordmap.http://www.yelp.com/wordmap/sf,2016.http://www.yelp.com/wordmap/sf.

51. Zhang,A.X.,Noulas,A.,Scellato,S.,andMascolo,C.Hoodsquare:ModelingandRecommendingNeighborhoodsinLocation-basedSocialNetworks.SocialCom,(2013),1–15.