Searching Twitter: Separating the Tweet from the Chaff


DESCRIPTION

This presentation was given at ICWSM 2011. In it, we report on a qualitative investigation into the factors that make tweets ‘useful’ and ‘not useful’ for a set of common search tasks. The investigation found 16 features that help make a tweet useful, and useful tweets often showed 2 or 3 of these features. ‘Not useful’ tweets, however, typically had only one of 17 clear and striking features. Our results contribute a novel framework for extracting useful information from real-time streams of social-media content.


Searching Twitter: Separating the Tweet from the Chaff
Jonathan Hurlock & Max L. Wilson

You sure can!

How do I follow my Interests?

http://www.flickr.com/photos/stevegarfield/5397972626/

Yet more Data: Meta Data, Profile Data, Linked Data

Any of it Useful? Who cares how much data there is!

“I think the challenge, not only for Twitter but for the technology industry at large, is building more relevant filters, in real time. Like being able to surface valuable information immediately. No matter who it is, who’s listening or who’s broadcasting, is a really really hard problem, and it makes Twitter a lot more meaningful [...] We’ve gotten really really good at being able to put content in, into media [...] getting it out in a relevant, valuable way, in real time is still very difficult.” - Jack Dorsey (Creator of Twitter)

Why Twitter? Where is the value?


Let's go back... Great Scott!

http://www.flickr.com/photos/milesdeelite/5309712846/

Asking Friends: Hey, what are you doing?

you & me

Social Search: What is everyone else doing?

you & me, plus friend, friend, friend, friend

Existing Knowledge: No need to reinvent the wheel

you & me, bob & lisa

Meredith Ringel Morris, Jaime Teevan, and Katrina Panovich. 2010. What do people ask their social networks, and why?: a survey study of status message Q&A behavior. In Proceedings of the 28th international conference on Human factors in computing systems (CHI '10). ACM, New York, NY, USA, 1739-1748.

Let's go back to the network: Remember...

you & me

and if we take a step back... Please mind the gap

friend, friend, friend, friend, you, me

We start to see interesting things...

Which have value!

Location, experiences, temporal data, political upheaval, emergency events... so what are you tweeting now?

http://en.wikipedia.org/wiki/File:Plane_crash_into_Hudson_River_(crop).jpg

http://www.flickr.com/photos/gcaw/5445225362/

http://www.flickr.com/photos/seanhobson/3256437306/

http://www.flickr.com/photos/mdid/4560003881/

http://www.flickr.com/photos/24423474@N08/4999891492/

Yardi, Sarita and Boyd, Danah. ICWSM 2010. Tweeting from the Town Square: Measuring Geographic Local Networks

Twitter Search: How do you find useful information?

Displaying Results: Realtime

Displaying Results: Time, ReTweets, Location, Popularity?

http://www.flickr.com/photos/publicenergy/394124407/

Displaying Results: Making sense of the data.

Michael S. Bernstein, Bongwon Suh, Lichan Hong, Jilin Chen, Sanjay Kairam, Ed H. Chi. Eddi: Interactive Topic-based Browsing of Social Status Streams. In Proc. of ACM User Interface Software and Technology (UIST) conference, Oct. 2010. New York, NY.

Diakopoulos, N.; Naaman, M.; Kivran-Swaine, F. "Diamonds in the rough: Social media visual analytics for journalistic inquiry," Visual Analytics Science and Technology (VAST), 2010 IEEE Symposium on, pp. 115-122, 25-26 Oct. 2010.

Not necessarily useful! Interestingness

http://www.flickr.com/photos/wwarby/2460655511/

Naveed, Nasir and Gottron, Thomas and Kunegis, Jérôme and Alhadi, Arifah Che (2011) Bad News Travel Fast: A Content-based Analysis of Interestingness on Twitter. pp. 1-7. In: Proceedings of the ACM WebSci'11, June 14-17 2011, Koblenz, Germany.

What makes us unique? How are we different?

usefulness

What constitutes a useful Tweet? Finding Usefulness!

http://www.flickr.com/photos/edduddiee/4346349664/

How did we go about this? The Method

3 Information Seeking Tasks: Information Seeking

http://www.flickr.com/photos/ivyfield/4731067396/

http://www.flickr.com/photos/anniemole/241655156/

http://www.bbc.co.uk/proms/2010/share/badgewidget.shtml

Teevan, J., Ramage, D., & Morris, M. R. (2011). #TwitterSearch: a comparison of microblog search and web search. WSDM '11: Proceedings of the fourth ACM international conference on Web search and data mining (pp. 35-44). New York, NY, USA: ACM.

They were really nice people! 20 Participants

A simple, easy to understand interface: Search Interface

To help us provide more insight: Think aloud + Interviews

It’s useful because...

I didn’t because...

Lots and lots of it! Analysis

Inductive Coding = Lots of Post-its! Grounded Theory

Glaser, B. G., & Strauss, A. L. (2009). The Discovery of Grounded Theory: strategies for qualitative research. Piscataway, New Jersey, USA: Transaction Publishers.

Cohen... Fleiss... Kappa Analysis

Landis, R. J., & Koch, G. G. (1977). The Measurement of Observer Agreement for Categorical Data. Biometrics, 33 (1), 159-174.

Multi Coded Kappa: Extended Kappa Analysis

Harris, J. K., & Burke, R. C. (2005). Do you see what I see? An application of inter-coder reliability in qualitative analysis. American Public Health Association 133rd Annual Meeting & Exposition. Washington, DC, USA: American Public Health Association.

0.73 (Substantial Agreement) between evaluators & 0.62 (Substantial Agreement) with an independent untrained coder
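To make the agreement figures above more concrete, here is a minimal sketch of plain two-coder Cohen's kappa together with the Landis & Koch (1977) interpretation bands. It is an illustration only: the slides refer to an extended, multi-coded kappa (Harris & Burke, 2005), which this sketch does not implement, and the function names `cohen_kappa` and `landis_koch` as well as the sample labels are assumptions made up for the example.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Plain Cohen's kappa for two coders assigning one label per item.

    Assumption: single-label coding. The multi-coded extension cited in
    the slides is NOT implemented here.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same, non-empty item list")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis & Koch (1977) strength-of-agreement band."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical labels from two coders for four tweets (not data from the study).
coder_a = ["useful", "useful", "not useful", "not useful"]
coder_b = ["useful", "not useful", "not useful", "not useful"]
example_kappa = cohen_kappa(coder_a, coder_b)
```

Under these bands, both reported values (0.73 and 0.62) fall in the 0.61-0.80 range, which is why each is labelled "Substantial Agreement".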

Useful & Not-Useful: What did we find?

Useful

In Tweet Content
Experience: Someone reporting a personal experience, but not necessarily a suggestion / direction.
Direct Recommendation: Someone making a direct recommendation, but not necessarily relaying a personal experience.
Social Knowledge: Containing information that is spreading socially, or becoming general knowledge.
Specific Information: Where facts are listed directly in tweets, e.g. prices, times etc.

Reflection on Tweet
Entertaining: The reader finds them amusing.
Shared Sentiment: The reader agrees with the author of the tweet.

Relevant
Time: The time is current.
Location: The location is relevant to the query.

Trust
Trusted Author: The Twitter account has a reputation / following.
Trusted Avatar: The visual appearance cultivates trust.
Trusted Link: A link to a trustworthy, recognisable domain.

Links
Actionable Link: The user can perform a transaction by using the link (heavily dependent on trust).
Media Link: The link is to rich multimedia content.
Useful Link: The link provides valuable information content, e.g. authoritative information, educated reviews.

Meta Tweet
ReTweeted Lots: It's information that others have passed on a lot.
Conversation: It's part of a series of tweets, and they all need to be useful.

Not Useful

Tweet Content
No Information: Absence of anything, event, factual points.
Introspective: Personal content and personal thoughts for no social benefit.
Off Topic: Result not related to the query given / TF-IDF irrelevant.
Too Technical: The content requires specific domain knowledge the reader doesn’t possess.
Poorly Constructed: Tweets that may have grammatical / spelling errors, or malformed URLs.

Bad Tweets
SPAM: Irrelevant or inappropriate messages.
Wrong Language: Messages sent in a language foreign to the reader.
Dead Link: A URL which does not work, i.e. a 404.

Not Relevant
Time: Out of date content.
Location: Wrong geographic location.

Trust
Un-trusted Author: An author the reader feels uneasy about or suspicious of.
Un-trusted Link: A link the reader feels is suspicious.

Subjective
Perspective Oriented: A tweet that is perspective centric, meaning the author is providing their view or projecting an attitude on a subject matter or to a subject / reader.
Disagree with Tweet: A conflict of agreement between the reader and the author.
Not Funny: A tweet that is aimed to be humorous, which the reader does not feel is humorous.

Meta Tweet
QnA: Part of a conversation; the reader desires the whole conversation, not just the question or the answer.
Repeated: Content the reader has seen before.
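The two sets of codes above can be written down as a small data structure, for instance when tallying which categories come up for a given tweet. The sketch below is only one possible encoding: the names `USEFUL`, `NOT_USEFUL`, `count_features` and `categories_for` are hypothetical, the feature names are taken from the slides (with spelling normalised), and nothing here is the authors' own tooling.

```python
from collections import Counter

# Codes reported as making a tweet useful, grouped by category (from the slides).
USEFUL = {
    "In Tweet Content": ["Experience", "Direct Recommendation",
                         "Social Knowledge", "Specific Information"],
    "Reflection on Tweet": ["Entertaining", "Shared Sentiment"],
    "Relevant": ["Time", "Location"],
    "Trust": ["Trusted Author", "Trusted Avatar", "Trusted Link"],
    "Links": ["Actionable Link", "Media Link", "Useful Link"],
    "Meta Tweet": ["ReTweeted Lots", "Conversation"],
}

# Codes reported as making a tweet not useful, grouped by category.
NOT_USEFUL = {
    "Tweet Content": ["No Information", "Introspective", "Off Topic",
                      "Too Technical", "Poorly Constructed"],
    "Bad Tweets": ["SPAM", "Wrong Language", "Dead Link"],
    "Not Relevant": ["Time", "Location"],
    "Trust": ["Un-trusted Author", "Un-trusted Link"],
    "Subjective": ["Perspective Oriented", "Disagree with Tweet", "Not Funny"],
    "Meta Tweet": ["QnA", "Repeated"],
}


def count_features(scheme):
    """Total number of codes across all categories of a scheme."""
    return sum(len(features) for features in scheme.values())


def categories_for(codes, scheme):
    """Tally how many of the given codes fall into each category of a scheme."""
    lookup = {feature: category
              for category, features in scheme.items()
              for feature in features}
    unknown = [code for code in codes if code not in lookup]
    if unknown:
        raise KeyError(f"codes not in scheme: {unknown}")
    return Counter(lookup[code] for code in codes)


# Hypothetical example: a tweet a participant coded with three 'useful' features.
example = categories_for(["Experience", "Trusted Link", "Time"], USEFUL)
```

Keeping the two schemes as separate dictionaries matters because some code names ("Time", "Location") and one category name ("Trust") appear on both sides with opposite meanings.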

Interesting finds: Insights

http://www.flickr.com/photos/foxmulderven/3063598624

Where could we see the impact of this work? The Possible Impact

A work in progress: Search System

So just remember. Conclusions

Thank you for Listening

Jonathan Hurlock (@jonhurlock)

Max L. Wilson (@gingdottwit)

Like the talk? Then please tweet it, by quickly visiting: http://moourl.com/LikedTheTalk
