The War on Attention Poverty: Measuring Twitter Authority
Daniel Tunkelang
Google
http://www.wvculture.org/history/thisdayinwvhistory/0424.html
Disclaimers
Much of the material in this presentation is work done prior to my employment at Google.
Google is not, to the best of my knowledge, using TunkRank.
Any opinions expressed are my own, and do not represent Google's official positions.
Executive Summary
Authority requires scarcity.
http://www.southparkstudios.com/
http://en.wikipedia.org/wiki/Diamond
Overview
Aboutness and Authority
Social Networks 101
Measuring Twitter Authority
TunkRank
Aboutness and Authority
http://www.ncgenealogy.org/blogs/ngs2009/2009_04_01_archive.html
http://www.clker.com/clipart-2406.html
Information Retrieval: Pre-Web
http://archimedes.fas.harvard.edu/presentations/2002-03-09/img13.html
Information Retrieval: Web
http://blogoscoped.com/archive/2007-01-11-n25.html
How Authority Matters for IR
Promoting official content
Demoting spam
Ranking everything in between
http://whitehouse.org/
Social Networking Sites
2003: goes live
2010: claims 400M+ users
Global Alexa Top 30 also include:
Social Networks = Information Feeds
Social Information Overload!
http://loiclemeur.com/english/2007/06/im-overload.html
What's a Friend?
Bands of Reduced Attention
http://bhc3.wordpress.com/2009/02/25/the-serendipity-of-attention/
Asymmetric Follower Model
http://www.engineeringdaily.net/brain-game-weighing-24-coins/
Follower Count as Status
http://www.southparkstudios.com/
Follower Count as Authority?
http://loiclemeur.com/english/2008/12/twitter-we-need-search-by-authority.html
http://twithority.com/
Buy Followers...on eBay!
Exploit Norm of Reciprocity
72% of users follow at least 80% of their followers.
80% of users have at least 80% of their friends as followers.
TwitterRank: finding topic-sensitive influential twitterers. [Weng et al, WSDM 2010]
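The reciprocity figures above depend on how "follows back" is counted. As a minimal sketch, assuming the follow graph is available as a dictionary mapping each user to the set of accounts they follow (the function names and the toy data are hypothetical, not taken from the cited paper), the two quantities could be computed like this:

```python
def follow_back_ratio(user, following):
    """Fraction of `user`'s followers that `user` follows back.

    `following` maps each user to the set of accounts they follow.
    Returns None when the user has no followers.
    """
    followers = {u for u, outs in following.items() if user in outs}
    if not followers:
        return None
    return len(followers & following.get(user, set())) / len(followers)


def share_of_reciprocators(following, threshold=0.8):
    """Share of users (among those with followers) whose follow-back
    ratio is at least `threshold`."""
    ratios = [follow_back_ratio(u, following) for u in following]
    ratios = [r for r in ratios if r is not None]
    if not ratios:
        return 0.0
    return sum(r >= threshold for r in ratios) / len(ratios)


# Toy graph (made-up data): alice follows back two of her three followers.
following = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"alice"},
    "dave": {"alice"},
}
alice_ratio = follow_back_ratio("alice", following)
reciprocator_share = share_of_reciprocators(following)
```

On real data the same computation would need an index from users to followers rather than a scan over the whole dictionary.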
Do Actions Speak Louder?
influence = potential of an action of a user to initiate a further action by another user
The Influentials: New Approaches for Analyzing Influence on Twitter [Leavitt et al, 2009]
Dan Zarrella's ReTweetability Metric:
Gaming Retweet Count
Create two users. Tweet. Retweet. Repeat.
Retweet counts are low: less than 2% of tweets
State of the Twittersphere [Zarrella, June 2009]
Twitter cyborgs already produce retweet spam
Twitter Cyborgs [Mowbray and Andrade, 2010]
Actions can be (and are) Faked
What Should We Measure?
in an information-rich world, the wealth of information means... a scarcity of whatever it is that information consumes... the attention of its recipients.
Designing Organizations for an Information-Rich World [Herbert Simon, 1971]
Introducing...TunkRank!
Demo
http://tunkrank.com/
Retweet Decision Model
Simple Recurrence
Measures expected propagation of tweet from X
p_notice = total attention user devotes to Twitter
p_retweet = probability that user retweets
Note Following(Y) in denominator!
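The recurrence itself was an image on the slide and did not survive extraction. The sketch below assumes the commonly cited form of TunkRank, Influence(X) = sum over Y in Followers(X) of (1 + p * Influence(Y)) / |Following(Y)|, with the notice and retweet probabilities collapsed into a single constant p. That form, the value p = 0.05, the fixed iteration count, and all names in the code are assumptions for illustration, not a reproduction of the tunkrank.com implementation.

```python
def tunkrank(following, p=0.05, iterations=50):
    """Iterate the assumed TunkRank recurrence on a follow graph.

    `following` maps each user to the set of accounts they follow.
    `p` is an assumed constant retweet probability.
    """
    # Collect every user that appears as a follower or a followee.
    users = set(following)
    for outs in following.values():
        users |= outs

    # Invert the graph: followers[x] is the set of users who follow x.
    followers = {u: set() for u in users}
    for y, outs in following.items():
        for x in outs:
            followers[x].add(y)

    influence = {u: 0.0 for u in users}
    for _ in range(iterations):
        # Each follower y splits attention evenly over everyone y follows,
        # hence the len(following[y]) in the denominator.
        influence = {
            x: sum((1 + p * influence[y]) / len(following[y])
                   for y in followers[x])
            for x in users
        }
    return influence


# Toy graph (made-up data): a follows only c; b follows c and d.
scores = tunkrank({"a": {"c"}, "b": {"c", "d"}})
```

In the toy graph, c receives a's full attention and half of b's, while d receives only the other half of b's.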
Discourages Exploiting Reciprocity
Indiscriminate followers who follow many users make low contributions to TunkRank.
Consistent with idea that influence correlates to high follower-friend ratio.
But TunkRank only considers user's followers, not user's friends.
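To make the reciprocity point concrete, here is a small worked example. It assumes the commonly cited TunkRank term for one follower Y, (1 + p * Influence(Y)) / |Following(Y)|, and uses made-up numbers. The function name and p = 0.05 are hypothetical.

```python
def follower_contribution(follower_influence, following_count, p=0.05):
    """Assumed contribution of one follower to a user's TunkRank:
    (1 + p * Influence(Y)) / |Following(Y)|."""
    return (1 + p * follower_influence) / following_count


# Two followers with the same influence (made-up value of 10.0):
# one follows 20 accounts, the other follows 2000.
selective = follower_contribution(10.0, 20)
indiscriminate = follower_contribution(10.0, 2000)
ratio = selective / indiscriminate
```

Under these assumptions the follower who follows 100 times as many accounts contributes 100 times less, so collecting follow-backs from indiscriminate followers adds little.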
TunkRank Pros and Cons
Based entirely on follower graph. Ignores retweets, etc.
Resists manipulation.
Uniformly distributes attention among followers. Distribution is probably a power law.
But fake follow data is hidden.
Bug or a feature?
Press
http://techcrunch.com/2010/06/16/barackobama-techcrunch-twitter-followers/
http://blogs.forbes.com/firewall/2010/07/09/a-better-way-to-filter-twitters-spambots-ask-google/
Research
TwitterRank: finding topic-sensitive influential twitterers. [Weng et al, 2010]
Overcoming Spammers in Twitter: A Tale of Five Algorithms [Gayo-Avello and Brenes, 2010]
Nepotistic Relationships in Twitter and their Impact on Rank Prestige Algorithms [Gayo-Avello, 2010]
Go TunkRank! [Gayo-Avello, 2010]
similar to PageRank but better vs. cheating
aggressive marketers almost indistinguishable from common users
spammers grab small amount of global available prestige
agrees with PageRank for top-ranked users
simple, induces plausible rankings, severely penalizes spammers compared to PageRank
Room for Improvement
Still can be gamed through fake users.
Multiply by follow cost?
Consider user actions?
Topic-sensitivity?
Non-uniform distribution?
Tradeoff of simplicity vs. realism.
http://followcost.com/
Conclusion
Web IR is unthinkable without modeling attention scarcity.
Social networks are new and increasingly important information feeds.
We need measures to mitigate social information overload.
TunkRank is a promising proof-of-concept.
Thank you!
...and thanks to Jason Adams for developing and maintaining the http://tunkrank.com site!
Questions?
Email: [email protected]
Twitter: @dtunkelang
Blog: http://thenoisychannel.com/