presented at Stanford Open Source lab unconference and Recent Changes Camp 2008
2007-06-17 Ed H. Chi - Who writes Wikipedia?
Image from: http://www.flickr.com/photos/ourcommon/480538715/
Augmented Social Cognition: Who Edits Wikipedia?
Ed H. Chi
Augmented Social Cognition Area
Palo Alto Research Center
Wikipedia
High end of the collaboration spectrum
Groups use systems to make sense of and share complex topics and materials.
• Wikipedia (social status)
• Slashdot (karma points)
• eHow.com
• Lostpedia.com
Middle of the spectrum
Systems that evolve structures that can be used to organize information.
• Del.icio.us
• Flickr
• YouTube
• Friendster
Lightweight social processes
Counting votes
– A way to increase signal-to-noise ratio
– Information faddishness
Examples:
– Digg.com
– Most bookmarked items on del.icio.us
– Estimating the weight of an ox or temperature of a room
– The true value of a stock
– PageRank or Hub / Authority algorithms
Layers of Models Needed
[Diagram: three layers of systems, ordered toward heavier collaboration]
Voting systems: Digg.com, PageRank
Col. Information Structures: Del.icio.us, IBM dogear, Flickr
Collaborative Creation: Wikipedia, Slashdot, eHow.com
Understanding of micro-economics
• of foraging [PARC]
• Personal vs. group [Huberman, Adamic]
• Wisdom of Crowds [Surowiecki]
• Information cascades [Anderson and Holt]
Understanding of conflicts and coordination
• Wikipedia coordination costs [PARC]
• Invisible Colleges [Sandstrom]
• Interference effects [Pirolli]
• Co-laboratories [Olson and Olson]
• Community networks / Col. Problem solving [Carroll]
Understanding of info and social networks
• Tag network analysis [PARC, Golder, Yahoo]
• Structural holes (info brokerage) [Burt]
• Network constraints and structure [various]
• Semantics of semiotic structures / words [IR, LSA]
Research Vision
Augmented Social Cognition
• Cognition: the ability to remember, think, and reason; the faculty of knowing.
• Social Cognition: the ability of a group to remember, think, and reason; the construction of knowledge structures by a group.
– (not quite the same as in the branch of psychology that studies the cognitive processes involved in social interaction, though included)
• Augmented Social Cognition: Supported by systems, the enhancement of the ability of a group to remember, think, and reason; the system-supported construction of knowledge structures by a group.
The first step in solving any interesting problem is to get some paper and pencil.
John Tukey
(not a direct quote)
Increasing Coordination Cost in Wikipedia
(joint work with Niki Kittur, Bongwon Suh, Bryan Pendleton)
Published in CHI2007 conference: Aniket Kittur, Bongwon Suh, Bryan Pendleton, Ed H. Chi. He Says, She Says: Conflict and Coordination in Wikipedia. In Proc. of ACM Conference on Human Factors in Computing Systems (CHI2007), pp. 453--462, April 2007. ACM Press. San Jose, CA.
What is Wikipedia?
“Wikipedia is the best thing ever. Anyone in the world can write anything they want about any subject, so you know you’re getting the best possible information.” – Steve Carell, The Office
Increasing Coordination Costs in Wikipedia
Understanding coordination costs is vital for the long-term viability of collaborative information environments
Data:
– Entire dump on July 2, 2006
– 58 million revisions
– 4.7 million wiki pages
– 2.4 million article pages
– 800 gigabytes
Less direct work
Decrease in proportion of edits to article pages
[Figure: edit proportion (axis 0.5 to 1) by year, 2001 to 2006; callout: 70%]
More indirect work
Increase in proportion of edits to user talk
[Figure: edit proportion (axis 0 to 0.2) by year, 2001 to 2006; callout: 8%]
More indirect work
Increase in proportion of edits to user talk
Increase in proportion of edits to procedure
[Figure: edit proportion (axis 0 to 0.2) by year, 2001 to 2006; callout: 11%]
More maintenance work
Increase in proportion of edits that are reverts
[Figure: edit proportion (axis 0 to 0.2) by year, 2001 to 2006; callout: 7%]
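The slides do not say how reverts were identified. One common approach, assumed here purely for illustration, is to treat a revision as a revert when its full text is identical to some earlier, non-adjacent revision of the same page. The function name and the toy history below are hypothetical.

```python
import hashlib

def find_identity_reverts(revision_texts):
    """Return indices of revisions whose text exactly matches an earlier
    revision, skipping the case where nothing changed since the
    immediately preceding revision."""
    seen = set()          # content hashes observed so far
    reverts = []
    prev_digest = None
    for i, text in enumerate(revision_texts):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen and digest != prev_digest:
            reverts.append(i)
        seen.add(digest)
        prev_digest = digest
    return reverts

# Hypothetical page history: revision 3 restores the text of revision 1.
history = ["A", "A B", "A vandal", "A B", "A B C"]
reverts = find_identity_reverts(history)
revert_proportion = len(reverts) / len(history)
print(reverts, revert_proportion)
```

Hashing keeps the comparison cheap on long histories; it only catches exact restorations, so partial reverts would need a different method.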
More wasted work
Increase in proportion of edits that are reverts
Increase in proportion of edits reverting vandalism
[Figure: proportion of edits (axis 0 to 0.03) by year, 2001 to 2005; callout: 1-2%]
Global level
Conflict and coordination costs are growing
– Less direct work (articles)
+ More indirect work (article talk, user, procedure)
+ More maintenance work (reverts, vandalism)
[Figure: percentage of total edits (axis 60% to 100%) by year, 2001 to 2006; series: Article, User, Article Talk, User Talk, Other, Maintenance]
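The per-year breakdown charted here can be sketched as a simple tally: count edits by page type within each year, then divide by that year's total. The input format, the page-type labels, and the toy numbers are assumptions for illustration, not the 2006 dump analyzed in the talk.

```python
from collections import Counter, defaultdict

def edit_share_by_year(edits):
    """Map year -> {page_type: fraction of that year's edits}.

    `edits` is an iterable of (year, page_type) pairs, one per edit.
    """
    counts = defaultdict(Counter)
    for year, page_type in edits:
        counts[year][page_type] += 1
    shares = {}
    for year, by_type in counts.items():
        total = sum(by_type.values())
        shares[year] = {t: n / total for t, n in by_type.items()}
    return shares

# Hypothetical toy log (made-up values).
toy_edits = (
    [(2001, "article")] * 9 + [(2001, "article_talk")] * 1
    + [(2006, "article")] * 7 + [(2006, "user_talk")] * 1
    + [(2006, "article_talk")] * 2
)
shares = edit_share_by_year(toy_edits)
print(shares[2001]["article"], shares[2006]["article"])
```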
Conflict at the article level
• Conflict is growing at the global level
• We have some idea about where it is
• But what defines conflict at the local level?
• Build a characterization model of article conflict
– Identify metrics relevant to conflict
– Automatically identify high-conflict articles
Measure of controversy
• “Controversial” tag
• Use # revisions tagged controversial
Page metrics
Possible metrics for identifying conflict in articles

Metric type                       Page type
Revisions (#)                     Article, talk, article/talk
Page length                       Article, talk, article/talk
Unique editors                    Article, talk, article/talk
Unique editors / revisions        Article, talk
Links from other articles         Article, talk
Links to other articles           Article, talk
Anonymous edits (#, %)            Article, talk
Administrator edits (#, %)        Article, talk
Minor edits (#, %)                Article, talk
Reverts (#, by unique editors)    Article
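Several of the metrics in the table can be derived directly from a page's revision log. The sketch below is only an illustration: the `Revision` record and its field names are hypothetical, and it covers just the count-based metrics (revisions, unique editors, anonymous and minor edits).

```python
from dataclasses import dataclass

@dataclass
class Revision:
    editor: str       # user name, or an IP address for anonymous edits
    anonymous: bool   # True when the edit was made without logging in
    minor: bool       # True when the edit was flagged as minor

def page_metrics(revisions):
    """Compute count-based conflict metrics for one page's revision list."""
    n = len(revisions)
    if n == 0:
        raise ValueError("page has no revisions")
    unique = len({r.editor for r in revisions})
    anon = sum(r.anonymous for r in revisions)
    minor = sum(r.minor for r in revisions)
    return {
        "revisions": n,
        "unique_editors": unique,
        "unique_editors_per_revision": unique / n,
        "anonymous_edits": anon,
        "anonymous_pct": 100 * anon / n,
        "minor_edits": minor,
        "minor_pct": 100 * minor / n,
    }

# Hypothetical revision log for a single page.
toy = [
    Revision("alice", False, False),
    Revision("10.0.0.1", True, False),
    Revision("alice", False, True),
    Revision("bob", False, True),
]
metrics = page_metrics(toy)
print(metrics)
```

Running the same function over an article and over its talk page would give the per-page-type variants listed in the table.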
Performance: Cross-validation
5x cross-validation, R² = 0.897
[Figure: scatter plot, both axes 0 to 10000; horizontal axis: predicted controversial revisions]
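To make the evaluation idea concrete, here is a minimal sketch of k-fold cross-validation with R² computed on the pooled held-out predictions. It swaps in a one-feature least-squares line as the model, which is an assumption made to keep the example short and is not the model used in the talk; the data is synthetic.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def cross_validated_r2(xs, ys, k=5):
    """Predict each point from a model trained on the other folds,
    then compute R^2 over all held-out predictions."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        a, b = fit_line([xs[i] for i in train_idx],
                        [ys[i] for i in train_idx])
        for i in test_idx:
            preds[i] = a + b * xs[i]
    mean_y = sum(ys) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Synthetic data: a linear trend plus small alternating noise.
xs = list(range(20))
ys = [3 * x + 2 + (1 if x % 2 else -1) for x in xs]
r2 = cross_validated_r2(xs, ys, k=5)
print(round(r2, 3))
```

Because every prediction comes from a model that never saw that point, the resulting R² is a fairer estimate of generalization than a fit on the full data.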
Determinants of conflict
Highly weighted features of conflict model:
• Revisions (talk)
• Minor edits (talk)
• Unique editors (talk)
• Revisions (article)
• Unique editors (article)
• Anonymous edits (talk)
• Anonymous edits (article)
Model Generalization and Validation survey
• Applied model to untagged articles (100+ edits)
• Sampled range of predicted conflict scores
• Rated by expert Wikipedians
• Significantly correlated with predicted scores
– By rank correlation, p < 0.013 (Spearman’s rho)
• Validates characterization model
– Detects conflicts even for articles with no ground truth
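The rank correlation mentioned above can be sketched as follows: Spearman's rho is the Pearson correlation of the two rank-transformed lists, with tied values sharing their average rank. The predicted scores and expert ratings below are invented for illustration, and the sketch computes only rho, not the p-value.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical model scores and expert ratings for five articles.
predicted = [0.1, 0.4, 0.35, 0.8, 0.9]
expert = [1, 2, 3, 5, 4]
rho = spearman_rho(predicted, expert)
print(round(rho, 3))
```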
Who edits Wikipedia?
% of edits made by administrators
% of edits by 10k+ editors
Word changes made by admins
Shifting user population in Wikipedia (more and more bottom driven!)
Proportion of edits made by top editors in Wikipedia
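As a rough illustration of this measure, the sketch below ranks editors by edit count and reports the fraction of all edits made by the top N percent. How "top editors" was defined in the talk is not stated on the slide; the percentage cutoff, the function name, and the toy counts are assumptions.

```python
def top_editor_share(edit_counts, top_percent=1):
    """Fraction of all edits made by the most active `top_percent` percent
    of editors. `edit_counts` maps editor -> number of edits. At least one
    editor is always included in the top group."""
    counts = sorted(edit_counts.values(), reverse=True)
    if not counts:
        raise ValueError("no editors")
    k = max(1, len(counts) * top_percent // 100)
    return sum(counts[:k]) / sum(counts)

# Hypothetical population: 99 one-edit users and one heavy contributor.
toy_counts = {f"user{i}": 1 for i in range(99)}
toy_counts["power_user"] = 101
share = top_editor_share(toy_counts, top_percent=1)
print(share)
```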
Long tail of participation in Wikipedia
The participation architecture is a power law
Only 60% of top 1% editors stay around month to month!
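One way to read "stay around" is that an editor in the top group one month is in the top group again the next month. That reading is an assumption, as are the function names and the toy monthly counts in this sketch.

```python
def top_editors(edit_counts, top_percent=1):
    """Set of the most active `top_percent` percent of editors (at least one).
    Ties are broken alphabetically so the result is deterministic."""
    ranked = sorted(edit_counts, key=lambda e: (-edit_counts[e], e))
    k = max(1, len(ranked) * top_percent // 100)
    return set(ranked[:k])

def retention(prev_month, this_month, top_percent=1):
    """Share of last month's top editors who are top editors again this month."""
    before = top_editors(prev_month, top_percent)
    after = top_editors(this_month, top_percent)
    return len(before & after) / len(before)

# Hypothetical monthly edit counts for ten editors; top 50% used so the
# toy group has more than one member.
jan = {"a": 10, "b": 9, "c": 8, "d": 7, "e": 6,
       "f": 5, "g": 4, "h": 3, "i": 2, "j": 1}
feb = {"a": 10, "b": 9, "c": 8, "f": 7, "g": 6,
       "d": 1, "e": 1, "h": 3, "i": 2, "j": 1}
kept = retention(jan, feb, top_percent=50)
print(kept)
```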
Living Laboratory: Prototyping Social Applications on the Internet
Create a Living Laboratory as a platform to develop, test, and market our innovations, and as a vehicle for creating collaborations and thought leadership.
WikiDashboard
Joint work with Bongwon Suh, Aniket Kittur, Bryan Pendleton
Risks for Using Wikipedia
• Factual accuracy
• Motives of editors
• Uncertain expertise
• Volatility
• Spotty coverage
• Unproven/non-independent source
[Denning et al. 2005]
Social Dashboard
• Social translucence for effective communication and collaboration
– Make socially significant information visible and salient
– Support awareness of the rules and constraints
– Accountability for actions
• Wikis can be a prime candidate
– Every edit is logged and retrievable
– WikiScanner.com
– WikiRage.com
– Intellipedia
[Erickson and Kellogg 2002]
WikiDashboard
Surfacing hidden social context to users
• For readers
– Any incidents in the past, e.g. a sudden burst of edits?
– Who are the editors?
– What are their motivations / points of view / expertise / topics of interest?
– Help them judge the quality / trustworthiness / usefulness of an article
• For writers
– Measure expertise / contribution / reputation
– Motivate them to be more active / responsible (?)
Article Dashboard
User Dashboard
Drilling Down
• List of every edit that a user made
• Let readers examine each individual revision for validity, which is hard to accomplish when only provided with aggregate visual summaries.
Image from: http://www.flickr.com/photos/ourcommon/480538715/
Augmented Social Cognition: From Social Foraging to Social Sensemaking
Research Vision: Understand how social computing systems enhance the ability of a group of people to remember, think, and reason.
Living Laboratory: Create breakthrough applications that harness collective intelligence to improve knowledge capture, transfer, and discovery.