Quantitative Analysis of User-Generated Content on the Web

Xavier Ochoa, ESPOL, Ecuador
Erik Duval, KULeuven, Belgium


Presentation given at the Web Science Workshop of the World Wide Web Conference 2008. It reports the results of measuring user contribution to nine UGC websites: Furl, Digg, Slideshare, FanFiction, Scribd, Revver, Merlot, Amazon Reviews, and LibraryThing.


Page 1: Quantitative Analysis of User-Generated Content on the Web

Quantitative Analysis of User-Generated Content on the Web

Xavier Ochoa, ESPOL, Ecuador
Erik Duval, KULeuven, Belgium

Page 2

Topics

• Why?

• Studies

• Findings

• Implications of the Findings

• Conclusions

• Further Work

Page 3

Why?

• UGC economy:

– Supply: Users publishing their content

– Demand: Users viewing content from others

– Currency: Attention

Page 4

Why?

• Demand (Popularity) is relatively well understood: “How a ‘hit’ is born” (S. Sinha, R. K. Pan, 2006)

• But Supply (Publication) is not.

Page 5

Studies

Page 6

Studies

1. Descriptive Statistics

2. Distribution Fitting

3. Concentration Analysis
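
As a rough illustration of the first of these analyses, the sketch below summarizes a list of per-user contribution counts. The `counts` values are made up for illustration and are not data from the study; `describe` is a hypothetical helper name.

```python
import statistics

def describe(counts):
    """Return basic descriptive statistics for per-user contribution counts."""
    return {
        "users": len(counts),
        "total": sum(counts),
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "stdev": statistics.pstdev(counts),  # population standard deviation
        "max": max(counts),
    }

# Hypothetical data: number of items published by each of ten users.
counts = [1, 1, 1, 1, 2, 2, 3, 5, 14, 70]
summary = describe(counts)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A mean far above the median is a first hint that a few users account for most of the content.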

Page 7

Findings

• Distribution of supply is not Normal
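
One way to check a claim like this on your own data is the Jarque-Bera statistic, which grows with skewness and excess kurtosis. This is only a sketch: the slides do not say which normality test was used, the `counts` below are invented, and the chi-squared threshold is asymptotic, so it is unreliable for small samples.

```python
def jarque_bera(values):
    """Jarque-Bera statistic n/6 * (S^2 + K^2/4), where S is the sample
    skewness and K the excess kurtosis (both from biased central moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)

# 5% critical value of the chi-squared distribution with 2 degrees of freedom.
CHI2_2DF_5PCT = 5.991

# Hypothetical per-user contribution counts: many small, a few very large.
counts = [1] * 60 + [2] * 20 + [3] * 10 + [5] * 5 + [20] * 3 + [100, 250]
jb = jarque_bera(counts)
print(f"JB = {jb:.1f}, reject normality at 5%: {jb > CHI2_2DF_5PCT}")
```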

Page 8

Findings

• Distribution of supply has a heavy tail

Page 9

Findings

Lotka (“fat-tail”) Weibull (“fat-belly”)
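
Lotka's law states that the number of contributors producing n items is proportional to n^(-alpha). The sketch below estimates alpha by maximum likelihood with a simple grid search. It is not necessarily the fitting procedure used in the study; the helper names, grid bounds, and synthetic sample are all assumptions made for illustration.

```python
import math
import random

def zeta(alpha, terms=2000):
    """Approximate the Riemann zeta function for alpha > 1: a direct sum
    over the first `terms` integers plus an integral estimate of the tail."""
    head = sum(k ** -alpha for k in range(1, terms + 1))
    tail = (terms + 0.5) ** (1.0 - alpha) / (alpha - 1.0)
    return head + tail

def fit_lotka(counts, lo=1.05, hi=4.0, step=0.01):
    """Grid-search the exponent alpha that maximizes the log-likelihood of
    P(n) = n**-alpha / zeta(alpha) for integer counts n >= 1."""
    n = len(counts)
    sum_log = sum(math.log(c) for c in counts)
    best_alpha, best_ll = None, -math.inf
    for i in range(int(round((hi - lo) / step)) + 1):
        alpha = lo + i * step
        ll = -n * math.log(zeta(alpha)) - alpha * sum_log
        if ll > best_ll:
            best_alpha, best_ll = alpha, ll
    return best_alpha

# Synthetic check: draw counts from a (truncated) power law with alpha = 2.
rng = random.Random(42)
support = range(1, 100001)
weights = [k ** -2.0 for k in support]
sample = rng.choices(support, weights=weights, k=5000)
alpha_hat = fit_lotka(sample)
print(f"estimated alpha: {alpha_hat:.2f}")
```

A grid search keeps the sketch dependency-free; a numerical optimizer would be the more usual choice on real data.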

Page 10

Implications of the Findings

There is no such thing as an “average user”

Page 11

Low Class

Middle Class

High Class

Page 12

Implications of the Findings

The production of different UGC types is similar, but not the same.

Page 13

Implications of the Findings

The Pareto Rule (80/20) applies to UGC

(but it is no substitute for measuring)
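
Measuring the actual split is straightforward once per-user counts are available. The sketch below computes the share of content produced by the top fraction of contributors; `top_share` is a hypothetical helper and the `counts` are invented, not data from the study.

```python
def top_share(counts, top_fraction=0.2):
    """Fraction of all items produced by the most active `top_fraction`
    of users (rounded to a whole number of users, at least one)."""
    ranked = sorted(counts, reverse=True)
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)

# Hypothetical data: items published by each of ten users.
counts = [50, 30, 5, 5, 4, 2, 2, 1, 1, 0]
share = top_share(counts)
print(f"top 20% of users produced {share:.0%} of the content")
```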

Page 14

Implications of the Findings

“Fat-tail” UGC production is similar to professional production.

Page 15

Implications of the Findings

The distribution is not affected by site size or production effort

Page 16

Implications of the Findings

Place your bet: head or tail?

Page 17

50% of Content is generated here

Page 18

50% of Content is generated here
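
To locate where half of the content comes from on a given site, one can count how many of the most active users are needed to reach 50% of all items. A minimal sketch, assuming a plain list of per-user counts; `head_size` and the sample data are hypothetical.

```python
def head_size(counts, share=0.5):
    """Smallest number of top contributors whose items add up to at
    least `share` of the total."""
    ranked = sorted(counts, reverse=True)
    target = share * sum(ranked)
    running = 0
    for rank, items in enumerate(ranked, start=1):
        running += items
        if running >= target:
            return rank
    return len(ranked)

# Hypothetical data: items published by each of ten users.
counts = [40, 25, 10, 8, 6, 4, 3, 2, 1, 1]
head = head_size(counts)
print(f"{head} of {len(counts)} users produce half of the content")
```

Whether that half sits in a handful of prolific users or is spread across many occasional ones is the "head or tail" question for the site being measured.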

Page 19

Implications of the Findings

Informetrics can help us to understand UGC production

(and vice versa)

Page 20

Conclusions

• Measuring is the only way to test our hypotheses about how the Web works

• If you administer a UGC-based site, measure production to gain insight into the other (supply) side of your economy

• Inequality of contribution in UGC is real and should be addressed in all its variations.

Page 21

Further Work

• Modeling Production of UGC

• Integrate UGC inside the Informetrics / Scientometrics / Webometrics framework

• Expand the data collection and analysis

– Measure growth (size and contributors)

– Measure production rate

– Use at least 3 examples for each type of UGC

Page 22

Thank you (xie xie). Questions?

Xavier Ochoa – [email protected]
Erik Duval – [email protected]