Presentation given at the 4th Research Methods Festival, July 5th 2010, St Catherine's College, Oxford
SIGSNA: Special Interest Group on Social Network Analysis
Luca Rossi - Fabio Giglietto
University of Urbino “Carlo Bo”
persistence / easy to search / scalability / easy to replicate (boyd 2007)
Background: Growing availability of User Generated Content
High research value of spontaneously produced content.
few:
- persistence: writings
- scalability: printing press, newspapers
- replicability: digital media (PC, video cameras)
- searchability: World Wide Web + Google (Google Book Search)
many:
- persistence: writings
- scalability: personal online publishing / Web 2.0 (blogs, Flickr, YouTube)
- replicability: digital media (PC, video cameras)
- searchability: World Wide Web + Google (Google Blog Search)
The Big (online) Data: New opportunities
- high value of UGC
- huge amount of spontaneous data
- large variety of topics
- worldwide phenomenon (comparative analysis)
The Big Data: New methodological problems
- getting the data
- storing the data
- querying the data
- analysing the data
All of these call for an interdisciplinary approach.
© Flickr.com / GeekMomHeather
Getting the data: RSS feeds (content produced) or APIs (user info).
Last.FM, Twitter, Flickr, Digg, Netlog, YouTube, MySpace…
Contacts, status, profile, TopUsed…
- Legal/ethical issues
- Terms of use
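The RSS route mentioned above can be sketched in a few lines of Python. This is only an illustration using the standard library: the sample feed, its URLs and the function name `parse_rss_items` are assumptions, not the harvester used in SIGSNA. A real collector would download the feed text from a service's public feed, within that service's terms of use.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feed, inlined so the sketch runs without network access.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example public feed</title>
    <item>
      <title>First post</title>
      <link>http://example.org/posts/1</link>
      <pubDate>Tue, 08 Sep 2009 13:57:00 +0000</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.org/posts/2</link>
      <pubDate>Tue, 08 Sep 2009 14:05:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return one dict per RSS <item>, with title, link and publication time."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        pub = item.findtext("pubDate")
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS 2.0 dates follow the e-mail date format (RFC 822 style).
            "published": parsedate_to_datetime(pub) if pub else None,
        })
    return items


if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_RSS):
        print(entry["published"], entry["title"], entry["link"])
```

Keeping the publication time as a proper datetime matters later on, since propagation analysis relies on ordering entries by timestamp.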
Storing the data:
SIGSNA (two weeks of FriendFeed public data)
≃ 10,500,000 posts (2 GB of text data)
≃ 500,000 likes
≃ 450,000 users
≃ 15 million subscriptions
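To make the storage step concrete, here is a minimal relational sketch using Python's built-in `sqlite3`. The slides only say that a relational database was used; the table names, columns and demo rows below are hypothetical and chosen to mirror the four kinds of records listed above (posts, likes, users, subscriptions).

```python
import sqlite3

# Hypothetical schema: entries and comments share one table, and a comment
# points at its entry through parent_id.
SCHEMA = """
CREATE TABLE users (user_id TEXT PRIMARY KEY);
CREATE TABLE posts (
    post_id   TEXT PRIMARY KEY,
    user_id   TEXT NOT NULL REFERENCES users(user_id),
    parent_id TEXT REFERENCES posts(post_id),
    posted_at TEXT NOT NULL,
    body      TEXT NOT NULL
);
CREATE TABLE likes (
    user_id TEXT NOT NULL REFERENCES users(user_id),
    post_id TEXT NOT NULL REFERENCES posts(post_id),
    PRIMARY KEY (user_id, post_id)
);
CREATE TABLE subscriptions (
    follower_id TEXT NOT NULL REFERENCES users(user_id),
    followed_id TEXT NOT NULL REFERENCES users(user_id),
    PRIMARY KEY (follower_id, followed_id)
);
CREATE INDEX idx_posts_time ON posts(posted_at);
"""


def build_demo_db():
    """Create an in-memory database holding a handful of made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO users VALUES (?)", [("anna",), ("bruno",)])
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?, ?)", [
        ("p1", "anna", None, "2009-09-08T13:57:00", "breaking news"),
        ("p2", "bruno", "p1", "2009-09-08T13:59:00", "a comment"),
    ])
    conn.execute("INSERT INTO likes VALUES (?, ?)", ("bruno", "p1"))
    conn.execute("INSERT INTO subscriptions VALUES (?, ?)", ("bruno", "anna"))
    conn.commit()
    return conn


def comments_per_entry(conn):
    """Count comments for each entry that has at least one."""
    return conn.execute(
        "SELECT parent_id, COUNT(*) FROM posts "
        "WHERE parent_id IS NOT NULL GROUP BY parent_id"
    ).fetchall()


if __name__ == "__main__":
    print(comments_per_entry(build_demo_db()))
```

At the scale of millions of posts a server-based database would be the more likely choice; SQLite is used here only so that the sketch runs without any setup.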
© Flickr.com / amanderson2
From WOW20 to SIGSNA: Working with online user generated content for Sociological Research
                 WOW20 (2007)          SIGSNA (2009)
Social media     Blogs                 FriendFeed
Type of data     Public RSS feed       Public RSS feed
Database         Relational DB         Relational DB
Extras           Scraping techniques   Language identification
Amount of data   3,000 blog entries    10,454,195 FF posts*
* Entries and comments
Summary: data cleaning
Examples:
Heidi: 1974 anime based on Johanna Spyri’s novel
≠
Heidi: 1973 top model
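The Heidi example shows why a plain keyword match is not enough. One simple way to separate the two senses is to look at the words around the ambiguous name. The sketch below is an assumption for illustration only: the keyword sets and the `disambiguate` function are hypothetical and are not the cleaning procedure used in the project.

```python
import re

# Hypothetical context keywords for each sense of an ambiguous name.
CONTEXT = {
    "anime": {"anime", "cartoon", "spyri", "alps", "grandfather"},
    "model": {"model", "fashion", "runway", "catwalk"},
}


def disambiguate(text, senses=CONTEXT):
    """Pick the sense whose keywords overlap most with the text.

    Returns "unknown" when no keyword matches or when two senses tie,
    so that doubtful posts can be set aside for manual review.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {sense: len(words & keywords) for sense, keywords in senses.items()}
    best = max(scores, key=scores.get)
    top = scores[best]
    if top == 0 or list(scores.values()).count(top) > 1:
        return "unknown"
    return best


if __name__ == "__main__":
    print(disambiguate("Watching the Heidi anime again, love the Alps"))
    print(disambiguate("Heidi on the runway at fashion week"))
    print(disambiguate("Heidi!"))
```

Returning "unknown" instead of guessing keeps the cleaned data set conservative, at the cost of more posts to check by hand.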
Querying the data
Case study: SIGSNA research on breaking news propagation on FriendFeed
Mike Bongiorno (famous Italian TV host) died on Sept. 8, 2009. The news hit FriendFeed at 1:57 PM:
- First entry: > 130 comments
- All entries: > 585 comments
How does news propagate? What kinds of behaviour emerge?
Using timestamps and the network of followers, we were able to track the propagation paths and identify the major hubs.
[Figure panels: long propagation chains, short propagation chains, no propagation]
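The idea of combining timestamps with the follower network can be sketched as follows. The heuristic is an assumption made for illustration: each user's source is taken to be the followed user who posted about the news most recently before them. The function names and the toy data are hypothetical and do not reproduce the SIGSNA analysis.

```python
from collections import Counter


def infer_propagation(posts, follows):
    """Infer a likely source for each user who posted about the news.

    posts:   list of (user, timestamp) pairs.
    follows: dict mapping a user to the set of users they follow.
    Returns a dict user -> source user, or None when nobody the user
    follows had posted earlier.
    """
    first_seen = {}
    for user, ts in sorted(posts, key=lambda p: p[1]):
        first_seen.setdefault(user, ts)  # keep only each user's first post

    parent = {}
    for user, ts in first_seen.items():
        earlier = [
            (first_seen[u], u)
            for u in follows.get(user, ())
            if u in first_seen and first_seen[u] < ts
        ]
        # Assumed heuristic: the most recent earlier post among followed users.
        parent[user] = max(earlier)[1] if earlier else None
    return parent


def chain_length(user, parent):
    """Number of hops from a user back to the start of their chain."""
    length = 0
    while parent[user] is not None:
        user = parent[user]
        length += 1
    return length


def hubs(parent):
    """Count how many users each source passed the news on to."""
    return Counter(src for src in parent.values() if src is not None)


if __name__ == "__main__":
    demo_posts = [("a", 0), ("e", 3), ("b", 5), ("c", 7), ("d", 9)]
    demo_follows = {"b": {"a"}, "c": {"a", "b"}, "d": {"a"}, "e": set()}
    tree = infer_propagation(demo_posts, demo_follows)
    print(tree)
    print({u: chain_length(u, tree) for u in tree})
    print(hubs(tree).most_common())
```

In this toy data, user "e" has no source at all (no propagation), "b" and "d" sit one hop from "a" (short chains), and "c" sits two hops away, which corresponds to the three patterns named in the figure.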
Explicit news sharing is followed by chatting and discussion. This kind of activity contributes to news propagation.
“Bye Mike! We’re missing you! Bye grandpa Mike! Mike, you’ve been a milestone of our TV”
First entry has the highest informative function
Most commented entry is a long and articulated discussion
More info, papers and data: http://larica.uniurb.it/sigsna
SIGSNA is a joint research project with the Department of Computer Science of the University of Bologna (Dr. Matteo Magnani) and is partially funded by Telecom Italia.