Ripples on the Web: Diffusion of Activity Bursts across Hyperlink Networks in Wikipedia

Brian Keegan (@bkeegan)

Yu-Ru Lin (@rhodiuslin)

David Lazer (@davidlazer)

Sunbelt XXXIII

Hamburg, Germany

May 23, 2013

Theoretical motivations

• Information seeking and sense-making
  • What kinds of information does the general population seek following a disaster?
• Mass convergence and crisis informatics
  • What implications does rapidly emerging information have for emergency responders?
• Networks of knowledge and collaboration
  • How is this information verified and synthesized?

Twitter basically sucks

• Information seeking and sense-making
  • More noise & echo than signal, fragmented behavior & commons
• Mass convergence and crisis informatics
  • Sampling & temporal censoring
• Networks of knowledge and collaboration
  • Unverifiable, misinformation, non-cumulative

Wikipedia basically rules

• Information seeking and sense-making
  • Existing repertoires & activity around contextual information
• Mass convergence and crisis informatics
  • Fine-grained & accessible history
• Networks of knowledge and collaboration
  • Cited, debated, and cumulative account

Networks from Wikipedia data

• Markup
  • Hyperlinks: i has a link to j
• Revisions
  • Coauthorship: i shares an editor with j
• Pageview activity
  • Correlation: i's pageviews are correlated with j's

Case study

• Boston Marathon bombings
  • Two distinct dates for bursts of activity related to major developments:
    • April 15: Bombing
    • April 19: Manhunt
• New information → new articles bursting

Article dynamics – First 3 weeks

Dynamics – First 18 hours

Pageview dynamics

Pageview and editing coupling

HYPERLINK NETWORK

Types of networks

• Markup
  • Hyperlinks: i has a link to j
• Revisions
  • Coauthorship: i shares an editor with j
• Pageview activity
  • Correlation: i's pageviews are correlated with j's

[Figure: 1-step hyperlink network. Labeled nodes: Boston Marathon bombings; Boston Marathon; Watertown, Mass.; Boston, Mass.; Boylston St.]

[Figure: 1.5-step hyperlink network. Labeled nodes: Boston Marathon bombings; Boston Marathon; Watertown, Mass.; Boston, Mass.; Boylston St.]
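
The slides do not define the two views. One common reading is that a 1-step network keeps only the links that touch the focal article, while a 1.5-step network also keeps the links among its immediate neighbors. A minimal Python sketch under that assumption follows; the function name `ego_network` and the example links are hypothetical, not taken from the talk's data.

```python
def ego_network(edges, ego, include_alter_ties=False):
    """Return the directed links around `ego`.

    edges: iterable of (source, target) hyperlink pairs.
    include_alter_ties=False gives the 1-step view (links touching ego);
    include_alter_ties=True gives the 1.5-step view (adds neighbor-to-neighbor links).
    """
    edges = set(edges)
    # Links where the focal article is either the source or the target.
    one_step = {(i, j) for i, j in edges if ego in (i, j)}
    if not include_alter_ties:
        return one_step
    # Alters: every article directly linked to or from the focal article.
    alters = {n for pair in one_step for n in pair} - {ego}
    alter_ties = {(i, j) for i, j in edges if i in alters and j in alters}
    return one_step | alter_ties


# Hypothetical links, for illustration only.
links = [
    ("Boston Marathon bombings", "Boston Marathon"),
    ("Boston Marathon bombings", "Boylston Street"),
    ("Boston Marathon", "Boylston Street"),   # tie between two alters
    ("Boylston Street", "Copley Square"),     # two steps away from the ego
]
print(len(ego_network(links, "Boston Marathon bombings")))        # 2
print(len(ego_network(links, "Boston Marathon bombings", True)))  # 3
```

The link to Copley Square is dropped in both views because Copley Square is not directly linked to the focal article in this toy data.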

Communities

[Figure: communities in the hyperlink network. Labeled nodes: Perpetrators; MIT PD; Watertown, Mass.; Shelter in place; Holy Cross; Pressure cooker]

Burst detection

Rolling 30 day average

2x SE
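
The slide names only the two ingredients, a rolling 30-day average and a 2x SE band. A minimal sketch of one way to combine them, assuming SE means the standard error of the mean over a trailing 30-day window and that a burst is any day above the average plus two standard errors. The function name `detect_bursts`, the trailing (rather than centered) window, and the pageview counts are all assumptions for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def detect_bursts(views, window=30, k=2.0):
    """Return indices of days whose count exceeds trailing mean + k * SE.

    views: list of daily pageview counts, oldest first.
    The first `window` days have no full history and are never flagged.
    """
    bursts = []
    for t in range(window, len(views)):
        history = views[t - window:t]          # the previous `window` days
        baseline = mean(history)
        se = stdev(history) / sqrt(window)     # standard error of the mean
        if views[t] > baseline + k * se:
            bursts.append(t)
    return bursts


# Hypothetical series: a flat baseline, a two-day spike, then back to baseline.
daily = [100] * 40 + [900, 450] + [100] * 5
print(detect_bursts(daily))  # [40, 41]
```

Because the standard error shrinks with the window length, this threshold is tight: on noisy real data it would flag many ordinary days, so a larger `k` or a standard-deviation band may be a more practical choice.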

Largest bursts

1. Ground stop (329)

2. Boylston Street (268)

3. Google Person Finder (237)

4. Patriots’ Day (201)

5. Copley Square (171)

6. Controlled explosion (168)

7. Lenox Hotel (116)

8. Pressure cooker (83)

9. MA EMA (83)

10. BP SOU (78)

[Figure: hyperlink network by day, with newly labeled nodes]

April 15
April 16: Pressure cooker
April 17: Holy Cross
April 18: MIT PD; Watertown, Mass.; Shelter in place
April 19
April 20
April 21

COAUTHORSHIP NETWORK

Coauthorship activity 4/15 – 5/1
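
The coauthorship tie defined earlier (i shares an editor with j) can be sketched as a projection of article-editor revision records onto article pairs. This is an illustrative Python sketch only: the function name `coauthorship_edges`, the choice to weight ties by the number of shared editors, and the revision records are assumptions, not the authors' pipeline.

```python
from collections import defaultdict
from itertools import combinations


def coauthorship_edges(revisions):
    """Map each article pair to the number of editors they share.

    revisions: iterable of (article, editor) pairs, one per revision.
    Pairs with no shared editor are omitted.
    """
    editors_by_article = defaultdict(set)
    for article, editor in revisions:
        editors_by_article[article].add(editor)

    weights = {}
    for a, b in combinations(sorted(editors_by_article), 2):
        shared = editors_by_article[a] & editors_by_article[b]
        if shared:
            weights[(a, b)] = len(shared)
    return weights


# Hypothetical revision log, for illustration only.
log = [
    ("Boston Marathon bombings", "editor_a"),
    ("Boston Marathon bombings", "editor_a"),   # repeat edits count once
    ("Boylston Street", "editor_a"),
    ("Boylston Street", "editor_b"),
    ("Copley Square", "editor_b"),
]
for pair, n_shared in coauthorship_edges(log).items():
    print(pair, n_shared)
```

Restricting `revisions` to a date range such as 4/15 to 5/1 before calling the function would give the network for that window.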

CORRELATION NETWORKS

Temporal correlation networks

[Figure: temporal correlation network. Labeled nodes: Duffel bag; New York Times; MGH]

Activity correlation network
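
The correlation tie defined earlier (i's pageviews correlated with j) can be sketched by computing a correlation coefficient for every pair of pageview series and keeping the pairs above a cutoff. The slides do not say which coefficient or cutoff was used, so the Pearson coefficient, the 0.8 threshold, the function names, and the counts below are all assumptions for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def correlation_edges(series, threshold=0.8):
    """Map each article pair to its correlation when it meets the threshold.

    series: dict of article title -> list of pageview counts on a shared time grid.
    """
    edges = {}
    for a, b in combinations(sorted(series), 2):
        r = pearson(series[a], series[b])
        if r >= threshold:
            edges[(a, b)] = r
    return edges


# Hypothetical hourly pageviews, for illustration only.
views = {
    "Duffel bag": [1, 2, 3, 4],
    "Pressure cooker": [2, 4, 6, 8],   # rises in step with "Duffel bag"
    "Patriots' Day": [4, 3, 2, 1],     # moves in the opposite direction
}
print(list(correlation_edges(views)))
```

Computing the edges separately for successive time windows would give a sequence of networks rather than a single one.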

DISCUSSION

Theoretical framework

• Information seeking and sense-making
  • Fine-grained traces of large-scale behavior in a complex information space
• Mass convergence and crisis informatics
  • Nearly real-time behavior captures bursts of activity related to current events
• Networks of knowledge and collaboration
  • Information seeking in a knowledge network drives creation of new knowledge and relationships

Future directions

• Track diffusion of bursts across a larger hyperlink network
  • Are distant bursty events responsible for a substantial fraction of editing activity?
• Synchronized and anomalous bursts of activity as narrative elements
  • Czech Republic vs. Chechnya
  • Classifying events and mobilizing resources

• Do textual features predict bursts?
  • Edit distance, number of mentions, position on page, etc. convey relatedness of content

• Multilevel & longitudinal statistical model of tie formation
  • Dyadic covariates: pageview correlation → coauthorship ties → hyperlinks

THANK YOU!

Brian Keegan

b.keegan@neu.edu

www.brianckeegan.com

@bkeegan
