measuring polarization on social media
michael mathioudakis
computer science, aalto university
http://www.michalis.co
social media
consume content: news about friends, politics, favorite artists
generate content: share experiences, interesting articles
interact with others: comment, rate, and discuss
hundreds of millions of active users
why study social media
a window into the thoughts and actions of people
how people spend their time
what attracts their attention
what they think about political issues
how they interact socially
unique opportunity for social scientists
study human behavior at large scale & fine detail
but you are a computer scientist
develop algorithms to extract insights from social media
two challenges
1. volume of data
2. complexity of social behavior
calls for complex models (e.g., graphs) that require fast algorithms
solution
algorithmic methods, software tools and systems for large-scale computation
analyze large amounts of data efficiently & automatically
social media: good and bad sides
good: no information barriers, citizen journalism, social connectivity, …
bad: harassment, fake news, polarization, …
polarization
political or social polarization
‘the act of separating or making people separate into two groups with completely opposite opinions’*
related term: controversy
‘public discussion and argument about something that many people strongly disagree about’*
*oxford english dictionary
why study polarization?
polarization can be linked to adverse effects
social segmentation
stereotypes
echo chambers
goal: understand and mitigate them
[figure: Donald Trump approval rating (%), February 2017, Democrats vs. Republicans; axis from 0 to 100]
ideology of USA public
[figure: ideology of Democrats and Republicans in 1994 and 2014; source: Pew Research Center]
Democrats and Republicans have drifted apart
why study polarization on social media
extract insights on the polarization process
some specific questions
what are the polarizing issues?
does polarization increase over time? (we saw one such instance: US public)
how could we ‘nudge’ people away from extreme polarization?
is polarization linked to echo chambers?
do social media increase polarization?
[annotation on slide: this talk]
in this talk…
polarization
twitter
algorithms to measure it
long-term study
next steps
[figure: a tweet with its retweets and replies (interactions), and connections]
twitter
microblogging platform since 2006; 300 million active users
users post short messages: ‘tweets’
global structure of interactions
interaction graph
nodes: user accounts
edges: retweets, replies, connections
data: 1% sample of all tweets from the internet archive
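As a minimal sketch of the interaction graph described above, the following builds an undirected, weighted retweet graph from (retweeter, original author) pairs. The input format, the function name `build_retweet_graph` and the choice to drop self-retweets are assumptions made for illustration; the slides do not specify them.

```python
from collections import defaultdict


def build_retweet_graph(retweets):
    """Build an undirected weighted graph from (retweeter, author) pairs.

    Nodes are user accounts; an edge weight counts the retweets
    exchanged between two accounts in either direction.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for retweeter, author in retweets:
        if retweeter == author:
            continue  # assumption: self-retweets carry no interaction signal
        graph[retweeter][author] += 1
        graph[author][retweeter] += 1
    # Convert to plain dicts so missing users raise KeyError instead of
    # silently creating empty entries.
    return {user: dict(neighbors) for user, neighbors in graph.items()}


# Hypothetical toy data, not taken from the talk.
retweets = [("ann", "bob"), ("bob", "ann"), ("cat", "ann"), ("dan", "dan")]
graph = build_retweet_graph(retweets)
```

The same structure would work for replies or follow connections by changing what each pair represents.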
in what follows…
polarization
twitter
algorithms to measure it
long-term study
next steps
Quantifying Controversy in Social Media. WSDM 2016. With Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis.
how polarized is a discussion?
e.g., obamacare in January 2017
how could we approach this question?
content? do opposing sides say different things?
sentiment? do polarized topics exhibit a wider range of emotions?
interactions? do people interact more with their own side?
let’s try this
method template
build an interaction graph
several types to try: retweets, replies, connections
is the interaction graph polarized?
output: polarization score
[figure: a graph that is not polarized next to a polarized graph with two sides, well separated]
pipeline
topic -> Graph Building -> Graph Partitioning -> Controversy Measure -> Polarization
Graph Building: what type of interaction graph should we use?
Graph Partitioning: how to find two sides in the graph?
Controversy Measure: how to measure the separation between two sides?
evaluation: do we identify polarized discussions?
Graph Building: retweets, replies, connections
Graph Partitioning: partition into two sides
partitioning
many existing algorithms: spectral clustering, label propagation, metis
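To illustrate one of the options named above, here is a small sketch of spectral bisection: the graph is split by the sign of an approximate Fiedler vector (the eigenvector of the graph Laplacian with the second-smallest eigenvalue), found by power iteration. This is an illustration only, not necessarily the partitioner used in the pipeline; the function name, the adjacency-set input format and the toy graph are assumptions.

```python
import random


def spectral_bisection(graph, iterations=500, seed=0):
    """Split an undirected graph {node: set_of_neighbors} into two sides.

    Runs power iteration on (shift * I - L), where L = D - A is the graph
    Laplacian, while removing the constant component. The iterate then
    converges to the Fiedler vector, whose signs define the two sides.
    Assumes a connected graph with at least two nodes.
    """
    nodes = sorted(graph)
    rng = random.Random(seed)
    # 2 * max degree is an upper bound on the Laplacian eigenvalues.
    shift = 2 * max(len(graph[u]) for u in nodes)
    x = {u: rng.uniform(-1.0, 1.0) for u in nodes}
    for _ in range(iterations):
        mean = sum(x.values()) / len(nodes)
        x = {u: x[u] - mean for u in nodes}  # drop the constant eigenvector
        y = {
            u: (shift - len(graph[u])) * x[u] + sum(x[v] for v in graph[u])
            for u in nodes
        }
        norm = sum(value * value for value in y.values()) ** 0.5
        x = {u: y[u] / norm for u in nodes}
    side_a = {u for u in nodes if x[u] >= 0}
    return side_a, set(nodes) - side_a


# Hypothetical toy graph: two triangles joined by a single edge (a3, b1).
triangles = {
    "a1": {"a2", "a3"}, "a2": {"a1", "a3"}, "a3": {"a1", "a2", "b1"},
    "b1": {"a3", "b2", "b3"}, "b2": {"b1", "b3"}, "b3": {"b1", "b2"},
}
side_a, side_b = spectral_bisection(triangles)
```

On real interaction graphs with thousands of nodes, a dedicated library would be a more practical choice than this pure-Python loop.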
random walk controversy (RWC)
[figure labels: well connected, badly connected]
random walk controversy (RWC)
A and B: two sides
c and d: central nodes
polarization score
how difficult to reach c from B and how difficult to reach d from A
the more difficult, the higher the score
values: -1 to 1
[figure: sides A and B with central nodes c and d]
random walk
discrete process on a graph: step 1, 2, 3, …
at each step, we are at one node
at next step, we move to a nearby node, following an edge at random
specify where the walk starts (e.g., at c) and where the walk ends (e.g., at d)
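The process above can be sketched in a few lines. The function name `random_walk`, the neighbor-list graph format and the `max_steps` safety cap are assumptions added for illustration.

```python
import random


def random_walk(graph, start, targets, rng, max_steps=10_000):
    """Walk on {node: [neighbors]} from `start` until a node in `targets`.

    At each step the walk moves to a neighbor chosen uniformly at random.
    Returns the list of visited nodes, including the start and end nodes.
    """
    node = start
    path = [node]
    while node not in targets and len(path) <= max_steps:
        node = rng.choice(graph[node])  # follow an edge at random
        path.append(node)
    return path


# Hypothetical toy graph: a path c - x - d.
path_graph = {"c": ["x"], "x": ["c", "d"], "d": ["x"]}
walk = random_walk(path_graph, start="c", targets={"d"}, rng=random.Random(0))
```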
random walk controversy (RWC)
A and B: two sides
c and d: central nodes
random walk
start: random node of A or B (50% chance for each side)
end: at central node c of side A or d of side B
P_XY = P(started in X | ended in Y)
consider two independent instances, one ending at the central node of side A, the other at the central node of side B
RWC = P_AA P_BB - P_BA P_AB
P_AA P_BB: both random walks started in the side where they ended
P_BA P_AB: both random walks started in a side other than where they ended
values? between -1 and 1: since P_AA + P_BA = 1 and P_BB + P_AB = 1, RWC = P_AA + P_BB - 1
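A rough Monte Carlo sketch of this definition follows. It simulates many walks, records on which side each one started and ended, and plugs the resulting conditional frequencies into the RWC formula. The function name, the single central node per side and the toy graphs are assumptions; the published method may choose central nodes and compute the probabilities differently.

```python
import random


def estimate_rwc(graph, side_a, side_b, c, d, walks=20_000, seed=0):
    """Estimate RWC = P_AA * P_BB - P_BA * P_AB by simulation.

    graph: {node: iterable_of_neighbors}, assumed connected.
    c, d: central nodes of side A and side B, where walks end.
    P_XY is estimated as the share of walks ending in Y that started in X.
    """
    rng = random.Random(seed)
    members = {"A": sorted(side_a), "B": sorted(side_b)}
    neighbors = {u: sorted(vs) for u, vs in graph.items()}
    counts = {(s, e): 0 for s in "AB" for e in "AB"}
    for _ in range(walks):
        start_side = rng.choice("AB")  # 50% chance for each side
        node = rng.choice(members[start_side])
        while node != c and node != d:
            node = rng.choice(neighbors[node])
        end_side = "A" if node == c else "B"
        counts[(start_side, end_side)] += 1

    def p(start, end):
        total = counts[("A", end)] + counts[("B", end)]
        return counts[(start, end)] / total if total else 0.0

    return p("A", "A") * p("B", "B") - p("B", "A") * p("A", "B")


# Hypothetical toy graphs. Two triangles joined by one edge: well separated.
separated = {
    "a1": {"a2", "a3"}, "a2": {"a1", "a3"}, "a3": {"a1", "a2", "b1"},
    "b1": {"a3", "b2", "b3"}, "b2": {"b1", "b3"}, "b3": {"b1", "b2"},
}
rwc_separated = estimate_rwc(
    separated, {"a1", "a2", "a3"}, {"b1", "b2", "b3"}, c="a3", d="b1"
)

# A complete graph on six nodes with an arbitrary split: no real sides.
mixed = {u: {v for v in range(6) if v != u} for u in range(6)}
rwc_mixed = estimate_rwc(mixed, {0, 1, 2}, {3, 4, 5}, c=0, d=3)
```

In the first graph every walk ends on the side where it started, so the score is at its maximum; in the second, walks cross sides freely and the score is lower.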
evaluation
popular topics of 2015
graphs with 1 to 150 thousand nodes
examples:
polarized topics: indian beef ban, nemtsov protests, netanyahu US congress speech, baltimore riots, ukraine
non-polarized topics: germanwings plane crash, sxsw, mother’s day, jurassic world movie, national kissing day
results
does the pipeline distinguish polarized from non-polarized topics?
topic -> retweets / replies / connections -> two sides -> random walk controversy (RWC)
(if we use retweets) yes!
results
interaction graphs: retweets
polarized topics (high RWC): nemtsov protests, indian beef ban
non-polarized topics (low RWC): sxsw conference, germanwings plane crash
results
[figure: retweet and reply interaction graphs for nemtsov protests]
an algorithmic way to quantify polarization
topic -> retweets -> two sides -> random walk controversy (RWC)
based on structure of interactions
language-independent
new method
can be deployed in the wild
in what follows…
polarization
twitter
algorithms to measure it
long-term study
next steps
The Ebb and Flow of Controversial Debates on Social Media. ICWSM 2017.
The Effect of Collective Attention on Controversial Debates on Social Media. WebScience 2017.
With Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis.
polarization over time
has polarization around controversial topics increased over time?
data: 1% sample of all tweets from the internet archive, September 2011 to September 2016
method, for a given topic (e.g., obamacare):
retrieve related tweets
build an interaction graph for each day, using retweets
measure RWC score
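The per-day step of this method could be sketched as follows: retweets are grouped by calendar day so that one interaction graph can be built, and one score measured, for each day. The record format (ISO timestamp, retweeter, author), the function name and the sample records are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def daily_retweet_edges(retweets):
    """Group (iso_timestamp, retweeter, author) records by calendar day.

    Returns {date: [(retweeter, author), ...]} ordered by date, ready to
    be turned into one interaction graph per day.
    """
    by_day = defaultdict(list)
    for timestamp, retweeter, author in retweets:
        day = datetime.fromisoformat(timestamp).date()
        by_day[day].append((retweeter, author))
    return dict(sorted(by_day.items()))


# Hypothetical records, not taken from the dataset used in the talk.
records = [
    ("2016-09-02T00:00:01", "dan", "ann"),
    ("2016-09-01T10:00:00", "ann", "bob"),
    ("2016-09-01T23:59:59", "cat", "bob"),
]
daily = daily_retweet_edges(records)
# Number of retweets per day, a simple proxy for volume of activity.
volume = {day.isoformat(): len(edges) for day, edges in daily.items()}
```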
RWC vs Time
[figure: RWC over time, September 2011 to September 2016]
volume of activity spikes at major events
does polarization spike with volume?
measure RWC vs volume of activity
RWC vs Volume
[figure: RWC plotted against volume of activity; higher volume, higher controversy]
summary
polarization
twitter
algorithms to measure it
long-term study
next steps
why study polarization on social media
extract insights on the polarization process
some specific questions
what are the polarizing issues?
does polarization increase over time? (we saw one such instance: US public)
how could we ‘nudge’ people away from extreme polarization?
is polarization linked to echo chambers?
do social media increase polarization?
[annotation on slide: this talk]
next steps
thank you!
Beanplot of RWC on retweet networks
[figure: controversy score (0.0 to 1.0), controversial vs non-controversial topics]
metis
partitioning scheme (george karypis et al.)
phase 1: coarsening (random or connectivity-based)
phase 2: 2-way partitioning (spectral clustering)
phase 3: uncoarsening & refining
other polarization measures
betweenness of cross-cutting edges (BCC)
fraction of shortest paths that include an edge
embedding distance
2d embedding produced by ForceAtlas2
average in-cluster distances
average cross-cluster distance
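As a sketch of the embedding-based measure, the code below computes the average in-cluster and cross-cluster distances from 2d coordinates and combines them into one score. The slide names the averages but not how they are combined; the combination 1 - (d_A + d_B) / (2 d_AB) is an assumption here, as are the function names and the hand-made coordinates (in practice the positions would come from a layout such as ForceAtlas2).

```python
from itertools import combinations, product
from math import dist


def mean_in_cluster_distance(points):
    """Average Euclidean distance over all pairs within one cluster."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mean_cross_cluster_distance(points_a, points_b):
    """Average Euclidean distance over all pairs across two clusters."""
    pairs = list(product(points_a, points_b))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def embedding_score(points_a, points_b):
    """Assumed combination: 1 - (d_A + d_B) / (2 * d_AB).

    Close to 1 when each side is tight and the sides are far apart.
    Each side needs at least two points.
    """
    d_a = mean_in_cluster_distance(points_a)
    d_b = mean_in_cluster_distance(points_b)
    d_ab = mean_cross_cluster_distance(points_a, points_b)
    return 1 - (d_a + d_b) / (2 * d_ab)


# Hypothetical 2d positions for two well-separated sides.
side_a_points = [(0.0, 0.0), (0.0, 1.0)]
side_b_points = [(10.0, 0.0), (10.0, 1.0)]
score = embedding_score(side_a_points, side_b_points)
```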
Polarization scores on retweet networks
[figure: controversy score (0.0 to 1.0) for the measures BCC, EC, GMCK, MBLB and RWC; controversial vs non-controversial topics]
… beyond polarization
develop methods, tools and systems to help social scientists study social activity on the web
computational social science
social influence
understand how people experience their cities
track political-social-ethnic trends