measuring polarization on social media - ETH Z

measuring polarization on social media
michael mathioudakis

computer science, aalto university

http://www.michalis.co

social  media

Michael  Mathioudakis  

consume content: news about friends, politics, favorite artists

generate content: share experiences, interesting articles

interact with others: comment, rate, and discuss

hundreds of millions of active users

why study social media

a window into the thoughts and actions of people

how people spend their time
what attracts their attention
what they think about political issues
how they interact socially

unique opportunity for social scientists

study human behavior at large scale & fine detail

but you are a computer scientist

develop algorithms to extract insights from social media

two challenges
1- volume of data
2- complexity of social behavior

calls for complex models (e.g., graphs) that require fast algorithms

solution
algorithmic methods, software tools and systems for large-scale computation

analyze large amounts of data efficiently & automatically

social media: good and bad sides

good sides
no information barriers
citizen journalism
social connectivity
…

bad sides
harassment
fake news
polarization

polarization

political or social polarization
‘the act of separating or making people separate into two groups with completely opposite opinions’*

related term: controversy
‘public discussion and argument about something that many people strongly disagree about’*

*oxford english dictionary

why study polarization?

polarization can be linked to adverse effects
social segmentation
stereotypes
echo chambers

goal: understand and mitigate them

[Figure: Donald Trump approval rating (%), February 2017, among Democrats and Republicans; axis from 0 to 100]

ideology of USA public
source: Pew Research Center

[Figure: ideology of the USA public, 1994 vs. 2014]

Democrats and Republicans have drifted apart

why study polarization on social media

extract insights on the polarization process

some specific questions

what are the polarizing issues?
does polarization increase over time?
  we saw one such instance: US public
how could we ‘nudge’ people away from extreme polarization?
is polarization linked to echo chambers?
do social media increase polarization?

this talk

in this talk…

polarization
twitter
algorithms to measure it
long-term study
next steps

[Figure: a tweet; labels: retweets, replies, connections, interactions]

twitter

microblogging platform
since 2006; 300 million active users
users post short messages - ‘tweets’

global structure of interactions

interaction graph
nodes: user accounts
edges: retweets, replies, connections
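A minimal sketch of how such an interaction graph could be assembled, assuming each interaction arrives as a pair of user names. The function name `build_interaction_graph` and the toy records are hypothetical and are not taken from the talk.

```python
from collections import defaultdict

def build_interaction_graph(interactions):
    """Build an undirected, weighted interaction graph.

    `interactions` is an iterable of (user_a, user_b) pairs, one pair per
    retweet, reply or connection. Returns {user: {neighbor: weight}}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for user_a, user_b in interactions:
        if user_a == user_b:
            continue  # ignore self-interactions
        graph[user_a][user_b] += 1
        graph[user_b][user_a] += 1
    return {user: dict(neighbors) for user, neighbors in graph.items()}

# Hypothetical toy data: (retweeting user, original author).
retweets = [("ann", "bob"), ("bob", "ann"), ("cat", "ann"), ("dan", "eve")]
graph = build_interaction_graph(retweets)
```

Nodes are user accounts and each edge weight counts how often two accounts interacted. Treating edges as undirected is a simplifying assumption of this sketch.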

data

1% sample of all tweets
from the internet archive

in what follows…

polarization
twitter
algorithms to measure it
long-term study
next steps

Quantifying Controversy in Social Media. WSDM 2016.
With Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis.

how polarized is a discussion?

e.g., obamacare in January 2017

how could we approach this question?

content?
do opposing sides say different things?

sentiment?
do polarized topics exhibit wider range of emotions?

interactions?
do people interact more with their own side?

let’s try this

method template

build an interaction graph
several types to try: retweets, replies, connections

is the interaction graph polarized?
output: polarization score

[Figure: two example graphs; not polarized vs. polarized (two sides, well separated)]

pipeline

topic -> Graph Building -> Graph Partitioning -> Controversy Measure -> Polarization, then evaluation

Graph Building: what type of interaction graph should we use? (retweets, replies, connections)
Graph Partitioning: how to find two sides in the graph?
Controversy Measure: how to measure the separation between two sides?
evaluation: do we identify polarized discussions?

partition into two sides

partitioning

many existing algorithms
spectral clustering
label propagation
metis
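To make the partitioning step concrete, here is a small sketch of one of the listed families, spectral clustering, reduced to a two-way split. It approximates the Fiedler vector of the graph Laplacian by power iteration in plain Python. The function name `spectral_bisection`, the iteration count and the toy graph are assumptions for illustration; this is not the partitioner used in the talk.

```python
import math

def spectral_bisection(adj, iterations=500):
    """Split a connected undirected graph into two sides by the sign of an
    approximate Fiedler vector, found by power iteration on (c*I - L).

    `adj` maps each node to the set of its neighbors.
    """
    nodes = sorted(adj)
    n = len(nodes)
    # c = 2 * max degree is an upper bound on the eigenvalues of L.
    c = 2 * max(len(adj[u]) for u in nodes)
    # Deterministic, non-constant starting vector.
    x = {u: float(i) for i, u in enumerate(nodes)}
    for _ in range(iterations):
        # Remove the component along the all-ones vector.
        mean = sum(x.values()) / n
        x = {u: v - mean for u, v in x.items()}
        # y = (c*I - L) x, where (L x)[u] = deg(u)*x[u] - sum of neighbor values.
        y = {u: c * x[u] - (len(adj[u]) * x[u] - sum(x[v] for v in adj[u]))
             for u in nodes}
        norm = math.sqrt(sum(v * v for v in y.values()))
        x = {u: v / norm for u, v in y.items()}
    side_a = {u for u in nodes if x[u] < 0}
    side_b = set(nodes) - side_a
    return side_a, side_b

# Toy graph: two triangles joined by the single edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
side_a, side_b = spectral_bisection(adj)
```

On the toy graph the split falls on the bridge between the two triangles. For graphs with many thousands of nodes, a dedicated library would be a more realistic choice than this pure-Python loop.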

random walk controversy (RWC)

[Figure: example graphs labeled ‘well connected’ and ‘badly connected’]

random walk controversy (RWC)

A and B: two sides
c and d: central nodes

polarization score
how difficult to reach c from B
and
how difficult to reach d from A

the more difficult, the higher the score

values -1 to 1

[Figure: graph with sides A and B and central nodes c and d]

random walk

discrete process on a graph
step 1, 2, 3, …

at each step, we are at one node

at next step, we move to a nearby node
- following an edge at random

specify
where the walk starts (e.g., at c)
where the walk ends (e.g., at d)
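The process described above can be sketched in a few lines. The function name `random_walk`, the `max_steps` safety cap and the three-node toy graph are assumptions made for illustration only.

```python
import random

def random_walk(adj, start, end, rng, max_steps=10_000):
    """Walk from `start`, following one random edge per step, until `end`
    is reached (or `max_steps` is exceeded). Returns the visited nodes."""
    path = [start]
    node = start
    while node != end and len(path) <= max_steps:
        # Sorting makes the choice reproducible for a given seed.
        node = rng.choice(sorted(adj[node]))
        path.append(node)
    return path

# Toy path graph c - x - d; the walk starts at c and ends at d.
adj = {"c": {"x"}, "x": {"c", "d"}, "d": {"x"}}
walk = random_walk(adj, "c", "d", random.Random(0))
```

Every consecutive pair in `walk` is an edge of the graph, and the walk stops the first time it reaches the end node.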

random walk controversy (RWC)

A and B: two sides
c and d: central nodes

random walk
start: random node of A / B
  50% chance that we do either
end: at central nodes
  c of side A or d of side B

$P_{XY} = P(\text{started in } X \mid \text{ended in } Y)$

consider two independent instances
one ending at the central node of side A, the other at the central node of side B

$RWC = P_{AA} P_{BB} - P_{BA} P_{AB}$

$P_{AA} P_{BB}$: both random walks started in the side where they ended
$P_{BA} P_{AB}$: both random walks started in a side other than where they ended

values?
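One way to get a feel for the score is to estimate it by simulation. The sketch below follows the definition on the slide: pick a side with 50% chance, start at a random node of that side, walk until a central node is hit, then estimate P_XY from the counts. The function name `estimate_rwc`, the number of walks, the seed and the choice of central nodes in the toy graph are all assumptions; the published method may compute these probabilities differently.

```python
import random

def estimate_rwc(adj, side_a, side_b, c, d, walks=2000, seed=0):
    """Monte Carlo estimate of RWC = P_AA * P_BB - P_BA * P_AB, where
    P_XY = P(walk started in side X | walk ended at the central node of Y)."""
    rng = random.Random(seed)
    sides = {"A": sorted(side_a), "B": sorted(side_b)}
    end_side = {c: "A", d: "B"}  # c is central in A, d is central in B
    counts = {(x, y): 0 for x in "AB" for y in "AB"}
    for _ in range(walks):
        start = rng.choice("AB")         # 50% chance for either side
        node = rng.choice(sides[start])  # random node of that side
        while node not in end_side:      # walk until c or d is reached
            node = rng.choice(sorted(adj[node]))
        counts[(start, end_side[node])] += 1

    def p(x, y):
        ended_in_y = counts[("A", y)] + counts[("B", y)]
        return counts[(x, y)] / ended_in_y if ended_in_y else 0.0

    return p("A", "A") * p("B", "B") - p("B", "A") * p("A", "B")

# Toy graph: two triangles joined by the single edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
score = estimate_rwc(adj, {0, 1, 2}, {3, 4, 5}, c=0, d=5)
```

Because each P_XY is a probability, the estimate always lies between -1 and 1. On the toy graph, most walks end on the side where they started, so the estimate is clearly positive.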

evaluation

popular topics of 2015
graphs with 1 – 150 thousand nodes

examples:

polarized topics
indian beef ban, nemtsov protests, netanyahu US congress speech, baltimore riots, ukraine

non-polarized topics
germanwings plane crash, sxsw, mother’s day, jurassic world movie, national kissing day

results

does the pipeline distinguish polarized from non-polarized topics?

topic -> retweets, replies, connections -> two sides -> random walk controversy (RWC)

yes! (if we use retweets)

results

interaction graphs: retweets

[Figure: retweet graphs for four topics]
polarized topics (high RWC): nemtsov protests, indian beef ban
non-polarized topics (low RWC): sxsw conference, germanwings plane crash

results

interaction graphs for nemtsov protests

[Figure: retweet graph and reply graph]

an algorithmic way to quantify polarization

topic -> retweets -> two sides -> random walk controversy (RWC)

based on structure of interactions
language-independent

new method
can be deployed in the wild

in what follows…

polarization
twitter
algorithms to measure it
long-term study
next steps

The Ebb and Flow of Controversial Debates on Social Media. ICWSM 2017.
The Effect of Collective Attention on Controversial Debates on Social Media. WebScience 2017.
With Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis.

polarization over time

has polarization around controversial topics increased over time?

data
1% sample of all tweets
from the internet archive
September 2011 to September 2016

method
for a given topic (e.g., obamacare)
retrieve related tweets
build an interaction graph for each day
using retweets
measure RWC score
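The per-day graph construction in the method above could look roughly like this. The record layout (timestamp, retweeter, author), the function name `daily_retweet_graphs` and the toy records are hypothetical; retrieving the tweets for a topic is left out.

```python
from collections import defaultdict
from datetime import datetime

def daily_retweet_graphs(retweets):
    """Group retweets by calendar day and build one undirected graph per day.

    `retweets` is an iterable of (timestamp, retweeter, author) tuples with
    datetime timestamps. Returns {date: {user: set_of_neighbors}}.
    """
    graphs = defaultdict(lambda: defaultdict(set))
    for timestamp, retweeter, author in retweets:
        day = timestamp.date()
        graphs[day][retweeter].add(author)
        graphs[day][author].add(retweeter)
    return {day: dict(graph) for day, graph in graphs.items()}

# Hypothetical toy records for one topic.
records = [
    (datetime(2011, 9, 1, 10, 0), "ann", "bob"),
    (datetime(2011, 9, 1, 18, 30), "cat", "bob"),
    (datetime(2011, 9, 2, 9, 15), "ann", "dan"),
]
graphs = daily_retweet_graphs(records)
```

Each daily graph would then be partitioned into two sides and scored, giving one RWC value per day for the topic.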

RWC vs Time

[Figure: RWC over time, September 2011 to September 2016]

volume of activity spikes at major events

does polarization spike with volume?
measure RWC vs volume of activity

RWC vs Volume

[Figure: RWC vs volume of activity; annotation: higher volume, higher controversy]

summary

polarization
twitter
algorithms to measure it
long-term study
next steps

why study polarization on social media

extract insights on the polarization process

some specific questions

what are the polarizing issues?
does polarization increase over time?
  we saw one such instance: US public
how could we ‘nudge’ people away from extreme polarization?
is polarization linked to echo chambers?
do social media increase polarization?

this talk
next steps

thank you!

Beanplot of RWC on retweet networks

[Figure: beanplot; y-axis: controversy score (0.0 to 1.0); x-axis labels: BCC, EC, GMCK, MBLB, RWC; groups: controversial, non-controversial]

metis

partitioning scheme
george karypis et al.

phase 1
coarsening (random or connectivity-based)

phase 2
2-way partitioning phase (spectral clustering)

phase 3
uncoarsening & refining phase

other polarization measures

betweenness of cross-cutting edges (BCC)
fraction of shortest paths that include an edge

embedding distance
2d embedding produced by ForceAtlas2
average in-cluster distances
average cross-cluster distance
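The two ingredients of the embedding distance can be computed directly once every node has a 2d position. In the sketch below the positions are hypothetical toy coordinates rather than ForceAtlas2 output. The way the averages are combined into one score, 1 - (d_A + d_B) / (2 d_AB), is an assumption of this sketch, since the slide only names the averages.

```python
import math
from itertools import combinations, product

def mean_distance(pairs, pos):
    """Average Euclidean distance over an iterable of node pairs."""
    dists = [math.dist(pos[u], pos[v]) for u, v in pairs]
    return sum(dists) / len(dists)

def embedding_score(pos, side_a, side_b):
    """Compare average in-cluster distances with the cross-cluster distance.

    Assumed combination (not stated on the slide):
    1 - (d_A + d_B) / (2 * d_AB). Each side needs at least two nodes.
    """
    d_a = mean_distance(combinations(sorted(side_a), 2), pos)
    d_b = mean_distance(combinations(sorted(side_b), 2), pos)
    d_ab = mean_distance(product(sorted(side_a), sorted(side_b)), pos)
    return 1 - (d_a + d_b) / (2 * d_ab)

# Hypothetical 2d positions: two tight clusters far from each other.
pos = {"a1": (0.0, 0.0), "a2": (0.0, 1.0), "b1": (10.0, 0.0), "b2": (10.0, 1.0)}
score = embedding_score(pos, {"a1", "a2"}, {"b1", "b2"})
```

With this combination the score approaches 1 when the two sides sit far apart relative to their internal spread, and it drops toward 0 when the sides overlap.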

Polarization scores on retweet networks

[Figure: y-axis: controversy score (0.0 to 1.0); x-axis labels: BCC, EC, GMCK, MBLB, RWC; groups: controversial, non-controversial]

… beyond polarization

develop methods, tools and systems
to help social scientists
study social activity on the web

computational social science
social influence
understand how people experience their cities
track political-social-ethnic trends
