Data Science: Case "Political Communication 2/2"

Analyzing structure vs analyzing content Short sidestep: Agenda setting and Framing Structure Content Conclusion

Fundamentals of Data Science: Case “Political Communication”

Damian Trilling

[email protected]

@damian0604

www.damiantrilling.net

Afdeling Communicatiewetenschap, Universiteit van Amsterdam

19-09-2016

Fundamentals of Data Science: Case “Political Communication” Damian Trilling

Last week

1 some themes in political communication research
• polarization
• fragmentation
• and the way politicians use social media

2 Twitter API, preprocessing, geodata, sentiment analysis

This week

Digging deeper into the content of the tweets

Today

1 Analyzing structure vs analyzing content

2 Short sidestep: Agenda setting and Framing

3 Studies that analyze structure of the Twittersphere

4 Studies that analyze content of tweets
• Issues
• Responses to TV debates
• Incivility

5 Conclusion

Structure

Analyzing structure vs analyzing content

Structure

Analyzing Twitter data

Analyzing the structure

• Number of Tweets over time
• singleton/retweet ratio
• Distribution of number of Tweets per user
• Interaction networks

⇒ Focus on the amount of content and on the question who interacts with whom, not on what is said

Bruns, A., & Stieglitz, S. (2013). Toward more systematic Twitter analysis: Metrics for tweeting activities. International Journal of Social Research Methodology. doi:10.1080/13645579.2012.756095
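The structural metrics listed above can be illustrated with a minimal Python sketch. The toy tweets, the tuple layout (user, timestamp, text), and the rule that a "singleton" is a tweet that is neither a retweet nor contains an @mention are assumptions made for illustration; they are not taken from Bruns & Stieglitz (2013).

```python
import re
from collections import Counter

# Hypothetical toy data: (user, timestamp "YYYY-MM-DD HH:MM", text).
tweets = [
    ("anna", "2016-09-19 20:00", "Debate starts now"),
    ("bob", "2016-09-19 20:00", "RT @anna: Debate starts now"),
    ("anna", "2016-09-19 20:01", "@carl what do you think?"),
    ("carl", "2016-09-19 20:01", "RT @anna: Debate starts now"),
    ("bob", "2016-09-19 20:02", "Interesting point about taxes"),
]

MENTION = re.compile(r"@(\w+)")


def is_retweet(text):
    # Assumption: retweets are marked by a leading "RT @".
    return text.startswith("RT @")


# Number of tweets over time (here: per minute, keyed by the timestamp string).
per_minute = Counter(ts for _, ts, _ in tweets)

# Singleton/retweet ratio (assumed definition of "singleton", see lead-in).
retweets = [t for _, _, t in tweets if is_retweet(t)]
singletons = [
    t for _, _, t in tweets if not is_retweet(t) and not MENTION.search(t)
]
singleton_retweet_ratio = (
    len(singletons) / len(retweets) if retweets else float("inf")
)

# Distribution of the number of tweets per user:
# per_user maps user -> tweet count, activity_distribution maps count -> users.
per_user = Counter(user for user, _, _ in tweets)
activity_distribution = Counter(per_user.values())

# Interaction network as a weighted edge list: (sender, mentioned user) -> count.
edges = Counter(
    (user, mentioned.lower())
    for user, _, text in tweets
    for mentioned in MENTION.findall(text)
)

print(per_minute)
print(singleton_retweet_ratio)
print(activity_distribution)
print(edges)
```

None of these measures looks at what the tweets say beyond the retweet marker and the @mentions, which is exactly the point of a structural analysis.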

Content

Analyzing Twitter data

Analyzing the content

• Sentiment analysis
• Word frequencies
• regexp searches
• Word co-occurrences (⇒ topics, frames)
  • co-occurrence networks
  • PCA
  • LDA
  • . . .

⇒ Focus on what is said
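As a rough illustration of three of the content-oriented techniques named above (word frequencies, regexp searches, word co-occurrences), here is a minimal Python sketch. The example texts, the tiny stopword list, and the choice to count co-occurrence at the level of a single tweet are assumptions for illustration only.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical example texts.
texts = [
    "Taxes are too high, says the minister",
    "The minister defends higher taxes",
    "Immigration debate heats up",
    "Minister silent on immigration",
]

# Assumption: a tiny hand-made stopword list, just for this toy example.
STOPWORDS = {"the", "are", "too", "on", "up", "says"}


def tokenize(text):
    # Lowercase, keep alphabetic tokens, drop stopwords.
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


docs = [tokenize(t) for t in texts]

# Word frequencies across all texts.
word_freq = Counter(w for doc in docs for w in doc)

# Regexp search: texts that mention "tax" or "taxes" as a whole word.
tax_hits = [t for t in texts if re.search(r"\btax(es)?\b", t, re.IGNORECASE)]

# Word co-occurrences: how often two words appear in the same text.
cooccurrence = Counter()
for doc in docs:
    for a, b in combinations(sorted(set(doc)), 2):
        cooccurrence[(a, b)] += 1

print(word_freq.most_common(3))
print(tax_hits)
print(cooccurrence[("minister", "taxes")])
```

The co-occurrence counts can serve as edge weights of a co-occurrence network, or as input to dimension-reduction techniques such as PCA.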

Content

Systematizing analytical approaches

⇒ It depends on your research question which approach is more interesting!

But probably the most interesting thing is to combine both

Short sidestep: Agenda setting and Framing

Agenda setting

Beyond simplistic stimulus-response models of media effects:

Media effects are not so much about how we think, but what we think about

McCombs, M., & Shaw, D. (1972). The agenda-setting function of mass media. Public Opinion Quarterly, 36, 176. doi:10.1086/267990

Framing

“To frame is to select some aspects of a perceived reality and make them more salient in a communicating text, in such a way as to promote a particular problem definition, causal interpretation, moral evaluation, and/or treatment recommendation for the item described”

Entman, R. M. (1993). Framing: Toward clarification of a fractured paradigm. Journal of Communication, 43, 51–58. doi:10.1111/j.1460-2466.1993.tb01304.x

Studies that analyze structure of the Twittersphere

The Twittersphere

Mapping the Austrian Twittersphere

• How do politicians, journalists, and citizens interact?
• How do topics between news coverage and tweets overlap? (already content)

Ausserhofer, J., & Maireder, A. (2013). National Politics on Twitter. Information, Communication & Society, 16(3), 291–314. doi:10.1080/1369118X

“In general, famous journalists, experts and politicians are central actors within the Austrian political Twittersphere and form their own, dense and influential subnetwork within the broader sphere. Non-professionals may participate in this network, provided that they engage receptive members of the elite who act as ‘bridges’ between subnetworks. However, when the discussion involves certain topics, niche authorities emerge, and these authorities – including a few left-wing activists and bloggers – join other political professionals as central information hubs.”

Ausserhofer & Maireder 2013, p. 19

“While topics such as the financial crisis were massively represented in the newspapers and on TV, hardly anyone tweeted about such topics on Twitter. A similar phenomenon could be observed with the ongoing coverage of corruption-related investigations, about which only a few users bothered to tweet. Short-living topics such as the aforementioned ball of the right-wing fraternities and the squatting of an abandoned house and the forced eviction of its ‘residents’ were popular topics on Twitter. A further explanation of why these topics are more popular on Twitter than in mass media is that activists use the service not only to discuss but also to facilitate their activities.”

Ausserhofer & Maireder 2013, p. 19

Studies that analyze the content of the tweets

Issues

Networks of issues

• Which topics are co-mentioned by the same users?
• Which topics are co-mentioned by different types of accounts?

• SentiStrength in combination with Obama/Romney to determine who supports whom

• simple keyword searches (dictionary approach) for topic classification

• network analysis

Vargo, C. J., Guo, L., McCombs, M., & Shaw, D. L. (2014). Network Issue Agendas on Twitter During the 2012 U.S. Presidential Election. Journal of Communication, 64, 296–316. doi:10.1111/jcom.12089
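To make the combination of a dictionary approach and network analysis more concrete, here is a minimal Python sketch. The issue keywords, user IDs, and tweets are hypothetical, and the procedure (link two issues whenever the same user mentions both) is a simplified reading of "co-mentioned by the same users", not the exact method of Vargo et al. (2014).

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical issue dictionary: issue -> keywords that indicate it.
ISSUE_KEYWORDS = {
    "economy": {"jobs", "economy", "unemployment"},
    "taxes": {"tax", "taxes"},
    "healthcare": {"obamacare", "healthcare", "medicare"},
}

# Hypothetical toy data: (user, text).
tweets = [
    ("u1", "We need more jobs and lower taxes"),
    ("u1", "Obamacare is on the ballot"),
    ("u2", "The economy is improving"),
    ("u2", "Unemployment is down, taxes are fair"),
    ("u3", "Medicare matters to me"),
]


def issues_in(text):
    # Dictionary approach: an issue is present if any of its keywords occurs.
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {issue for issue, kws in ISSUE_KEYWORDS.items() if words & kws}


# Collect the set of issues each user mentions across all of their tweets.
issues_by_user = defaultdict(set)
for user, text in tweets:
    issues_by_user[user] |= issues_in(text)

# Issue network: edge weight = number of users who mention both issues.
issue_edges = Counter()
for issues in issues_by_user.values():
    for a, b in combinations(sorted(issues), 2):
        issue_edges[(a, b)] += 1

print(dict(issues_by_user))
print(issue_edges)
```

Building one such network per group of accounts (for example, supporters of each candidate, or different media types) makes it possible to compare the networks with each other.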

Issues

Networks of issues

• General-interest media issue network predicts issue network of Obama supporters

• Partisan media issue network predicts issue network of Romney supporters

Vargo, C. J., Guo, L., McCombs, M., & Shaw, D. L. (2014). Network Issue Agendas on Twitter During the 2012 U.S. Presidential Election. Journal of Communication, 64, 296–316. doi:10.1111/jcom.12089

Responses to TV debates

Second Screen

• Linking events to Twitter reactions
• Linking candidate behavior to Twitter reactions

Central question: How do people react to TV debates?

Vergeer, M., & Franses, P. H. (2015). Live audience responses to live televised election debates: Time series analysis of issue salience and party salience on audience behavior. Information, Communication & Society. doi:10.1080/1369118X.2015.1093526

Trilling, D. (2015). Two different debates? Investigating the relationship between a political debate on TV and simultaneous comments on Twitter. Social Science Computer Review, 33(3), 259–276. doi:10.1177/0894439314537886

Yıldırım, A., Üsküdarlı, S., & Özgür, A. (2016). Identifying Topics in Microblogs Using Wikipedia. Plos One, 11(3), e0151885. doi:10.1371/journal.pone.0151885

Example 1: relating word frequencies to each other

Trilling, D. (2015). Two different debates? Investigating the relationship between a political debate on TV and simultaneous comments on Twitter. Social Science Computer Review, 33(3), 259–276. doi:10.1177/0894439314537886

Responses to TV debates

A way of visualizing this

font size ∼ relative frequency within corpus
distance to y-axis ∼ log-likelihood (= difference between corpora)
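The two quantities behind such a plot can be sketched in Python. The function below uses a common two-corpus log-likelihood (G²) formulation based on observed versus expected frequencies; whether it matches the computation in Trilling (2015) in every detail is an assumption. The toy corpora and the sign convention (positive = overrepresented in the first corpus) are also illustrative choices.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """G2 for one word: a, b = its counts in corpus 1 and 2; c, d = corpus sizes."""
    e1 = c * (a + b) / (c + d)  # expected count in corpus 1
    e2 = d * (a + b) / (c + d)  # expected count in corpus 2
    ll = 0.0
    if a > 0:  # 0 * log(0) is treated as 0
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


# Hypothetical toy corpora: words spoken in the debate vs. words in tweets.
debate = "taxes jobs taxes pensions jobs taxes".split()
tweets = "jobs lol taxes lol lol pensions".split()

freq_debate, freq_tweets = Counter(debate), Counter(tweets)
n_debate, n_tweets = len(debate), len(tweets)

keyness = {}
relative_freq = {}
for word in set(debate) | set(tweets):
    a, b = freq_debate[word], freq_tweets[word]
    ll = log_likelihood(a, b, n_debate, n_tweets)
    # Signed value: positive if the word is relatively more frequent in the debate.
    keyness[word] = ll if a / n_debate >= b / n_tweets else -ll
    # Relative frequency in the combined corpus (would drive the font size).
    relative_freq[word] = (a + b) / (n_debate + n_tweets)

for word in sorted(keyness, key=keyness.get):
    print(f"{word:10s} keyness={keyness[word]:+.2f} rel_freq={relative_freq[word]:.2f}")
```

Words with a keyness near zero are used about equally often in both corpora; words far from zero are characteristic of one of them.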

Example 2: manually classify most frequent terms into topics, subsequent time series analysis

Vergeer, M., & Franses, P. H. (2015). Live audience responses to live televised election debates: Time series analysis of issue salience and party salience on audience behavior. Information, Communication & Society. doi:10.1080/1369118X.2015.1093526

Example 3: Using external data source (Wikipedia) for topic classification

Yıldırım, A., Üsküdarlı, S., & Özgür, A. (2016). Identifying Topics in Microblogs Using Wikipedia. Plos One, 11(3), e0151885. doi:10.1371/journal.pone.0151885

Incivility

Incivility

Who uses uncivil language on Twitter?

Incivility: (1) name-calling; (2) threats; (3) vulgarities; (4) abusive or foul language; (5) xenophobia; (6) hateful language, epithets, or slurs; (7) racist or bigoted sentiments; (8) disparaging comments on the basis of race/ethnicity; and (9) use of stereotypes

dictionary approach, based on existing word lists

Vargo, C. J., & Hopp, T. (2015). Socioeconomic status, social capital, and partisan polarity as predictors of political incivility on Twitter: A congressional district-level analysis. Social Science Computer Review. doi:10.1177/0894439315602858
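A district-level dictionary analysis of this kind could look roughly like the following Python sketch. The three-word list is a placeholder, not one of the existing word lists the study relied on, and the district labels and tweets are hypothetical; how tweets are assigned to districts is assumed to be given.

```python
import re
from collections import defaultdict

# Placeholder word list for illustration only.
UNCIVIL_TERMS = {"idiot", "liar", "moron"}

# Hypothetical toy data: (district, text).
tweets = [
    ("district_A", "Obama is a liar"),
    ("district_A", "Great speech tonight"),
    ("district_B", "Romney, you idiot"),
    ("district_B", "What a moron, what a liar"),
    ("district_B", "Looking forward to voting"),
]


def is_uncivil(text):
    # A tweet is flagged if at least one token is on the word list.
    tokens = re.findall(r"[a-z']+", text.lower())
    return any(token in UNCIVIL_TERMS for token in tokens)


totals = defaultdict(int)
flagged = defaultdict(int)
for district, text in tweets:
    totals[district] += 1
    flagged[district] += is_uncivil(text)  # True counts as 1

# Share of flagged tweets per district, usable as a district-level outcome.
uncivil_share = {d: flagged[d] / totals[d] for d in totals}
print(uncivil_share)
```

The resulting shares can then be related to district-level predictors such as socioeconomic status or partisan polarity.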

Incivility

Incivility

The central question: Do factors that are thought to be indicators of a functioning democratic discourse (like low polarization) translate to a civil discourse on social media?

Vargo, C. J., & Hopp, T. (2015). Socioeconomic status, social capital, and partisan polarity as predictors of political incivility on Twitter: A congressional district-level analysis. Social Science Computer Review. doi:10.1177/0894439315602858

“Our results suggested that uncivil discourse was highest in districts that were characterized, in part, by factors traditionally thought to be indicative of a healthy and diverse democracy (i.e., low levels of partisan polarity and high levels of racial diversity).”

“Notably, we failed to either fully or partially support a number of our hypotheses.”

Vargo & Hopp, 2015, p. 17

“A number of limitations temper the present findings. First, the nature of the data severely limits the generalizability of our findings. The source of data here, Twitter, is, at best, an instantaneous measure of behavior, not a durable measure of emotion or feelings (Vieweg, 2010). Moreover, Twitter cannot be reasonably understood to be a directly reliable proxy for public opinion in general. Also, the corpus here was limited to a specific event, the 2012 general election. The messages gathered in this analysis were also directed at a specific political candidate (e.g., Obama and/or Romney). While the findings still yield important conclusions toward discourse, democracy, and general elections, we cannot use the current results to make generalizations about the state of political discussion as a whole (either on or off of Twitter).”

Vargo & Hopp, 2015, p. 17

Remember:

This was just a tiny selection to give you some inspiration about what one can research.

There are a bunch of other interesting studies and approaches.

Further reading

Jungherr, A. (2016). Twitter use in election campaigns: A systematic literature review. Journal of Information Technology & Politics, 13(1), 72–91. doi:10.1080/19331681.2015.1132401

Questions?

[email protected]

@damian0604

www.damiantrilling.net
