Analysis of Social Media MLD 10-802, LTI 11-772 William Cohen 1-25-010

  • Slide 1
  • Analysis of Social Media MLD 10-802, LTI 11-772 William Cohen 1-25-010
  • Slide 2
  • Recap: What are we trying to do? Like the normal curve: fit real-world data; find an underlying process that explains the data; enable mathematical understanding (closed-form?); model some small but interesting part of the data.
  • Slide 3
  • Graphs. Some common properties of graphs: distribution of node degrees; distribution of cliques (e.g., triangles); distribution of paths: diameter (max shortest path), effective diameter (90th percentile); connected components. Some types of graphs to consider: real graphs (social & otherwise); generated graphs: Erdos-Renyi (Bernoulli or Poisson), Watts-Strogatz small-world graphs, Barabasi-Albert preferential attachment.
  • Slide 4
  • Graphs. Some types of graphs to consider: real graphs (social & otherwise); generated graphs: Erdos-Renyi (Bernoulli or Poisson), Watts-Strogatz small-world graphs, Barabasi-Albert preferential attachment. Erdos-Renyi: all pairs connected with probability p.
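A minimal sketch of the "all pairs connected with probability p" generator, in plain Python. The function name `erdos_renyi`, the adjacency-set representation, and the `seed` parameter are illustrative choices, not part of the slides.

```python
import itertools
import random


def erdos_renyi(n, p, seed=None):
    """Return an undirected G(n, p) graph as {node: set_of_neighbors}."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Visit every unordered pair once and link it with probability p.
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


g = erdos_renyi(200, 0.05, seed=0)
mean_degree = sum(len(nbrs) for nbrs in g.values()) / len(g)
print(mean_degree)  # should land near p * (n - 1) = 9.95
```

Each node's degree is a sum of n - 1 independent coin flips, which is where the Bernoulli (binomial) or, for large n and small p, Poisson description of the degree distribution comes from.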
  • Slide 5
  • Graphs. Some types of graphs to consider: real graphs (social & otherwise); generated graphs: Erdos-Renyi (Bernoulli or Poisson), Watts-Strogatz small-world graphs, Barabasi-Albert preferential attachment. Watts-Strogatz: regular, high-homophily lattice plus random shortcut links.
  • Slide 6
  • Graphs. Some types of graphs to consider: real graphs (social & otherwise); generated graphs: Erdos-Renyi (Bernoulli or Poisson), Watts-Strogatz small-world graphs, Barabasi-Albert preferential attachment. Barabasi-Albert: new nodes have m neighbors; high-degree nodes are preferred; rich get richer.
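The preferential-attachment rule (each new node picks m neighbors, favoring high-degree nodes) can be sketched as follows. This is an illustration under stated assumptions: the starting graph is taken to be a complete graph on m + 1 nodes, and the name `preferential_attachment` is hypothetical. The slides do not specify either.

```python
import random


def preferential_attachment(n, m, seed=None):
    """Grow an undirected graph to n nodes; requires n > m >= 1."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Each node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list picks a node with probability proportional to degree.
    endpoints = []
    # Assumed seed graph: a complete graph on the first m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            endpoints += [u, v]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # m distinct, degree-biased targets
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj


g = preferential_attachment(1000, 3, seed=0)
degrees = sorted((len(nbrs) for nbrs in g.values()), reverse=True)
print(degrees[:5], degrees[-5:])  # a few hubs versus many degree-m nodes
```

Early nodes accumulate entries in `endpoints` and so keep getting chosen, which is the "rich get richer" effect.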
  • Slide 7
  • Graphs. Some common properties of graphs: distribution of node degrees; distribution of cliques (e.g., triangles); distribution of paths: diameter (max shortest path), effective diameter (90th percentile); connected components. Some types of graphs to consider: real graphs (social & otherwise); generated graphs: Erdos-Renyi (Bernoulli or Poisson), Watts-Strogatz small-world graphs, Barabasi-Albert preferential attachment.
  • Slide 8
  • Graphs. Some common properties of graphs: distribution of node degrees; distribution of cliques (e.g., triangles); distribution of paths: diameter (max shortest path), effective diameter (90th percentile); connected components.
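Several of these properties can be computed with breadth-first search on a small graph. The sketch below assumes an undirected, unweighted graph stored as `{node: set_of_neighbors}`; the function names are illustrative, and the 90th percentile uses the nearest-rank convention, which is one choice among several.

```python
import math
from collections import Counter, deque


def bfs_distances(adj, source):
    """Shortest-path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_distribution(adj):
    """Map each degree to the number of nodes having it."""
    return Counter(len(nbrs) for nbrs in adj.values())


def connected_components(adj):
    seen, components = set(), []
    for v in adj:
        if v not in seen:
            comp = set(bfs_distances(adj, v))
            seen |= comp
            components.append(comp)
    return components


def path_length_stats(adj):
    """(diameter, 90th-percentile length, mean length) over connected pairs.

    Assumes the graph has at least one edge.
    """
    lengths = []
    for s in adj:
        lengths.extend(d for t, d in bfs_distances(adj, s).items() if t != s)
    lengths.sort()
    effective = lengths[math.ceil(0.9 * len(lengths)) - 1]  # nearest rank
    return lengths[-1], effective, sum(lengths) / len(lengths)


# A 4-node path plus one isolated node.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}, 4: set()}
print(degree_distribution(adj))
print(len(connected_components(adj)))
print(path_length_stats(adj))
```

Running all-pairs BFS costs on the order of (nodes x edges), so for large graphs the path statistics are usually estimated from a sample of source nodes.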
  • Slide 9
  • Graphs. Some common properties of graphs: distribution of node degrees; distribution of cliques (e.g., triangles); distribution of paths: diameter (max shortest path), effective diameter (90th percentile); connected components. In a big Erdos-Renyi graph this is very small (1/n). In social graphs, not so much. More later.
  • Slide 10
  • Graphs. Some common properties of graphs: distribution of node degrees; distribution of cliques (e.g., triangles); distribution of paths: diameter (max shortest path), effective diameter (90th percentile), mean diameter; connected components. In a big Erdos-Renyi graph this is small (log n / log z). In social graphs, it is also small (6 degrees).
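As a quick worked example of the log n / log z estimate: the slide does not define z, so the sketch below assumes z is the mean node degree (a common notation), and the function name is hypothetical.

```python
import math


def approx_mean_path_length(n, z):
    """Estimate log(n) / log(z) for n nodes with assumed mean degree z > 1."""
    return math.log(n) / math.log(z)


# One million nodes with about ten neighbors each gives roughly six steps.
print(approx_mean_path_length(10**6, 10))
# Doubling n barely moves the estimate, since it grows only logarithmically.
print(approx_mean_path_length(2 * 10**6, 10))
```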
  • Slide 11
  • Graphs. Some common properties of graphs: distribution of node degrees; distribution of cliques (e.g., triangles); distribution of paths: diameter (max shortest path), effective diameter (90th percentile), mean diameter; connected components. In a big Erdos-Renyi graph there is one giant connected component, because two giant connected components cannot co-exist for long.
  • Slide 12
  • Slide 13
  • n/a Poor fit
  • Slide 14
  • More terms. Centrality and betweenness: how does your position in a network affect what you do and how you do it? And how can we define these precisely? High centrality: ringleaders? High betweenness: go-between, conduit between different groups? Structural hole. Group cohesiveness: number of edges within a (sub)group.
  • Slide 15
  • More terms
  • Slide 16
  • Slide 17
  • Association network: bipartite network where nodes are people or organizations
  • Slide 18
  • A larger association network
  • Slide 19
  • Triads and clustering coefficients. In a random Erdos-Renyi graph: In natural graphs two of your mutual friends might well be friends: like you, they are both in the same class (club, field of CS, ...); you introduced them.
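One common way to quantify "two of your friends are friends" is the local clustering coefficient: the fraction of pairs of a node's neighbors that are themselves linked. The sketch below uses that definition as an assumption, since the slide's own formula did not survive; the function names are illustrative.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are directly connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention: no pairs of neighbors, so no triads
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)


def average_clustering(adj):
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


# A triangle 0-1-2 with one extra node 3 hanging off node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print([local_clustering(adj, v) for v in adj])
print(average_clustering(adj))
```

In an Erdos-Renyi graph any two neighbors of a node are linked with the same probability p as any other pair, so this coefficient stays near p; social graphs typically score far higher.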
  • Slide 20
  • Watts-Strogatz model. Start with a ring. Connect each node to k nearest neighbors (homophily). Add some random shortcuts from one point to another (small diameter). Degree distribution not scale-free. Generalizes to d dimensions.
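The construction steps on this slide can be sketched directly. Note the assumption: this version adds shortcut edges on top of the ring, as the slide describes, rather than rewiring existing edges as some formulations of the model do. The name `ring_with_shortcuts` and its parameters are hypothetical.

```python
import random


def ring_with_shortcuts(n, k, n_shortcuts, seed=None):
    """Ring lattice (k even, k < n) plus n_shortcuts random extra edges.

    n_shortcuts must not exceed the number of missing edges, or the
    shortcut loop below will never finish.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Link each node to its k nearest ring neighbors (k // 2 on each side).
    for v in range(n):
        for offset in range(1, k // 2 + 1):
            u = (v + offset) % n
            adj[v].add(u)
            adj[u].add(v)
    # Add shortcuts between random pairs that are not yet connected.
    added = 0
    while added < n_shortcuts:
        u, v = rng.sample(range(n), 2)
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            added += 1
    return adj


g = ring_with_shortcuts(100, 4, 10, seed=0)
print(sorted({len(nbrs) for nbrs in g.values()}))  # degrees stay close to k
```

Because every node starts with exactly k neighbors and only a few gain shortcut edges, the degrees cluster tightly around k, which is why the degree distribution is not scale-free.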
  • Slide 21
  • Slide 22
  • Even more terms. Homophily: tendency for connected nodes to have similar properties. Social contagion: connected nodes become similar over time. Associative sorting: similar nodes tend to connect. Disassociative sorting: vice-versa. Association network: bipartite network where nodes are people or organizations.
  • Slide 23
  • A big question. Homophily: similar nodes ~= connected nodes. Which is cause and which is effect? Do birds of a feather flock together? Do you change your behavior based on the behavior of your peers? Do both happen in different graphs? Can there be a combination of associative sorting and social contagion in the same graph?
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • A big question about homophily. Which is cause and which is effect? Do birds of a feather flock together? Do you change your behavior based on the behavior of your peers? How can you tell? Look at when links are added and see what patterns emerge (triadic closure): Pr(new link between u and v | #common friends).
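Given two snapshots of the same network, the quantity Pr(new link between u and v | #common friends) can be estimated by counting. This is a rough sketch under assumptions: both snapshots are `{node: set_of_neighbors}` dictionaries over the same nodes, and the function name is made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def closure_rate_by_common_friends(before, after):
    """For each count k of common friends, the fraction of pairs that were
    unlinked in `before` and became linked in `after`."""
    pairs = defaultdict(int)   # unlinked pairs seen with k common friends
    closed = defaultdict(int)  # how many of those pairs gained a link
    for u, v in combinations(sorted(before), 2):
        if v in before[u]:
            continue  # already linked, so not a candidate for a new link
        k = len(before[u] & before[v])
        pairs[k] += 1
        if v in after[u]:
            closed[k] += 1
    return {k: closed[k] / pairs[k] for k in pairs}


before = {0: {1, 2}, 1: {0}, 2: {0}, 3: set()}
after = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: set()}
print(closure_rate_by_common_friends(before, after))
```

On real data the interesting question is how quickly this rate rises with k compared with a simple independence baseline.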
  • Slide 29
  • Slide 30
  • Triadic closure: $T(k) = 1 - (1-p)^{k}$; $T(k) = 1 - (1-p)^{k-1}$
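Reading the two curves as T(k) = 1 - (1-p)^k and T(k) = 1 - (1-p)^(k-1), one interpretation (an assumption here, since the slide does not spell it out) is that each common friend independently gives a pair a chance p of forming a link, so (1-p)^k is the chance that none of the k friends does. The function names below are hypothetical.

```python
def t_baseline(k, p):
    """Chance that at least one of k independent tries, each succeeding
    with probability p, produces a link."""
    return 1 - (1 - p) ** k


def t_shifted(k, p):
    """Same curve with one fewer effective try (k - 1 instead of k)."""
    return 1 - (1 - p) ** (k - 1)


p = 0.01  # illustrative value, not taken from the slides
for k in (1, 2, 5, 10):
    print(k, t_baseline(k, p), t_shifted(k, p))
```

For small p both curves grow almost linearly in k (roughly p * k), so a steeper empirical curve suggests that common friends reinforce one another rather than acting independently.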
  • Slide 31
  • Changing behavior
  • Slide 32
  • Final example: spatial segregation. How picky do people have to be about their neighbors for homophily to arise? Imagine a grid world where: agents are red or blue; agents move to a random location if they are unhappy; agents are happy unless