Social Networks Analytics
Hubert Lo Prateek Maitra Aaron Strahl
Wikipedia Vote Network
Outline
- Introduction
- Wikipedia Request for Adminship
- Is the RfA process fair?
- Application Techniques
- Descriptive Statistics
- Distributions, Betweenness, Clustering
- Graph Partitioning
- Key Takeaways
- Conclusion
Background
About RfA and its process:
Nomination
Notice of RfA
Expressing Opinions
Discussion, decision, and closing procedures
Research Question
We were interested in analyzing the directed graph of relationships between Wikipedia administrators and average users in a Wikipedia voting dataset.
Are the procedures in place fair or not?
Application Techniques
Techniques:
Descriptive Statistics and Interpretation
Graph Partitioning/Visuals
Filtering the network by increasing degree - Gephi
Network Degree Distribution
Pattern of Random or Preferential Attachment?
Descriptive Statistics
Edge count: 103,689
Vertex count: 7,115
Strongly connected: False
Weakly connected: False
Global clustering: 0.1254791
Reciprocity: 0.0564
Average path length: 3.34
Diameter: 10
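As a minimal sketch of how two of these statistics can be computed, the snippet below measures reciprocity and weak connectivity on a small directed edge list. The toy `edges` list and the function names are hypothetical, not the Wikipedia data or the tooling used for the slides, and reciprocity is assumed here to mean the fraction of directed edges whose reverse edge also exists.

```python
from collections import defaultdict, deque

# Hypothetical toy vote list (voter, candidate); not the Wikipedia dataset.
edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"), ("e", "d")]


def reciprocity(edges):
    """Fraction of directed edges whose reverse edge also exists."""
    edge_set = set(edges)
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


def is_weakly_connected(edges):
    """True if all nodes are mutually reachable once edge direction is ignored."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in adj[node] - seen:
            seen.add(nb)
            queue.append(nb)
    return len(seen) == len(adj)


print(reciprocity(edges))          # 2 of the 5 edges are reciprocated
print(is_weakly_connected(edges))
```

A weakly connected result of False, as reported above, means that at least one group of users has no vote linking it to the rest of the network in either direction.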
Degree Distribution
• The long-tail distribution is very evident
• Nodes with 0 to 100 degrees account for about 85% of all the nodes in the dataset
Degree Distribution
• Few hubs with a large number of links.
• Many nodes with a small number of links.
Log-Log Plot
• The quantity being measured can be viewed as a type of popularity
• Rich-get-richer phenomenon
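The degree distribution and its log-log view can be sketched as follows. This is an illustration on a hypothetical hub-and-spoke toy graph, not the vote data, and the least-squares slope on log-log axes is only a rough diagnostic for a long tail, not a rigorous power-law fit.

```python
import math
from collections import Counter


def degree_counts(edges):
    """Map each total degree (in + out) to the number of nodes with that degree."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def loglog_slope(counts):
    """Least-squares slope of log(node count) against log(degree)."""
    xs = [math.log(k) for k in counts]
    ys = [math.log(c) for c in counts.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Hypothetical toy graph: one hub and many low-degree nodes.
edges = [("hub", f"n{i}") for i in range(8)] + [("n0", "n1")]
counts = degree_counts(edges)
print(dict(counts))
print(loglog_slope(counts))  # negative: high degrees are rare
```

A clearly negative slope on the log-log plot is consistent with a few hubs and many low-degree nodes, which is the pattern the slides attribute to preferential attachment.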
Average Betweenness and Degree
• Degree centrality and node betweenness appear to be close to linearly related
• Nodes with a higher degree have higher betweenness scores
Average Clustering and Degree
• Local clustering appears to decrease sharply as degree centrality increases, resembling a power-law decay
• At moderate levels of degree centrality, clustering levels are still high
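The clustering-versus-degree comparison can be sketched as below. This is a minimal illustration that assumes edge direction is ignored when counting triangles; the helper names and the toy edge list are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def undirected_adjacency(edges):
    """Build neighbor sets, ignoring edge direction and self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def local_clustering(adj):
    """Share of each node's neighbor pairs that are themselves linked."""
    coeffs = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs[node] = 0.0
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        coeffs[node] = 2 * links / (k * (k - 1))
    return coeffs


def mean_clustering_by_degree(adj):
    """Average the local clustering coefficient over nodes of equal degree."""
    coeffs = local_clustering(adj)
    buckets = defaultdict(list)
    for node, c in coeffs.items():
        buckets[len(adj[node])].append(c)
    return {k: sum(v) / len(v) for k, v in buckets.items()}


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to a.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")]
by_degree = mean_clustering_by_degree(undirected_adjacency(edges))
print(by_degree)
```

In the toy graph the degree-3 node has lower clustering than the degree-2 nodes, a small-scale version of the decreasing trend described above.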
Average Constraint and Degree
• Average constraint (embeddedness) and degree centrality have a negative linear relationship.
• The majority of users have a relatively low level of constraint.
Average Neighbor Degree and Degree Plot
• Low-degree users show a wide spread; their neighbors have a higher average degree.
• As degree increases, their neighbors have comparatively lower degree.
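The average neighbor degree used in this plot can be sketched as follows, assuming edge direction is ignored. The star-shaped toy graph and the function names are hypothetical and only illustrate the pattern of low-degree nodes attached to a high-degree hub.

```python
from collections import defaultdict


def undirected_adjacency(edges):
    """Build neighbor sets, ignoring edge direction."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def average_neighbor_degree(adj):
    """For each node, the mean degree of its neighbors."""
    return {
        node: sum(len(adj[nb]) for nb in nbrs) / len(nbrs)
        for node, nbrs in adj.items()
        if nbrs
    }


# Hypothetical star graph: four low-degree voters all linked to one hub.
edges = [("v1", "hub"), ("v2", "hub"), ("v3", "hub"), ("v4", "hub")]
avg_nbr = average_neighbor_degree(undirected_adjacency(edges))
print(avg_nbr["v1"])   # 4.0: the leaf's only neighbor is the hub
print(avg_nbr["hub"])  # 1.0: the hub's neighbors are all leaves
```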
Application Techniques - Partitioning
Challenge: how to partition the graph? We have a very dense network with a lot of edges.
Nodes: 7,066
Edges: 103,663
Graph Networks - Partitioning
We increased the minimum degree step by step to see how the network evolved.
Degree range: 2 to 1,167.
Nodes: 4,797 (67.42%)
Edges: 101,394 (97.97%)
Graph Networks - Increasing Degree
Graph Networks - Partitioning
Degree range: 160 to 1,167.
Nodes: 262 (3.68%)
Edges: 9,959 (9.60%)
Graph Networks - Partitioning
Degree range: 260 to 1,167.
Nodes: 92 (1.29%)
Edges: 2,098 (2.02%)
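The degree-range filtering in the partitioning slides can be approximated with the sketch below. It assumes a single pass that keeps nodes whose total degree in the full graph meets a minimum and then keeps only edges between surviving nodes; the exact behavior of the Gephi filter may differ, and the function name and toy data are hypothetical.

```python
from collections import Counter


def filter_by_min_degree(edges, min_degree):
    """Keep nodes with total degree >= min_degree and the edges among them."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    kept_nodes = {n for n, d in degree.items() if d >= min_degree}
    kept_edges = [(u, v) for u, v in edges if u in kept_nodes and v in kept_nodes]
    node_share = len(kept_nodes) / len(degree)
    edge_share = len(kept_edges) / len(edges)
    return kept_nodes, kept_edges, node_share, edge_share


# Hypothetical toy graph: a dense core (a, b, c) plus two peripheral nodes.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d"), ("e", "a")]
nodes, kept, node_share, edge_share = filter_by_min_degree(edges, 2)
print(sorted(nodes), node_share, edge_share)
```

Raising `min_degree` step by step and recording the two shares reproduces the kind of node and edge percentages listed on the partitioning slides.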
Core Component• Majority of these nodes have very high betweenness scores.
• Majority of these nodes have high eigenvector centrality.
• They belong to the strongly connected component id:1016.
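Membership in a strongly connected component, as mentioned for the core nodes, can be checked with a sketch like the one below. It uses Kosaraju's algorithm as one standard choice; the slides do not say which algorithm or tool produced the component ids, and the toy edge list is hypothetical.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Kosaraju's algorithm; returns a list of node sets."""
    graph, reverse = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in edges:
        graph[u].append(v)
        reverse[v].append(u)
        nodes.update((u, v))

    # First pass: record nodes in order of DFS completion (iterative DFS).
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbors = stack[-1]
            for nxt in neighbors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()

    # Second pass: explore the reversed graph in reverse completion order.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    component.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


# Hypothetical toy votes: a, b, c vote in a cycle; d only receives a vote.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
components = strongly_connected_components(edges)
print(sorted(sorted(c) for c in components))
```

Nodes that can all reach one another by following vote direction end up in the same component, which is the sense in which the core nodes above share a single component id.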
Key Takeaways
- The RfA vote network for adding new administrators is neither weakly nor strongly connected
- Network structure is directed toward a dense, central core with a lot of nodes around the periphery
- Rich-get-richer/preferential attachment model characteristics are exhibited
- Although every vote counts the same, an administrator's vote has the potential to bring many more votes along with it
- Graph partitioning allows us to view the core clearly
So is it Fair?
- Ultimately, we determined that the Wikipedia RfA process is fair but highly flawed, with underlying nuances
- Although a new user's vote and an administrator's vote technically carry the same weight, administrators leverage the power of their personal network
- As a result, current administrators retain control over the network as a whole and decide who gets to become an administrator
Questions?