Introduction to Networks and
Business Intelligence
Prof. Dr. Daning Hu
Department of Informatics
University of Zurich
Sep 22nd, 2016
Outline
Network Science
A “Random” History
Network Analysis
Network Topological Analysis: Random, Scale-Free, and Small-world
Networks
Node level analysis
Link Analysis
Network Visualization
Network-based Business Intelligence Application
* Some of the contents are adapted from Prof. James Moody’s slides at Duke University, and from Prof.
Jure Leskovec and Lada Adamic from Stanford University
Networks are Everywhere
Networks and Complex Systems
Behind various complex systems there are intricate wiring
networks, which define the interactions and exchanges among
the nodes/individuals/entities.
Society
Brain
Communities
Organizations
We need to understand the networks first in order to really
understand how these complex systems work.
Network Science
Network science is an interdisciplinary academic field which
studies complex networks such as information networks,
biological networks, cognitive and semantic networks, and
social networks. It draws on theories and methods including:
Graph theory from mathematics, e.g., Small-world
Statistical mechanics from physics, e.g., Rich get richer
Data mining and information visualization from computer science
Inferential modeling from statistics, e.g., Collaborative filtering
Social structure from sociology, e.g., weak tie, structural holes
Network science can be defined as "the study of network
representations of physical, biological, and social phenomena leading
to predictive models of these phenomena."
A “Random” History: Math, Psychology, Sociology…
The study of networks has emerged in diverse disciplines as a
means of analyzing complex relational data.
Network science has its roots in Graph Theory,
a branch of mathematics that studies the properties of pairwise relations in a
network structure.
Seven Bridges of Königsberg, written by Leonhard Euler in 1736.
Key terms: Vertices (Nodes) and Edges (Links)
Social Network Analysis
Jacob Moreno, a psychologist, developed the Sociogram to “precisely
describe the interpersonal structure of a group”.
Moreno’s experiment is the first to use Social Network Analysis and was
published in the New York Times (April 3, 1933, page 17).
Stanley Milgram (Small World Experiment: Six Degrees of Separation,
1960s). Facebook: 5.28 steps in 2008, 4.74 in 2011.
We Live in a Connected World
“To speak of social life is to speak of the association between
people – their associating in work and in play, in love and in war, to
trade or to worship, to help or to hinder. It is in the social relations
men establish that their interests find expression and their desires
become realized.”
Peter M. Blau, Exchange and Power in Social Life, 1964
"If we ever get to the point of charting a whole city or a whole nation,
we would have … a picture of a vast solar system of intangible
structures, powerfully influencing conduct, as gravitation does in
space. Such an invisible structure underlies society and has its
influence in determining the conduct of society as a whole."
J.L. Moreno, New York Times, April 3, 1933
“For the last thirty years, empirical social research has been
dominated by the sample survey. But as usually practiced, …,
the survey is a sociological meat grinder, tearing the individual
from his social context and guaranteeing that nobody in the
study interacts with anyone else in it.”
Allen Barton, 1968 (Quoted in Freeman 2004)
Moreover, the complexity of the relational world makes it
impossible to identify social connectivity using only our intuition.
Social Network Analysis (SNA) provides a set of tools to
empirically extend our theoretical intuition of the patterns that
compose social structure.
The Origin of Modern Network Science:
Social Network Analysis
Social network analysis (SNA) is a set of relational methods for
systematically understanding and identifying
connections/ties/relationships among actors.
Social network analysis (SNA)
is motivated by a structural intuition based on ties linking social actors
is grounded in systematic empirical data
draws heavily on graphic imagery
relies on the use of mathematical and/or computational models.
What Does Social Network Analysis Study?
Social network analysis lets us answer questions about social interdependence. These
include:
“Networks as Variables” approaches
• Are kids with smoking peers more likely to smoke themselves?
• Do unpopular kids get in more trouble than popular kids?
• Do central actors control resources?
“Networks as Structures” approaches
• What generates hierarchy in social relations?
• What network patterns spread diseases most quickly?
• How do role sets evolve out of consistent relational activity?
We don’t want to draw this line too sharply: emergent role positions can affect
individual outcomes in a ‘variable’ way, and variable approaches constrain
relational activity.
Now…
Complex Networks in the Real World
Network                        Nodes                Links
Social network                 People               Friendship, kinship, collaboration
Inter-organizational network   Companies            Strategic alliance, buyer-seller relation, joint venture
Citation network               Documents/authors    Citations
Internet                       Routers/computers    Wire, cable
WWW                            Web pages            Hyperlinks
Biochemical network            Genes/proteins       Regulatory effect
…                              …                    …
Examples of Real-World Complex Networks
A collaboration network of physicists (size < 1K). Source: Newman & Girvan, 2004
The Internet (size > 150K). Source: Lumeta Corp., The Internet Mapping Project
Now…
Universal modeling and analysis methods for complex network
data
Shared vocabulary between fields: Computer Science, Physics,
Sociology, Economics, Statistics, Biology
“Big” Data availability: Internet, mobile, bio, health, security…
Impact/usage: social networking, social media, marketing, etc.
Network Representations
Social network data consist of two linked classes of data:
Nodes: Information on the individuals (actors, nodes, points, vertices)
Network nodes are most often people, but can be any other unit capable of
being linked to another (schools, countries, organizations, etc.)
The information about nodes is what we usually collect in standard social
science research: demographics, attitudes, behaviors, etc.
Edges: Information on the relations among the individuals (edges, links, ties, relations, …)
Graph theory notation: G(V, E), where V is the set of nodes and E is the set of edges
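As a rough illustration of the two classes of data, the sketch below stores node attributes and ties separately, in the spirit of G(V, E). The node names, attribute values, and the helper `neighbors` are hypothetical and chosen only for this example.

```python
# Nodes (V): one record of attributes per actor. Values are made-up illustration data.
nodes = {
    "a": {"age": 34, "role": "manager"},
    "b": {"age": 29, "role": "analyst"},
    "c": {"age": 41, "role": "engineer"},
}

# Edges (E): undirected ties between pairs of actors.
edges = {("a", "b"), ("b", "c")}


def neighbors(node, edge_set):
    """Return the set of nodes tied to `node` in an undirected edge set."""
    result = set()
    for u, v in edge_set:
        if u == node:
            result.add(v)
        elif v == node:
            result.add(u)
    return result


print(sorted(neighbors("b", edges)))  # ['a', 'c']
```

Keeping attributes and ties in separate structures mirrors how the data are usually collected: node attributes come from a standard survey, while ties come from relational questions.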
Network Representations
In general, a relation can be: (1) Binary or Valued (2) Directed or
Undirected
[Figure: four example graphs on nodes a, b, c, d, e. Panels: Undirected, binary; Directed, binary; Undirected, valued; Directed, valued (edge weights shown on the valued graphs)]
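The four relation types (binary or valued, directed or undirected) can all be stored in an adjacency matrix. The following is a minimal sketch: the ties and weights are hypothetical, not the ones drawn on the slide, and `adjacency` is an illustrative helper name.

```python
labels = ["a", "b", "c", "d", "e"]
index = {name: i for i, name in enumerate(labels)}


def adjacency(ties, directed=False):
    """Build an adjacency matrix from (source, target, value) triples."""
    n = len(labels)
    matrix = [[0] * n for _ in range(n)]
    for source, target, value in ties:
        i, j = index[source], index[target]
        matrix[i][j] = value
        if not directed:
            matrix[j][i] = value  # undirected ties are symmetric
    return matrix


# Hypothetical ties: binary relations use value 1, valued relations carry a weight.
binary_ties = [("a", "b", 1), ("b", "c", 1), ("c", "d", 1), ("c", "e", 1)]
valued_ties = [("a", "b", 1), ("b", "c", 3), ("c", "d", 4), ("c", "e", 2)]

undirected_binary = adjacency(binary_ties)
directed_binary = adjacency(binary_ties, directed=True)
undirected_valued = adjacency(valued_ties)
directed_valued = adjacency(valued_ties, directed=True)
```

An undirected relation gives a symmetric matrix, while a directed one generally does not; a valued relation simply replaces the 1s with weights.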
Network Analysis: Topology/Structural Analysis
Network Topology Analysis provides an analytic understanding
of social structures and supplements individualistic methods.
Network topological measures include:
Size,
Density,
Average Degree,
Average Path Length: on average, the number of steps it takes to get
from one member of the network to another.
Diameter
Clustering Coefficient: a measure of an "all-my-friends-know-each-other" property; small-world feature
$CC(i) = \frac{2 E_i}{k_i (k_i - 1)}$
$k_i = C_d(i)$ = number of neighbors of node i
$E_i$ = number of links that actually exist among the $k_i$ neighbors
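The measures listed above can be computed directly from an adjacency structure. The sketch below is one possible implementation for a small undirected, connected graph; the example graph and all function names are assumptions made for illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in steps) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def topology_summary(adj):
    """Size, density, average degree, average path length, and diameter.

    Assumes an undirected, connected graph with at least two nodes.
    """
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # each link is stored twice
    lengths = []
    for s in adj:
        dist = bfs_distances(adj, s)
        lengths.extend(dist[t] for t in adj if t != s)
    return {
        "size": n,
        "density": 2 * m / (n * (n - 1)),
        "average_degree": 2 * m / n,
        "average_path_length": sum(lengths) / len(lengths),
        "diameter": max(lengths),
    }


def clustering_coefficient(adj, i):
    """CC(i) = 2 * E_i / (k_i * (k_i - 1)); defined as 0 when k_i < 2."""
    nbrs = list(adj[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    e = sum(1 for x in range(k) for y in range(x + 1, k) if nbrs[y] in adj[nbrs[x]])
    return 2 * e / (k * (k - 1))


# Hypothetical example: a triangle a-b-c with a pendant node d attached to c.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
summary = topology_summary(adj)
```

In this example node a has CC = 1 because its two neighbors are linked, while node c has CC = 1/3 because only one of the three possible links among its neighbors exists.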
Topology Analysis: Three Topology Models
Random Network
Erdős–Rényi Random Graph model
used for generating random graphs in which every pair of nodes is connected
independently with the same probability
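A minimal sketch of this idea, assuming the G(n, p) variant of the model in which each possible edge is included independently with probability p. The function name and parameters are illustrative.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Generate an undirected G(n, p) random graph as an adjacency dict."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:  # every pair gets the same chance of a link
            adj[u].add(v)
            adj[v].add(u)
    return adj


graph = erdos_renyi(n=100, p=0.05, seed=42)
average_degree = sum(len(nbrs) for nbrs in graph.values()) / len(graph)
```

With n = 100 and p = 0.05 the expected average degree is p(n - 1) = 4.95; a single sample will fluctuate around that value.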
Small-World Network
Watts-Strogatz Small World Model
used for generating graphs with small-world properties:
large clustering coefficients and short average path lengths
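One common way to describe the Watts-Strogatz construction is: start from a ring lattice in which every node is linked to its k nearest neighbors, then rewire each link with probability beta. The sketch below follows that description under stated assumptions (k even, n > k); the function name and parameter names are illustrative.

```python
import random


def watts_strogatz(n, k, beta, seed=None):
    """Ring lattice with k nearest neighbors per node, rewired with probability beta.

    Assumes k is even and n > k. Rewiring keeps the total number of links fixed.
    """
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    half = k // 2
    # Step 1: regular ring lattice (high clustering, long paths).
    for i in range(n):
        for offset in range(1, half + 1):
            j = (i + offset) % n
            adj[i].add(j)
            adj[j].add(i)
    # Step 2: rewire the far end of each lattice link with probability beta.
    for offset in range(1, half + 1):
        for i in range(n):
            j = (i + offset) % n
            if j in adj[i] and rng.random() < beta:
                candidates = [w for w in range(n) if w != i and w not in adj[i]]
                if candidates:
                    w = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(w)
                    adj[w].add(i)
    return adj


lattice = watts_strogatz(n=20, k=4, beta=0.0, seed=1)      # no rewiring
small_world = watts_strogatz(n=20, k=4, beta=0.1, seed=1)  # a few shortcuts
```

With beta = 0 the result is the regular lattice; a small beta introduces a few long-range shortcuts, which is what shortens paths while most of the local clustering remains.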
Scale-Free Network
Barabási–Albert (BA) Preferential Attachment Model
A network model used to demonstrate a preferential attachment or "rich-get-richer" effect:
a new edge is most likely to attach to nodes with higher degrees.
Power-law degree distribution
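Preferential attachment can be sketched by growing the network one node at a time and choosing attachment targets with probability proportional to their current degree. The version below is a simplified illustration, not a reference implementation: the seed graph (a small complete graph on m + 1 nodes) and all names are assumptions.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; each new node attaches to m existing nodes.

    Assumes 1 <= m < n. Targets are drawn in proportion to current degree.
    """
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    degree_pool = []  # each node appears once per link it has
    # Seed graph (an assumption): complete graph on the first m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            degree_pool += [u, v]
    # Growth: high-degree nodes occupy more slots in the pool, so they are
    # picked more often ("rich get richer").
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(degree_pool))
        for target in chosen:
            adj[new].add(target)
            adj[target].add(new)
            degree_pool += [new, target]
    return adj


graph = barabasi_albert(n=200, m=2, seed=7)
degrees = sorted((len(nbrs) for nbrs in graph.values()), reverse=True)
```

Sorting the degrees typically shows a few hubs with many links and a long tail of nodes with degree close to m, which is the qualitative signature of a power-law degree distribution.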
Network Analysis: Topology Analysis
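Topology, Average Path Length (L), Clustering Coefficient (CC), and Degree Distribution (P(k)):
Random Graph
L: $L_{rand} \sim \ln N / \ln \langle k \rangle$
CC: $CC_{rand} \approx \langle k \rangle / N$
P(k): Poisson distribution, $P(k) \approx e^{-\langle k \rangle} \langle k \rangle^{k} / k!$
Small World (Watts & Strogatz, 1998)
L: $L_{sw} \approx L_{rand}$
CC: $CC_{sw} \gg CC_{rand}$
P(k): similar to random graph
Scale-Free network
L: $L_{SF} \le L_{rand}$
P(k): power-law distribution, $P(k) \sim k^{-\gamma}$
$\langle k \rangle$: average degree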
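The random-graph baselines are easy to evaluate numerically, which is how an observed network is usually compared against them. The sketch below assumes the standard approximations L_rand ~ ln N / ln <k>, CC_rand ~ <k> / N, and a Poisson degree distribution; the function names and example numbers are illustrative.

```python
import math


def random_graph_baselines(n, avg_k):
    """Approximate average path length and clustering of a random graph.

    Assumes avg_k > 1 so that ln(avg_k) is positive.
    """
    l_rand = math.log(n) / math.log(avg_k)
    cc_rand = avg_k / n
    return l_rand, cc_rand


def poisson_pk(k, avg_k):
    """Poisson probability that a node has degree k when the mean degree is avg_k."""
    return math.exp(-avg_k) * avg_k ** k / math.factorial(k)


# Hypothetical network: 1000 nodes with average degree 10.
l_rand, cc_rand = random_graph_baselines(1000, 10)
```

For 1000 nodes with average degree 10 this gives L_rand of about 3 and CC_rand of 0.01. An observed network of the same size with a similar path length but a much larger clustering coefficient would point toward small-world structure.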
Network Scientists
• Paul Erdős (Random graph model)
• Duncan Watts (Small-World model)
• A.-L. Barabási (Scale-Free model); “Linked”
• Mark Newman (SW and SF models)
Network Analysis: Node-level Analysis
Node Centrality can be viewed as a measure of influence or
importance of nodes in a network.
Degree
the number of links that a node possesses in a network. In a directed
network, one must differentiate between in-links and out-links by
calculating in-degree and out-degree.
Betweenness
the number of shortest paths in a network that pass through that node.
Closeness
the average distance from that node to all other nodes in the network
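The three centrality measures can be sketched for an undirected, unweighted, connected graph as follows. Betweenness here uses Brandes' algorithm, which weights each shortest path by one over the number of shortest paths between its endpoints; this equals the plain count described above whenever shortest paths are unique. Closeness follows the definition above (average distance, so smaller means more central). The example graph and function names are assumptions.

```python
from collections import deque


def degree(adj, v):
    """Number of links attached to node v."""
    return len(adj[v])


def closeness(adj, v):
    """Average shortest-path distance from v to all other nodes (connected graph)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    others = [d for node, d in dist.items() if node != v]
    return sum(others) / len(others)


def betweenness(adj):
    """Brandes' algorithm; each unordered pair of endpoints is counted once."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, dist = [], {s: 0}
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:  # accumulate dependencies from the farthest nodes back
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical example: a simple chain a - b - c - d.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
bc = betweenness(chain)
```

In the chain, the two middle nodes each lie on two shortest paths between other nodes, while the end nodes lie on none, so the middle nodes act as brokers.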
Example: Centrality Measures of Bin Laden in a
Global Terrorist Network
[Figure: three line charts with panels Degree, Betweenness, and Closeness; x-axis: year, 1989 to 2002]
The changes in the degree, betweenness and closeness of the node bin Laden from 1989 to 2002
Findings and Possible Explanations
The changes described in the above figure show that:
From 1994 to 1996, bin Laden’s betweenness decreased sharply and then increased
until 2001
In 1994, the Saudi government revoked his citizenship and expelled him from
the country
In 1995, he went to Khartoum, Sudan, but under U.S. pressure was
expelled again
In 1996, bin Laden returned to Afghanistan and established camps and refuge
there
From 1998 to 1999, there is another sharp decrease in betweenness
After the 1998 bombings of United States embassies, President
Bill Clinton ordered a freeze on assets linked to bin Laden
Since then, bin Laden was officially listed as one of the FBI Ten Most Wanted
Fugitives and FBI Most Wanted Terrorists
In August 1998, the U.S. military launched an assassination attempt that failed to harm
bin Laden but killed 19 other people
In 1999, the United States convinced the United Nations to impose sanctions
against Afghanistan in an attempt to force the Taliban to extradite him
Network Analysis: Link Analysis
Link analysis focuses on predicting link formation
between pairs of nodes based on various network factors. Its
applications include:
Finance: Insurance fraud detection
E-commerce: recommendation systems, e.g., Amazon
Internet Search Engine: Google PageRank
Law Enforcement: Crime link prediction
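As one simple illustration of link prediction, unlinked pairs of nodes can be scored by how many neighbors they share, on the assumption that pairs with many common neighbors are more likely to become linked. This is only one of many possible network factors; the scoring functions, the example graph, and all names below are assumptions for the sketch.

```python
from itertools import combinations


def common_neighbors(adj, u, v):
    """Number of neighbors that u and v share."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbors divided by all neighbors of either node."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def rank_candidate_links(adj, score=common_neighbors):
    """Rank all currently unlinked pairs from most to least likely to link."""
    candidates = [
        (u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]
    ]
    return sorted(candidates, key=lambda pair: score(adj, *pair), reverse=True)


# Hypothetical network of five actors.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
ranked = rank_candidate_links(adj)
print(ranked[0])  # ('a', 'd')
```

Here a and d share two neighbors (b and c), so that pair is ranked first. A recommendation system could apply the same idea to a customer-product network.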
Network Visualization: Expert Partition of the Collaboration Network
[Figure: collaboration network partitioned by experts into labeled groups: Weapons of mass destruction; Terrorism in Europe; Criminal justice; An international terrorism conf.; Rand Corp.; Historical and policy perspective of terrorism; Legal perspective of terrorism; Not well-defined group]