The Science of Networks 1.1
Welcome!
CompSci 96 / SocSci 119: The Science of Networks
M,W 1:15-2:30
Professor: Jeffrey Forbes
http://www.cs.duke.edu/courses/spring11/cps096
The Science of Networks 1.2
Today’s topics
• What is a network? Why are they important?
• The Oracle of Bacon
• Network construction
• Acknowledgements: notes taken from Michael Kearns, Lada Adamic, and Nicole Immorlica
Upcoming
• Network Structure: Graph Theory
• GUESS
The Science of Networks 1.3
Course Information
• No background assumed, but we will interpret and work with models both quantitatively and qualitatively
• Important Dates
  - Midterm: 2/23
  - Projects due: 4/21
  - Final: 5/5, 9am-Noon
• Let me know ASAP if you have any concerns
Course description: “The structure and interconnectivity of social, technological, and natural networks. Network structure: graph theory, economic, social, physical, and natural networks. Network behavior: game theory, markets and strategic interaction, aggregate and emergent functions, and dynamics. Information networks: search and integration. Applications in sociology, economics, public policy, and computing.”
Grading Breakdown
Assessment             Weight (approx)
Assignments (5)        30%
Blog Posts (3)         15%
Classwork/Community    15%
Midterm                15%
Final                  25%
The Science of Networks 1.4
A Future for Computer Science?
The Science of Networks 1.5
Emerging science of networks
• Examining apparent similarities between many human and technological systems & organizations
• Importance of network effects in such systems
  - How things are connected matters greatly: structure, asymmetry and heterogeneity
  - Details of interaction matter greatly
  - The metaphor of viral spread
  - Dynamics of economic and strategic interaction
  - Qualitative and quantitative; can be very subtle
• A revolution of measurement, theory, and breadth of vision
(M. Kearns)
The Science of Networks 1.6
What is a network?
• A collection of individual or atomic entities (nodes)
• Links can represent any pairwise relationship
• Links can be directed or undirected
• Network: the entire collection of nodes and links
  - might sometimes be annotated by other info (weights, etc.)
• For us, a network is an abstract object (a list of pairs) and is separate from its visual layout
  - that is, we will be interested in properties that are layout-invariant
• We will be interested in properties of networks
  - often structural properties
  - often statistical properties of families of networks
The Science of Networks 1.7
Representing networks
• Networks are collections of points joined by lines.
• What kinds of questions might we ask?
“Network” ≡ “Graph”
points      lines              field
vertices    edges, arcs        math
nodes       links              computer science
sites       bonds              physics
actors      ties, relations    sociology
[Figure: a small graph with one node and one edge labeled]
The Science of Networks 1.8
Definitions
Path: a sequence of nodes (v_1, …, v_k) such that for every adjacent pair v_i and v_{i+1} there is an edge e_{i,i+1} between them.
Distance: the length of the shortest path between two nodes
Diameter: the maximum shortest-path distance between any two nodes
[Figure: example graph with eight nodes, numbered 1 to 8]
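The definitions of distance and diameter can be made concrete with a breadth-first search (BFS). The sketch below is one possible way to compute them, not something taken from the slides. The eight-node graph is hypothetical: its edges are made up for illustration and are not the edges drawn in the slide's figure. The function names are also illustrative.

```python
from collections import deque

def bfs_distances(graph, source):
    """Return {node: number of edges on a shortest path from source}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:          # first visit gives the shortest distance
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def diameter(graph):
    """Largest shortest-path distance over all pairs (assumes a connected graph)."""
    return max(max(bfs_distances(graph, v).values()) for v in graph)

# Hypothetical undirected graph on nodes 1..8, stored as adjacency lists.
# The edges are invented for this example.
example_graph = {
    1: [2, 8],
    2: [1, 3],
    3: [2, 4, 7],
    4: [3, 5],
    5: [4, 6],
    6: [5],
    7: [3, 8],
    8: [1, 7],
}

print(bfs_distances(example_graph, 1)[6])  # distance from node 1 to node 6
print(diameter(example_graph))
```

In this made-up graph the longest shortest path runs from node 1 (or node 8) to node 6, so the diameter equals that distance.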
The Science of Networks 1.9
Network Definitions
• Network size: total number of vertices (denoted n)
  - Maximum possible number of edges (m)?
• If the distance between all pairs is finite, we say the network is connected; else it has multiple components
• Attributes of edges
  - Weight or cost
  - Direction
• Degree of a node v = number of edges connected to v
  - Directed versions (in-degree and out-degree)
• What else might we want to model beyond just the connections?
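A minimal sketch of these definitions in code, assuming a simple network (no self-loops, no repeated edges) stored as a list of node pairs. The six-node network and all function names are hypothetical, chosen only for illustration. For a simple undirected network the maximum number of edges is n(n-1)/2, and twice that if edges are directed.

```python
def max_edges(n, directed=False):
    """Maximum number of edges among n nodes with no self-loops or repeated edges."""
    return n * (n - 1) if directed else n * (n - 1) // 2

def build_adjacency(nodes, edges):
    """Turn an undirected edge list into {node: set of neighbors}."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency

def connected_components(adjacency):
    """Return a list of node sets, one per component."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = set()
        stack = [start]
        seen.add(start)
        while stack:                      # depth-first walk of one component
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components

# Hypothetical network: a triangle, a single edge, and an isolated node.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]

adjacency = build_adjacency(nodes, edges)
degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
components = connected_components(adjacency)
is_connected = len(components) == 1

print(max_edges(len(nodes)), degree, components, is_connected)
```

Because some pairs (for example a and d) have no path between them, this example network is not connected and splits into three components.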
The Science of Networks 1.10
Issues
Why model networks? Structure & dynamics
• Models (structure): who is linked to whom?
  - How does position within a network (dis)advantage an agent?
  - What are the factors that lead people to trust each other?
  - Graph theoretic models
• Implications (dynamics): individual behavior can have global consequences
  - Diffusion of disease and information
  - Search by navigating the network
  - Resilience
  - Population, structural, and aggregate effects
  - Game theoretic models
The Science of Networks 1.11
Social networks
• Example: Acquaintanceship networks
  - vertices: people in the world
  - links: have met in person and know last names
  - hard to measure
• Example: scientific collaboration
  - vertices: math and computer science researchers
  - links: between coauthors on a published paper
  - Erdos numbers: distance to Paul Erdos
  - Erdos was definitely a hub or connector; had 507 coauthors
  - How do we navigate in such networks?
The Science of Networks 1.12
Acquaintanceship & more
The Science of Networks 1.13
Six Degrees of Bacon
• Background
  - Stanley Milgram’s Six Degrees of Separation?
  - Craig Fass, Mike Ginelli, and Brian Turtle invented it as a drinking game at Albright College
  - Brett Tjaden, Glenn Wasson, and Patrick Reynolds have run it online as a website, from UVa and beyond
  - Instance of the Small-World phenomenon
• http://oracleofbacon.org handles 2 kinds of requests
  1. Find the links from Actor A to Actor B.
  2. How good a center is a given actor?
• How does it answer these requests?
The Science of Networks 1.14
How does the Oracle work?
• Not using Oracle™
• Queries require traversal of the graph
[Figure: Kevin Bacon (BN = 0) linked through the movies Mystic River, Apollo 13, and Footloose to John Lithgow, Sarah Jessica Parker, Bill Paxton, Tom Hanks, Sean Penn, and Tim Robbins (BN = 1)]
The Science of Networks 1.15
How does the Oracle Work?
• BN = Bacon Number
• Queries require traversal of the graph
[Figure: the traversal extended one more level. Kevin Bacon (BN = 0); the BN = 1 actors John Lithgow, Sarah Jessica Parker, Bill Paxton, Tom Hanks, Sean Penn, and Tim Robbins, reached through Mystic River, Apollo 13, and Footloose; the BN = 2 actors Morgan Freeman, Sally Field, Helen Hunt, Val Kilmer, Miranda Otto, Judge Reinhold, Woody Allen, and Billy Bob Thornton, reached through Sweet and Lowdown, Fast Times at Ridgemont High, War of the Worlds, The Shawshank Redemption, Cast Away, Forrest Gump, Tombstone, and A Simple Plan]
The Science of Networks 1.16
How does the Oracle work?
[Figure: the actor and movie graph from the previous slide, with levels BN = 0, BN = 1, and BN = 2; the labels Kevin Bacon, Apollo 13, Bill Paxton, Tombstone, and Val Kilmer are set apart from the rest]
• How do we choose which movie or actor to explore next?
• Queries require traversal of the graph
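One common answer to "which movie or actor to explore next?" is breadth-first search: explore everyone at Bacon number 1 before anyone at Bacon number 2, and so on, using a first-in, first-out queue. The sketch below assumes that approach; the slides do not say how the real Oracle of Bacon is implemented. The CASTS table is a tiny hand-typed sample with partial casts, and the function names are illustrative.

```python
from collections import deque

# Tiny hand-typed sample: movie -> some of its actors (partial casts only).
CASTS = {
    "Apollo 13": ["Kevin Bacon", "Tom Hanks", "Bill Paxton"],
    "Footloose": ["Kevin Bacon", "John Lithgow", "Sarah Jessica Parker"],
    "Mystic River": ["Kevin Bacon", "Sean Penn", "Tim Robbins"],
    "Cast Away": ["Tom Hanks", "Helen Hunt"],
    "The Shawshank Redemption": ["Tim Robbins", "Morgan Freeman"],
    "Tombstone": ["Bill Paxton", "Val Kilmer"],
}

def bacon_path(casts, target, center="Kevin Bacon"):
    """Shortest chain [actor, movie, actor, ...] from center to target, or None."""
    movies_of = {}
    for movie, actors in casts.items():
        for actor in actors:
            movies_of.setdefault(actor, []).append(movie)
    if center not in movies_of or target not in movies_of:
        return None

    parent = {center: None}            # actor -> (previous actor, shared movie)
    queue = deque([center])
    while queue:
        actor = queue.popleft()        # FIFO order explores level by level
        if actor == target:
            break
        for movie in movies_of[actor]:
            for costar in casts[movie]:
                if costar not in parent:
                    parent[costar] = (actor, movie)
                    queue.append(costar)
    if target not in parent:
        return None

    path = [target]                    # walk back from target to center
    while parent[path[-1]] is not None:
        previous_actor, movie = parent[path[-1]]
        path.extend([movie, previous_actor])
    return path[::-1]

def bacon_number(casts, target, center="Kevin Bacon"):
    """Number of movies on the shortest chain, or None if unreachable."""
    path = bacon_path(casts, target, center)
    return None if path is None else len(path) // 2

print(bacon_path(CASTS, "Val Kilmer"))
print(bacon_number(CASTS, "Val Kilmer"))
```

Passing a different `center` answers the first kind of request (links from Actor A to Actor B) with the same traversal.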
The Science of Networks 1.17
Center of the Hollywood Universe?
• 1,246,221 people can be connected to Bacon
• Is he the center of the Hollywood Universe?
  - Who is?
  - Who are other good centers?
  - What makes them good centers?
• Centrality
  - Closeness: the inverse average distance of a node to all other nodes
  - Degree: the degree of a node
  - Betweenness: a measure of how often a vertex lies on shortest paths between other nodes
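Two of these centrality measures are short enough to sketch directly. The code below computes degree and closeness for a small hypothetical graph; the node names, the edges, and the function names are made up for illustration, and closeness follows the slide's definition (inverse of the average distance to all other nodes), assuming the graph is connected. Betweenness needs a count of shortest paths through each vertex and is left out of this sketch.

```python
from collections import deque

def distances_from(graph, source):
    """Breadth-first search: {node: shortest-path distance from source}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def degree_centrality(graph):
    """Degree of each node (number of neighbors)."""
    return {node: len(neighbors) for node, neighbors in graph.items()}

def closeness_centrality(graph):
    """Inverse of the average distance to all other nodes (connected graph assumed)."""
    scores = {}
    for node in graph:
        total = sum(distances_from(graph, node).values())  # distance to itself is 0
        scores[node] = (len(graph) - 1) / total
    return scores

# Hypothetical undirected graph: a hub with three neighbors, one of which
# has a further neighbor.
graph = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "d"],
    "d": ["c"],
}

degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
best_center = max(closeness, key=closeness.get)
print(degree, closeness, best_center)
```

Under this definition a "good center" is a node whose average distance to everyone else is small, which is the sense in which the Oracle ranks actors.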
The Science of Networks 1.18
Oracle of Bacon
Name someone who is 4 degrees or more away from Kevin Bacon (six numbered answer slots, 1 to 6)
What characteristics make someone farther away?
What makes someone a good center? Is Kevin Bacon a good center?
The Science of Networks 1.19
Sample Blog Post
I'm Related to Kevin Bacon?
Overview of the Oracle of Bacon: In class we have talked a lot about social and computer networks and all of their component parts. We have learned many important aspects of networks and what makes them operate. One of the most interesting and complex notions is that of centrality and how one can go about calculating centrality within a social network. The Oracle of Bacon is one of the best examples of a project that has created an elaborate social network around the central figure of Kevin Bacon. However, it is interesting that the site shows Kevin Bacon is actually not the center of the Hollywood network; in fact, there are 1,048 actors who would make better centers than Bacon. Here is a breakdown of the best and worst centers of the Hollywood network. Although the only other actor mentioned who would make a better center is Sean Connery, we can speculate about what makes a great center. A good center would probably be an older actor who has appeared in many movies and many varieties of movies, has appeared in large productions with many actors, and has worked overseas. By contrast, a bad center would be young and have appeared in only one type of movie, or in only one movie at all!
The Science of Networks 1.20
Why is the Oracle of Bacon Interesting to us?
• In reality, the game is an example of the small world phenomenon. The small world phenomenon was researched by Stanley Milgram, who examined the average path length in social networks of people in the United States. The phenomenon is that paths between nodes are often much shorter than expected, which the game illustrates. The Oracle of Bacon was designed by computer scientists at the University of Virginia as an engaging way of exploring the small world phenomenon. The program for calculating a Bacon number was developed by mapping networks from http://imdb.com/ (the database of movie and actor information).
Other related points
• Here is the original paper by Stanley Milgram, upon which all of this information is based. The game finds links between different actors and the degree of separation from Bacon. It is amazing that almost any actor, no matter how obscure, can be linked to Bacon within six degrees, and the average is under three links (2.960).
• It is also interesting to look at an earlier example of the small world phenomenon, which inspired the Oracle of Bacon. Erdos numbers refer to the number of coauthorship links separating a mathematician from Paul Erdos, a Hungarian mathematician famous for collaboration. The Erdos Number Project gives details similar to the Oracle of Bacon about the amount of connectivity within the network of mathematicians. In this network the median Erdos number is 5, the mean is 4.65, and the standard deviation is 1.21. This suggests that there is slightly less connectivity, but a high degree of centrality.
The Science of Networks 1.21
Here is a visualization of the Erdos Network.
More recent centrality work
• Many researchers have dealt with the six degrees theory in their analysis of the small-world phenomenon, including Judith Kleinfeld. Her paper, Could it be a Big World After All? The `Six Degrees of Separation’ Myth (Society, April 2002), deals with a lot of the important ideas discussed above. Kleinfeld argues that the initial data used to create the notion of the small-world phenomenon was actually skewed, and that the data show there might be less connectivity between people than was previously believed. Although the paper was published in 2002, it does not seem to have generated much debate in the scholarly community. More work and experimentation needs to be done in this field before claims can be made about the connectedness of the actual world. Although Kleinfeld and others made some really interesting points initially, unfortunately the computer science world seems focused on novelty, not on finishing work on a phenomenon, so it may be a while before all of our questions are answered!
The Science of Networks 1.22
Physical Networks
• The Internet
  - Vertices: Routers
  - Edges: Physical connections
• Another layer of abstraction
  - Vertices: Autonomous systems
  - Edges: peering agreements
  - Both a physical and business network
• Other examples
  - US Power Grid
  - Interdependence and August 2003 blackout
The Science of Networks 1.23
What does the Internet look like?
The Science of Networks 1.24
US Power Grid
The Science of Networks 1.25
Business & Economic Networks
• Example: eBay bidding
  - vertices: eBay users
  - links: represent bidder-seller or buyer-seller
  - fraud detection: bidding rings
• Example: corporate boards
  - vertices: corporations
  - links: between companies that share a board member
• Example: corporate partnerships
  - vertices: corporations
  - links: represent formal joint ventures
• Example: goods exchange networks
  - vertices: buyers and sellers of commodities
  - links: represent “permissible” transactions
The Science of Networks 1.26
Enron
The Science of Networks 1.27
Content Networks
• Example: Document similarity
  - Vertices: documents on web
  - Edges: Weights defined by similarity
  - See TouchGraph GoogleBrowser
• Conceptual network: thesaurus
  - Vertices: words
  - Edges: synonym relationships
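As a rough illustration of the document similarity example, the sketch below builds weighted edges between documents. The slide does not say which similarity measure is meant; Jaccard similarity on word sets (shared words divided by total distinct words) is assumed here purely because it is simple. The three documents and the function names are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Shared items divided by total distinct items; 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_edges(documents, threshold=0.0):
    """Return {(doc_u, doc_v): weight} for every pair with weight above threshold."""
    words = {name: set(text.lower().split()) for name, text in documents.items()}
    edges = {}
    for u, v in combinations(sorted(words), 2):
        weight = jaccard(words[u], words[v])
        if weight > threshold:
            edges[(u, v)] = weight
    return edges

# Hypothetical documents, invented for this example.
documents = {
    "doc_a": "networks of nodes and links",
    "doc_b": "graphs of nodes and edges",
    "doc_c": "kevin bacon movies",
}

edges = similarity_edges(documents)
print(edges)
```

Here doc_a and doc_b share three of their seven distinct words, so they are joined by an edge of weight 3/7, while doc_c shares no words with either and stays isolated.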
The Science of Networks 1.28
Wordnet
Source: http://wordnet.princeton.edu/man/wnlicens.7WN
The Science of Networks 1.29
Biological Networks
• Example: the human brain
  - Vertices: neuronal cells
  - Edges: axons connecting cells
  - links carry action potentials
  - computation: threshold behavior
  - N ~ 100 billion
The Science of Networks 1.30
Gene regulatory networks
• Humans have only 30,000 genes, 98% shared with chimps
• The complexity is in the interaction of genes
• Can we predict what the result of the inhibition of one gene will be?
Source: http://www.zaik.uni-koeln.de/bioinformatik/regulatorynets.html.en
The Science of Networks 1.31
Types of networks
• Pick a class of network:
• Give a real-world example of such a network:
• What are the vertices (nodes)?
• What are the edges (links)?
• How is the network formed? Is it decentralized or centralized? Is the communication or interaction local or global?
• What is the network's topology? For example, is it connected? What is its size? What is the degree distribution?
The Science of Networks 1.32
Graph properties
Max Degree?
Center?
The Science of Networks 1.33
Wrap up
Networks are everywhere and can be used to describe many, many systems.
By modeling networks, we can start to understand their properties and the implications those properties have for processes occurring on the network.