2. Outline: Why network topology is important. Modeling Internet topology. Complex networks. Scale-free networks. Power laws of the Web. Search in power-law networks: GNUTELLA, a P2P example.
3. Why Topology is Important: Design efficient protocols. Solve internetworking problems: - routing - resource reservation - administration. Create accurate models for simulation. Derive estimates for topological parameters. Study fault tolerance and anti-attack properties.
4. Modeling Internet Topology: Graph representation. Router-level model: - vertices are routers - edges are one-hop IP connectivity. Domain-level (AS-level) model (high degree of abstraction): - vertices are domains (ASes) - edges are peering relationships. Nodes can be assigned numbers representing, e.g., buffer capacity. Edges might have weights representing, e.g., propagation delay or bandwidth capacity.
5. Modeling Internet Topology: [Figure: Internet structure. Labels: transit domains, stub domains, domains/autonomous systems, exchange point, border routers, peering, routers, access networks, hosts/end systems.]
6. Barabási-Albert Model (BA Model): Basis for most current topology generators. Very simplistic model. Network evolves in size over time. Preferential connectivity: the probability that a newly added node will attach to node i is $\Pi(k_i) = k_i / \sum_j k_j$, where $k_i$ is the degree of node i. Many extensions.
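The preferential-connectivity rule can be illustrated with a minimal Python sketch. It is not the code of any particular topology generator: the function name `barabasi_albert`, the parameter `m` (edges added per new node), and the choice to link the first new node to all `m` seed nodes are assumptions made for illustration.

```python
import random

def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; each new node attaches to m distinct
    existing nodes picked with probability proportional to their degree."""
    if m < 1 or m >= n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    # 'endpoints' lists node i exactly k_i times, so a uniform draw from it
    # picks node i with probability k_i / sum_j k_j.
    endpoints = []
    # Assumption: the first new node links to all m seed nodes (0..m-1).
    targets = list(range(m))
    for new in range(m, n):
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
        # Draw m distinct targets for the next node.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)
    return edges

if __name__ == "__main__":
    from collections import Counter
    degree = Counter()
    for u, v in barabasi_albert(2000, 2, seed=42):
        degree[u] += 1
        degree[v] += 1
    print("five largest degrees:", sorted(degree.values(), reverse=True)[:5])
```

Drawing uniformly from the list of edge endpoints avoids recomputing the normalizing sum at every step.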
7. Waxman Model: Router-level model. Nodes placed at random in 2D space with dimension L. Probability of edge (u,v): $a\,e^{-d/(bL)}$, where d is the Euclidean distance between u and v, and a and b are constants. Models locality - no sense of backbone or hierarchy - does not guarantee a connected network - as #nodes increases, #links increases proportionally. [Figure: nodes u and v separated by distance d(u,v).]
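A minimal Python sketch of the Waxman edge rule, assuming the nodes are placed uniformly in an L x L square (the slide only says "2D space with dimension L"). The function name `waxman` and the example parameter values are illustrative assumptions.

```python
import math
import random

def waxman(n, L, a, b, seed=None):
    """Place n nodes uniformly in an L x L square (assumption) and link
    each pair (u, v) with probability a * exp(-d / (b * L))."""
    rng = random.Random(seed)
    pos = [(rng.uniform(0, L), rng.uniform(0, L)) for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            d = math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1])
            if rng.random() < a * math.exp(-d / (b * L)):
                edges.append((u, v))
    return pos, edges

if __name__ == "__main__":
    pos, edges = waxman(n=100, L=100.0, a=0.4, b=0.1, seed=7)
    print(len(edges), "edges among", len(pos), "nodes")
```

Nothing in the rule forces the result to be connected, which matches the limitation noted on the slide.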
8. Transit-Stub Model: Router-level model. Transit domains placed in 2D space, populated with routers connected to each other. Stub domains placed in 2D space, populated with routers connected to transit domains. Models hierarchy. Edge count, guaranteed connectivity.
9. Transit-Stub Model: No concept of a host: all nodes are routers. Two-level hierarchy. First generate a number of transit domains, then generate a set of stub networks. Given an average edge count, produce a random graph, making sure that it is connected.
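One way to produce a random graph with a given edge count that is guaranteed to be connected is to build a random spanning tree first and then add random extra edges. The sketch below takes that approach as an assumption; the slide does not say how the Transit-Stub generator itself enforces connectivity, and the name `random_connected_graph` is hypothetical.

```python
import random

def random_connected_graph(n, num_edges, seed=None):
    """Return a sorted list of num_edges undirected edges on nodes 0..n-1
    forming a connected graph (random spanning tree plus random extras)."""
    if n < 1 or not (n - 1 <= num_edges <= n * (n - 1) // 2):
        raise ValueError("edge count incompatible with a connected simple graph")
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    edges = set()
    # Spanning tree: each node joins a random node that is already placed.
    for i in range(1, n):
        u, v = order[i], order[rng.randrange(i)]
        edges.add((min(u, v), max(u, v)))
    # Extra edges: add random distinct pairs until the target count is met.
    while len(edges) < num_edges:
        u, v = rng.sample(range(n), 2)
        edges.add((min(u, v), max(u, v)))
    return sorted(edges)

if __name__ == "__main__":
    print(random_connected_graph(6, 8, seed=1))
```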
10. Inet: Generate a degree sequence. Build a spanning tree over nodes with degree larger than 1, using preferential connectivity: randomly select a node u not in the tree and join u to an existing node v with probability $d(v) / \sum_w d(w)$, where the sum runs over the nodes w already in the tree. Connect degree-1 nodes using preferential connectivity. Add remaining edges using preferential connectivity.
11. BRITE: Generate a small backbone, with nodes placed randomly or concentrated (skewed). Add nodes one at a time (incremental growth). Each new node has a constant number of edges, connected using preferential connectivity and/or locality.
12. Complex Networks: Two limiting-case topologies have been extensively considered in the literature: (1) the regular network (lattice), the chosen topology of innumerable physical models such as the Ising model or percolation; (2) the random graph, studied in mathematics and used in both the natural and social sciences, whose properties were studied in detail by Paul Erdős. Most of Erdős's work concentrated on the case in which the number of vertices is kept constant but the total number of links between vertices increases: the Erdős-Rényi result states that for many important quantities there is a percolation-like transition at a specific value of the average number of links per vertex.
13. Complex Networks: Random networks are used in: Physics, in studies of dynamical problems, spin models and thermodynamics, random walks, and quantum chaos. Economics and social sciences, to model interacting agents.
14. Complex Networks: In contrast to these two limiting topologies, empirical evidence suggests that many biological, technological or social networks lie somewhere between these extremes. Many real networks seem to share with regular networks the concept of neighborhood, which means that if vertices i and j are neighbors then they will have many common neighbors, which is not true for a random network. On the other hand, studies on epidemics show that it can take only a few "steps" on the network to reach a given vertex from any other vertex. This is the foremost property of random networks, and it is not fulfilled by regular networks.
15. Complex Networks:
16. Complex Networks: The Watts-Strogatz model: To bridge the two limiting cases, Watts and Strogatz [Nature 393, 440 (1998)] introduced a new type of network, obtained by randomizing a fraction p of the links of the regular network. The initial structure (p=0) is the one-dimensional regular network where each vertex is connected to its z nearest neighbors. For 0 < p < 1, we call these networks disordered. For p=1, we have a completely random network.
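The rewiring procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each link is rewired independently with probability p (so a fraction p is randomized on average), one endpoint is kept, and the other is moved to a uniformly chosen vertex that does not create a self-loop or duplicate link. The function name `watts_strogatz` is illustrative.

```python
import random

def watts_strogatz(n, z, p, seed=None):
    """Ring of n vertices, each linked to its z nearest neighbours (z even).
    Each link is then rewired with probability p. Returns adjacency sets."""
    if z % 2 or not 0 < z < n:
        raise ValueError("z must be even and satisfy 0 < z < n")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for j in range(1, z // 2 + 1):
            w = (v + j) % n
            adj[v].add(w)
            adj[w].add(v)
    # Visit every lattice link (v, v+j) exactly once.
    for j in range(1, z // 2 + 1):
        for v in range(n):
            w = (v + j) % n
            if rng.random() < p:
                # Candidate new endpoints: no self-loop, no duplicate link.
                choices = [x for x in range(n) if x != v and x not in adj[v]]
                if choices:
                    new = rng.choice(choices)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj

if __name__ == "__main__":
    g = watts_strogatz(n=20, z=4, p=0.1, seed=3)
    print(sum(len(nb) for nb in g.values()) // 2, "links")
```

Rewiring moves links rather than adding them, so the total number of links stays at nz/2 for every p.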
17. Complex Networks: Watts and Strogatz report that for a small value of the parameter p, there is an onset of small-world behavior. It is characterized by the fact that the distance between any two vertices is of the order of that for a random network and, at the same time, the concept of neighborhood is preserved. The effect of a change in p is extremely nonlinear, where a very small change in the connectivity of the network leads to a dramatic change in the distance between different pairs of vertices.
18. Complex Networks: The scientific question we are trying to answer is: Does the onset of small-world behavior occur at a given value of p, or does it occur for a value of the system size n which depends on p? To investigate this question, we need to look at the behavior of the system as a function of p for different values of n.
19. Complex Networks:
20. Complex Networks: The appearance of small-world behavior is not a phase transition but a crossover phenomenon. The average distance $\ell$ scales as $\ell(n,p) \sim n^* F(n/n^*)$, where $F(u \gg 1) \sim \ln u$ and $n^*$ is a function of p. When the average number of rewired links, pnz/2, is much less than one, the network should be in the large-world regime. On the other hand, when pnz/2 >> 1, the network should be a small world.
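As a worked illustration of the pnz/2 criterion, take the hypothetical values n = 1000 and z = 10 (chosen here for the arithmetic, not taken from the slides):

```latex
\[
n = 1000,\quad z = 10 \quad\Longrightarrow\quad \frac{pnz}{2} = 5000\,p
\]
\[
p = 10^{-5}:\ \frac{pnz}{2} = 0.05 \ll 1 \ \text{(large-world regime)},
\qquad
p = 10^{-2}:\ \frac{pnz}{2} = 50 \gg 1 \ \text{(small-world regime)}
\]
\[
\frac{pnz}{2} = 1 \iff p = \frac{2}{nz} = 2 \times 10^{-4}
\]
```

For a network of this size, the crossover would therefore be expected around p of order 10^-4, and a larger n moves it to smaller p.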
21. Scale-free networks: It was proposed by Barabási and Albert that real-world networks in general are scale-free networks. Scale-free networks have a distribution of connectivities that decays with a power-law tail. Scale-free networks emerge in the context of a growing network in which new vertices connect preferentially to the more highly connected vertices in the network. Scale-free networks are also small-world networks because (i) they have clustering coefficients much larger than random networks, and (ii) their diameter increases logarithmically with the number of vertices n.
22. What are Power Laws? A distribution that fits $P(k) \sim k^{-\gamma}$. Characteristic property of scale-free networks. Occur very often in the complex-systems literature. Many complicated real-world networks obey power laws.
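A rough way to check whether a degree sequence follows $P(k) \sim k^{-\gamma}$ is to fit a straight line to ln P(k) versus ln k. The Python sketch below does this with ordinary least squares. It is only a first-pass diagnostic (log-log fits are known to be sensitive to noise in the tail), and the names `degree_distribution` and `fit_exponent` are illustrative.

```python
import math
from collections import Counter

def degree_distribution(degrees):
    """Empirical P(k): fraction of nodes with degree k (k > 0 only)."""
    counts = Counter(degrees)
    total = len(degrees)
    return {k: c / total for k, c in counts.items() if k > 0}

def fit_exponent(pk):
    """Least-squares slope of ln P(k) against ln k, returned as gamma."""
    xs = [math.log(k) for k in pk]
    ys = [math.log(p) for p in pk.values()]
    if len(set(xs)) < 2:
        raise ValueError("need at least two distinct degrees")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return -sxy / sxx

if __name__ == "__main__":
    # Synthetic, exactly power-law input, so the fit recovers the exponent.
    exact = {k: k ** -2.1 for k in range(1, 100)}
    print(round(fit_exponent(exact), 3))
```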
23. Implications of Power Laws: The majority of nodes have small connectivity. A few nodes have very large connectivity. Good resistance to random failure. Poor resistance to planned attack. Could imply the existence of some hierarchy (all real-world power-law networks support this). However, it is not clear whether a power law implies a hierarchy.
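The contrast between random failure and planned attack can be seen in the most extreme hub-dominated topology, a star, used here as a toy stand-in for a power-law network. The Python sketch below measures the largest connected component after removing nodes. The function name `largest_component` and the star example are illustrative assumptions.

```python
def largest_component(adj, removed=()):
    """Size of the largest connected component after deleting 'removed'."""
    removed = set(removed)
    seen = set()
    best = 0
    for start in adj:
        if start in removed or start in seen:
            continue
        # Depth-first search over the surviving nodes.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in removed and w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

if __name__ == "__main__":
    # Star: hub 0 connected to leaves 1..9.
    star = {0: list(range(1, 10))}
    star.update({leaf: [0] for leaf in range(1, 10)})
    print("intact:", largest_component(star))
    print("one leaf fails:", largest_component(star, removed=[3]))
    print("hub attacked:", largest_component(star, removed=[0]))
```

Losing a leaf, which is by far the most likely outcome of a random failure, barely changes the network, while removing the single hub isolates every remaining node.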
24. Origin of Power Law: Power laws are an observed (empirical) phenomenon. The mechanisms that produce these can only be guessed at (for now!) Very typical in self organizing systems and chaotic systems.
25. Scale-free networks: Examples of scale-free networks: (a) the neuronal network of the worm C. elegans, (b) the World Wide Web, (c) the network of citations of scientific papers.
26. Scale-free networks: Broad-scale networks, or truncated scale-free networks: characterized by a connectivity distribution that has a power-law regime followed by a sharp cut-off, like an exponential or Gaussian decay of the tail. Single-scale networks: characterized by a connectivity distribution with a fast-decaying tail, such as exponential or Gaussian. Aging of the vertices: an aged vertex is still part of the network and contributes to network statistics, but it no longer receives links. The aging of the vertices thus limits preferential attachment, preventing a scale-free distribution of connectivities. Cost of adding links to the vertices, or the limited capacity of a vertex: physical costs of adding links and the limited capacity of a vertex limit the number of possible links attaching to a given vertex.
27. Power-laws of the Web: How many links on a page (outdegree)? How many links to a page (indegree)? The probability that a random page has k other pages pointing to it is $\sim k^{-2.1}$ (power law). The probability that a random page points to k other pages is $\sim k^{-2.7}$ (power law).
28. In-degree Distribution
29. Out-degree Distribution
30. Search in power-law networks: GNUTELLA. Most P2P networks display a power-law distribution in their node degree. This distribution reflects the existence of a few nodes with very high degree and many with low degree. In P2P networks, the name of the target file may be known, but due to the network's ad hoc nature, the node holding the file may not be known until a real-time search is performed. A simple strategy to locate files, implemented by NAPSTER, is to use a central server that contains an index of all the files every node is sharing as they join the network. GNUTELLA and FREENET do not use a central server.
31. Search in power-law networks: GNUTELLA. GNUTELLA is a peer-to-peer file-sharing system that treats all client nodes as functionally equivalent and lacks a central server that can store file location information. This is advantageous because it presents no central point of failure. The obvious disadvantage is that the location of files is unknown. When a user wants to download a file, he sends a query to all the nodes within a neighborhood of size ttl, the time to live assigned to the query. Every node passes the query on to all of its neighbors and decrements the ttl by one. In this way, all nodes within a given radius of the requesting node are queried for the file, and those that have matching files send back positive answers.
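The TTL-limited broadcast can be sketched as a breadth-first flood. This Python sketch is a simplified model, not the GNUTELLA wire protocol: the name `flood_query`, the adjacency-dict representation, and the message accounting (every transmission is counted, including copies that reach a node which has already seen the query) are assumptions.

```python
from collections import deque

def flood_query(adj, source, ttl, holders):
    """Flood a query from 'source' with the given ttl.
    Returns (nodes within the radius that hold the file, messages sent)."""
    seen = {source}
    queue = deque([(source, ttl)])
    hits = []
    messages = 0
    while queue:
        node, remaining = queue.popleft()
        if remaining == 0:
            continue  # ttl exhausted: the query is not forwarded further
        for nb in adj[node]:
            messages += 1
            if nb in seen:
                continue  # duplicate copy of the query is dropped
            seen.add(nb)
            if nb in holders:
                hits.append(nb)
            queue.append((nb, remaining - 1))
    return hits, messages

if __name__ == "__main__":
    # Path 0 - 1 - 2 - 3 - 4; node 3 holds the file.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(flood_query(path, source=0, ttl=2, holders={3}))
    print(flood_query(path, source=0, ttl=3, holders={3}))
```

With ttl=2 the query stops one hop short of the file, and with ttl=3 it is found, at the cost of more messages.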
32. Search in power-law networks: GNUTELLA. This broadcast method will find the target file quickly, provided it is located within a radius of ttl. However, broadcasting is extremely costly in terms of bandwidth, and such a search strategy does not scale well. Because query traffic increases linearly with the size of the GNUTELLA graph, nodes become overloaded.
33. Search in power-law networks: GNUTELLA. Typically, a GNUTELLA client wishing to join the network must find the IP address of an initial node to connect to. Currently, ad hoc lists of good GNUTELLA clients exist. It is reasonable to suppose that this ad hoc method of growth would bias new nodes to connect preferentially to nodes that are already fairly well connected, since these nodes are more likely to be well known. Based on models of graph growth where the rich get richer, the power-law connectivity of ad hoc peer-to-peer networks may be a fairly general topological feature.
34. Search in power-law networks: GNUTELLA. By passing the query to every single node in the network, the GNUTELLA algorithm fails to take advantage of the connectivity distribution. To take advantage of the power-law distribution, we can modify each node to keep lists of the files stored on its first and second neighbors. Instead of passing the query to every node, we can then pass it only to the nodes with the highest connectivity. High-degree nodes are presumably high-bandwidth nodes that can handle the query traffic.
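A minimal Python sketch of this degree-seeking search follows. It assumes each node can answer for its own files and those of its first and second neighbors, and that the query moves to the highest-degree neighbor not yet visited. The stopping rule (give up when every neighbor has been visited), the function name `high_degree_search`, and the toy graph are illustrative assumptions.

```python
def high_degree_search(adj, files, source, target, max_steps=100):
    """Walk toward high-degree nodes until some node's first/second-neighbour
    index contains 'target'. Returns (holder or None, path of nodes visited)."""
    visited = {source}
    current = source
    path = [source]
    for _ in range(max_steps):
        # Nodes whose file lists 'current' is assumed to know about.
        known = {current} | set(adj[current])
        for nb in adj[current]:
            known |= set(adj[nb])
        for node in sorted(known):
            if target in files.get(node, ()):
                return node, path
        candidates = [nb for nb in adj[current] if nb not in visited]
        if not candidates:
            return None, path  # simplification: no backtracking
        current = max(candidates, key=lambda v: len(adj[v]))
        visited.add(current)
        path.append(current)
    return None, path

if __name__ == "__main__":
    adj = {0: [1, 6], 1: [0, 2, 3, 4, 5], 2: [1], 3: [1], 4: [1],
           5: [1, 7], 6: [0], 7: [5, 8], 8: [7]}
    files = {8: {"song.mp3"}}
    print(high_degree_search(adj, files, source=0, target="song.mp3"))
```

In the toy graph the query visits only three nodes (0, the hub 1, then 5) before node 5's second-neighbor list reveals the holder, instead of being broadcast to all nine.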
35. Outline: Internet Structure & Organization. Internet Hierarchical Structure: ISPs, interconnection and organization [ref. 7]. POP Architecture and Load Balancing. ISP Architecture in detail [ref. 7]. Topology Mapping Tool: Rocketfuel [ref. 8]. Discussion. ELEG 667-013 Spring 2003