17
International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012 DOI : 10.5121/ijcnc.2012.4311 151 Deciding on the  T  y  p e of the Degree Distribution of a Graph f rom  Traceroute-lik e Measurements Xiaomin Wang 1 , Matthieu Latapy, Mich` ele Soria firstname. lastname@ lip6. fr LIP6 CNRS and UPMC, 4 place Jussieu, 75005 Paris, France Abstract The degree d ist r ibut ion of the Internet t o  polog  y is considered as one of it s main  pr o  per t ies.  How ever, it is only known t hr ough a measur ement  pr ocedur e which gives a biased est imat e. This measur ement may in  fir st  appr o  ximat ion be modeled by a BFS (  Br ead t h-F ir st Search) t r ee. We explore here our abilit  y to infer the t  y  pe (Poisson or power-law) of the degree d ist r ibut ion  from such a limit ed knowledge. We desi gn procedures which est imat e the degree d ist r ibut ion of a graph  from a BFS of it, and show e  x  per iment all  y (on models and r eal-wor ld data) that t his approach succeeds in making the difference bet wee n Poisson and  power-law degree d ist r ibut ions. Keywords: Internet, graph, topology , measurement, degree distribution, tracer- oute, simulations. 1. Introduction The Internet may be seen as a graph at several levels: autonomous systems, routers and links between them, or IP hops between interf aces for instance. For a decade, these graphs have been at the core of an intense research activity [10, 15, 23, 27] aimed at a better understanding and management of the Internet, which plays a crucial role in our society . A property of high interest in these graphs is the degree distribution, i.e. the f raction P k of nodes with k links, for all k: it may have a strong influence on the robustness of the network [2, 20], on protocol design [21], and on spreading of inf ormation [6, 23]. Moreover, it is of ten claimed that these degree distributions may deviate signifi cantly from what classical models assume [2, 13, 23, 27], which leads to an intense activity on modeling issues [12, 3]. However, the degree distribution of the Internet topology (at any of the levels cited above) is not readily available: one only has access to samples of these graphs, obtained through measurement procedures which are intricate, time and resource consuming, and far from complete. Even more important, these samples are biased

Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

Embed Size (px)

Citation preview

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 1/17

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 2/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

152

by the m easurem ent procedure, which may have a st rong influence on the observeddegree d ist r ibu t ion [9, 1, 17, 27].

As a consequence, the curr ent sit ua t ion regarding the degree d ist r ibu t ion of theInternet is unclear [27, 19, 16, 18]. The relevance of ob t a ined samples regard - ing thedegree d ist r ibu t ion observed from t hem is far from being est ab lished. In p art icu lar , t hereis a cont roversy on whet her the Internet t op olog y may have a homogeneous (typ ica llyPoisson), or het eroge neous (t yp ically power-law) degree d ist r ibu t ion [16, 27]. In order toob t a in an answer to t h is quest ion, the m ost widely used approach curr ent ly is tocondu ct larger and larger m easurem ent s, in the exp ec t a t ion that t hese will lead toaccu ra t e observa t ions [18, 24, 8]. How- ever, t h is may be a dead end: the degreed ist r ibu t ion may be int r insically b iased by the m easurem ent process [1, 17] and inpract ice i t may depend much on t he sample size [18 ].

We explore here a com p let ely d iff erent approach: we consider a simple m odel of Internet t op olog y m easu rem ent s and try to derive the t ype of the degree d ist ribu t ionof the underlying graph from t h is lim it ed observa t ion. Our basic goal t heref ore is toanswer the following quest ion : given the lim it ed in f orm a t ion ob t a ined fromm easurem ent , does the underlying t op olog y more likely have a power-law or a Poissondegree d ist r ibu t ion?

In many cases (t r ace r ou t e m easurem ent s, BGP t ab les, and AS-level t r ace - r ou t e ,typ ica lly) , the m easurem ent process may be app roxim a t ed by a BF S (B read t h FirstSearch) t ree from a given node of the net work . Indeed, Int ernet m easu rem ent s m ost lyconsist in s et s of routes (i.e. paths in the considered t op olog y) going from am on it or to a set of t arget s, collec t ed from as m any m on it ors as possible. Since eachrou t e is modeled as a s hort est path and s ince one may exp ec t rou t es to have long commonprefixes, the view from each m on it or may be app roxim a t ed by a BFS. Alt hough t h is is arough app roxim at ion, in t he lack of a widely acce p t ed and better s olu t ion, it has been usedin many occ asions [17, 22, 1, 25, 9].

Deciding on the degree d ist r ibu t ion from such a view is a challenging t ask , and wewill make a few assum p t ions in order to make a first st ep t owards t h is amb it ious goal.We first assume that the size of the graph, i.e. it s number of nodes n, is given. In thecase of the Internet, t h is is a reasonable assump t ion [25, 14]. In add it ion, we willassume that the underlying graph is a rand om graph wi t h ei t her a Poisson or power-law degree d ist r ibu t ion. And finally, we assume that we have a complete BFS of theconsidered graph: all nodes (bu t not all links) in the graph are reached by theexp lora t ion. Finally, we assum e that the number of links m of the graph is known. It isclear that the last t wo assum p t ions are very s t rong, and are not attainable in pract ice .We however consider t hem as reasonable for a first app roxim a t ion, and give h int s of howt o ge t rid of the knowledge of the number of links in the last sec t ion of the pap er .

This paper cont a ins six main s ect ions. We describe our m et hodolog y in Sec t ion 2.

It relies on several s t ra t egie s to infer a degree d ist r ibu t ion f rom a BFS, which wedet a il in Sec t ion 3. In Sec t ions 4 and 5 we exper im ent ally eva lua t e the va lid it y of ourapproach on model graphs and real-world graph s resp ec t ively. In Sec t ion 6, we proposea m et hod to reduce the in it ia l in f orm a t ion needed by our approach. We discusslim it a t ions and f u t ure work in Sec t ion 7.

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 3/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

153

Figure 1: Scheme of our m et h o d . G is an unknown graph on w h ic h we perform am ea su r em en t which gives its number of nodes n, it s number of links m and a BFS T . Wet h en consider t wo d iff e r en t h y p o t h eses: G has a Poisson degree d is t r ibu t ion w it h averaged eg r ee λ or it has a power- law degree d is t r ibu t ion w it h ex p on en t α . W e build t wo graphsG1 and G2 each w it h a st r a t egy in a cc o r d a n ce w it h the corresponding h y p ot h esis . We t h en compare the d eg r ee d is t r ibu t ion of G1 to the ex p ec t ed one of G if hy p ot h esis 1 is t r u e , andthe one of G2 to the ex p ec t ed one of G if h y p o t h esis 2 is t r u e . The h y p o t h esis which leads tothe m ost similar degree d is t r ibu t ion s is ex p ec t ed to be co rr ec t .

2. M e t h o d olog yOur m et hodolog y is s k et ched in Figure 1. It aims at deciding the t yp e of the

degree d ist r ibu t ion of an unknown graph G from one of it s BFS t ree T , it s number of nodes n and it s number of links m ob t ained t h rough a m easu rem ent . To do so, we

consider the two following hyp ot heses:

For each hyp ot hesis, we build a graph according to a s t ra t egy which we det a il inSec t ion 3, t hu s ob t a in ing G1 and G2, resp ec t ively. Our exp ec t a t ion is that if hyp ot hesis

H 1 is true (G is Poisson) t hen the degree d ist r ibu t ion P G1 of G1 will be closer to the

t heoret ica l d ist r ibu t ion P (1) than P G2 to P ( 2 ) , and conversely if H 2 is true (G ispower-law) t hen the degree d ist r ibu t ion P G2 of G2 will be closer to P (2) than P G1

to P ( 1) . To compare two d ist r ibu t ions, we will use d iff erent d ist ances D (we will

define several b elow ) and compare D(P ( 1) , P G1 ) to D(P ( 2) , P G2 ) .

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 4/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

154

We exp er im en t a lly assess the va lid it y of t h is approach by applying it to cases where weknow the original graph G (we ob t ain such graphs using models in Sec- t ion 4 and usingreal-world data in Sec t ion 5). We t hen compare the expec t ed t heoret ica l degreed ist r ibu t ion to the ones of the graphs ob t a ined from each st ra t egy and check conformance of resu lt s wi t h exp ec t a t ions. Comparing d ist r i- bu t ions is a challenge it self ,for which no general au t om a t ic procedure is com - monly acce p t ed. To perform t h is, wewill use two co m p lem ent ary app roaches: visual insp ec t ion of p lot s of the inversecumu la t ive degree d ist r ibu t ions (IC DD ) , t oge t her wi t h t wo classical s t a t ist ica l t est s, theKolmogorov-Smirnov test [11 ] and St a t ist ica l Dist ance [5] d ist ances, defined as:

In Sec t ion 6, we ex t end our approach to get in f orm at ion on the degree d is- t ribu t ionof a graph when the number of links is not known, which is a m ore rea list ic assum p t ion. In that case we use our rebuilding s t rat egie s for a wide range of possible values of mand infer the m ost probable typ e of degree d ist r i- bu t ion as previously, see Figure 2.

3. Building s t r a t e gi e sSt art ing from a BFS T of a graph G wi t h n nodes and m links and an

hyp ot hesis (H1 or H2) on the degree d ist r ibu t ion of G (t yp e Poisson or p ower- law), our

obj ect ive here is to it era t ively add m − n + 1 links to T in order t o build a graph G

wit h n nodes, m links, and degree d ist r ibu t ion similar to t he one of G. We defined iff erent link add it ion s t ra t egie s according to the s upp osed t yp e of G, Poisson or power-law. And in each case, we show how to com pu t e the degree d ist r ibu t ion of theresu lt ing graph.

3.1. H1: G is Poisson (RR st r a t egy)

We may t h ink of building a graph G′ similar to G by using a variant of the ER

const ruct ion : starting wi t h the n nodes and n − 1 links of a BFS t ree T , the m − n + 1missing links are randomly added as in the ER model. Bu t then T may not be a

possible BFS of G′ : any link in G which is not in T is necessarily between t wo nodes

in consecu t ive levels of T , or in the same level of T (ot herwise T would not be a

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 5/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

155

short est path t ree and t hu s not a BFS, see Figure 3). In order to ensure that T is also

a possible BFS of G′ we t heref ore add links only b etwee n nodes in consecu t ive levels or

in the same level. Since b ot h ext rem it ies of links are randomly chosen, we call t h is

const ruct ion the RR (Random-Random) s t ra t egy.

Figure 2: Scheme of our m et h o d , in the case when m is un k n ow n . G is an unknown graphon which we perform a m ea su r em en t w h ic h gives its number of nodes n and a BFS t r ee T .We t h en con si d e r t wo d iff e r en t t y p e of h y p o t h eses: (H1) G has a Poisson d eg r ee d is t r ibu t ion w it h average degree λ or (H2) it has a p ow e r -law d eg r ee d is t r ibu t ion w it h ex p on en t α . B o t h p a r a m et e r s λ and α a r e eq u iva le n t to the number of links m. Then we build t wo families of graphs G 1 , m and G 2 , m in accordance w it h t h ese t wo h y p o t h esis a nd d iff e r en t m. We t h encompare the degree d is t r ibu t ion of G 1 , m t o t h e ex p ec t ed one of G if (H1) is true, and theone of G 2 , m t o t h e ex p ec t ed one of G if (H2) is true. The hy p o t h esis which leads to t h e m ost similar degree d is t r ibu t ion s is ex p ec t ed to be c o rr ec t .

Figure 3: Le t us consider a part of a BFS of G, composed of a se t o f links A-B, A-c, B-D, C-Ea nd

C-F. Links B-E and D-E may also be p r ese n t in G, but n o t link A- F .

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 6/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

156

We now show that the exp ect ed node degree d ist r ibu t ion of G′ ob t a ined wi t h

the RR s t ra t egy can be d irec t ly com pu t ed from n, m and T wi t hou t exp lici t ly

const ruct ing G′ .

Theorem 1. Given a tree T with n nodes, if we construct a graph G′ using the

RR strategy, then the expectation of a node v with degree l in G′ is:

As each newly added link im pact s t wo nodes, the values of P (k → l, j ) are

ind ep end ent from the ot her P (k ′ → l

′ , j′ ). We will consider that t h is im pact can be

ignored, when n and m is large enough. From t h is resu lt , one may est im a t e the

exp ec t a t ion of the degree d ist r ibu t ion of G′ from n , m and T , wi t hou t const ruct ing

it exp lici t ly. This is of high int erest in pract ice , since it allows to com pu t e the

exp ec t a t ion of degree dist r ibu t ion and compare it wi t h degree d ist r ibu t ions ob t a ined by the const ruct ing st ra t egie s. Going f ur t her would however need precise resu lt s on theexp ec t a t ion of degrees in T , which is a d ifficu lt problem [1].

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 7/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

157

3.2. H2: G is power-law (PP st r a t egies )Suppose now that G is a power-law graph. We t heref ore aim at design ing a process

which builds from a BFS T of G, a graph G′ wi t h power-law node degree

d ist r ibu t ion. To do t h is, as before, we add m − n + 1 links b etween nodes in app ropr ia t e levels of T . However, t hese pairs of nodes are no longer chosen uniformly at random.Inst ead, we use a s elec t ion scheme inspired from the p ref - erent ia l attachment of theclassical Barab asi-Alb er t model [4]: we choose ( in the a pp rop r ia t e levels) nodesrandomly wi t h p robab ility p rop ort iona l to t heir degree in T . As we choose b ot h ext rem it ies of added links according to p ref - e rent ia l attachment, we call t h is procedurethe PP (P ref eren t ial-P ref erent ia l) st ra t egie s.

We now show how to com pu t e, for t h is s t rat egy, the exp ec t ed ob t ained degreed ist r ibu t ion.

Theorem 2. Given a tree T with n nodes, if we construct a graph G′ using the

PP strategy, then the expectation of a node v with degree l in G′ is:

′ where n j k is the number of nodes with degree k at level j in T and P (k → l, j, m′ ) is the

probability that a node v with degree k at level j in T is constructed a s a node withdegree l in G

′ after m′ links have been added into T .

Computation process.. The t erm P (k → l, j, m′ ) may be ob t a ined recursively:

We s p lit P (k → l, j, m′ ) int o t wo parts which correspond to t wo cases (linked to t h is

node or not) when a new link is added. The probab ilit y that a newl y added link changes the node from degree k to k + 1 is denot ed by θ .

In the following, we com pu t e θ using four t erm s:

• θ 1 : the p robab ilit y that the node v is s elec t ed as the first endp oin t of t he newlyadded link .

• θ 2 : the probab ilit y that a node at upper level j − 1 is s elec t ed as the firstendp oint , and v is s elec t ed as the second endp oint .

• θ 3 : the p robab ilit y that a node (excep t v) at level j is s elec t ed as the firstendp oint , and v is s elec t ed as the second endp oint .

• θ 4 : the probab ilit y that a node at lower level j + 1 is s elec t ed as thefirst endp oin t , and v is s elec t ed as the second endp oint .

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 8/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

158

The sum of degrees of all nodes at the level i af t er t links have been add ed is di, t .In the following, we give the det a ils of how to ca lcu la t e for the case of PP s t ra t egy.

Using equa t ion (9) in the expressions (8) shows that d j , t is a f un ct ion of d j′

, t − 1 ,which can be calcu la t ed by dynamic programm ing t echn iques.

Our com pu t a t ion process is not exact , since we have neglec t ed to t ak e int o acco un tpossible collision, i.e. posit ions to be s elec t ed several t im es. Howe ver , since we dealwit h sparse graphs, in the case of model graphs as well as rea l graphs, the number of links to be added to the BFS t ree is much smaller t han the number of possibleposit ions, and so t here are very few collisions. From a pract ica l view p oint , in thebuilding process, we just ignore mu lt ip le link s.

4. V a li d a t io n using model g r a ph s

Our exp ec t a t ion is that the s t ra t egie s described in previous s ec t ion succee d in building agraph G

′ similar (regarding degree d ist r ibu t ion ) to G when t he app rop r iat e st rat egy is used wit h an app rop r ia t e graph (RR if G is Poisson, PP if G is power-law).

In add it ion, we exp ec t that the degree d ist r ibu t ion of G′ will differ s ign ificant ly from

that of G if a wrong s t ra t egy is applied (RR if G is power-law, PP if G is Poisson). Int h is s ec t ion we condu ct exp er im ent s on model graphs, i.e. random graphs in the classes of Poisson graphs or power-law graph s wit h given param et ers (average degree and

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 9/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

159

P P

ICDD

ICD

exp onent resp ec t ively) . To ensure that the BFS covers all nodes of the graph, we use asof t ware which genera t es random simple connected graphs according to a given degreesequence (sam p led from the given degree d ist r ibu t ion) [26 ].

1 1G G

RR RR PP PP

0.8

0.8

0.6 0.6

0.6

0.4 0.4

0.4

0.2 0.2

0.2

00 2 4 6 8 10 12 14

degree

0

00 5 10 15 20 25

degree

Figure 4: R ec on st r u ct ion of a graph Poisson 3 and Poisson 10 . W e draw the ICDD (InverseC u m u la t ive Degree D is t r ibu t ion ) f o r t h r ee graphs: the original graph G and the ones ob t a in ed w it h the RR and the PP st r a t eg ies . H o r iz on t a l axis: degree k; ve r t ic a l a x is: f r a ct ion of thenumber of nodes w it h degree lower than or equal t o k .

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 10/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

160

4.1. Poisson g r a ph s

In Figure 4 we present the resu lt s for Poisson graphs wit h average degrees 3 and 10 .The degree d ist r ibu t ion ob t a ined wi t h RR s t ra t egy is closer to the or iginal one, as

exp ec t ed. This is confirmed by the KS and SD s t a t ist ics (Table 1): t he sm allest values areob t a ined wi t h RR s t ra t egy.

Not ice that a Poisson graph wi t h a higher degree gives better resu lt s. T h is is probablydue to the f act that we add more links in t h is case, and so s t ra t egie s for doing t h is makemuch more d iff erence.

degree

Figure 5: Reconstruction of a graph power-law 2.1 and 2.3. We draw the ICDD for three graphs: the original graph G,and the ones obtained with the RR and the PP strategies.

.Finally, we conclude that our m et hod succeeds in recognizing random P ois- son graphs.

This is true for all average degrees, but performs best on graph s wit h a rela t ively highaverage degree .

4.2. Power-law gr a ph sSimilar to Poisson model graphs, we condu ct our exp er im ent s wi t h p ower- law model

graph s.In Figure 5 we present ob t a ined resu lt s for power-law graphs wit h exp onen t2.1 and 2.3. To better show the charact er ist ic of the power-law, all p lot s are in log-log scale.Bot h the ICDD p lo t and the statistic test (Table 2) s upp ort our conclusion for allexp onent s and all s izes.

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 11/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

161

ICDD

Table 2: KS and SD for p ower -law model graph sPower-law 2.1 Power-law 2.3

n = 10000 n = 100000 n = 10000 n = 100000K S SD KS S D K S SD K S S D

RR 0.201 0 .432 0 .194 0 .405 RR 0.278 0 .591 0 .274 0 .553PP 0.038 0 .138 0 .049 0 .180 PP 0.030 0 .086 0 .024 0 .095

1 1G G

RR RR PP PP

Figure 6: ICDD of the Skitter and Radar graphs G and the ones obtained with RR and PPstrategies.

Table 3: KS and SD for Sk itt er graph.

K S SDt heory-RR 0.166 0 .359t heory-PP 0.082 0 .235

5. E x p e r im e n t s on real-world d a t aPrevious s ec t ion shows that our m et hod succeeds in making the d iff erence b et wee n

Poisson and power-law random graphs. It is clear however that, in p ract ice , consideredgraphs have neit her p er f ec t Poisson nor power-law degree d ist r ibu t ion, and are not rand om .

We consider in t h is sect ion several real-world da t aset s among the curr ent largestm easurem ent s of the Internet t op olog y. Alt hough ob t a ined graphs are st ill partial viewsand probably are s t rongly biased, t hey const it u t e curr ent st a t e-o f -t he-art of availabledata and we use t hem as b enchm ark s.

Like in previous s ect ion, for each case of real-world graph G, we consider t wo hyp ot heses: (H1) G has a degree d ist r ibu t ion close to a Poisson law; (H2) G has a

degree d ist r ibu t ion close to a power-law. Using s t ra t egie s RR and P P ,

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 12/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

162

5.1. Skitter gr a phWe first try our m et hod on an AS-level map collec t ed by the Sk itt er pro jec t of

CAIDA [15]. The ob t a ined graph has 5776 nodes and 12025 link s.Figure 6 ( lef t ) shows the ICDD ob t a ined wi t h our s t ra t egie s and shows t ha t the ent ire Sk itt er graph follows a degree d ist r ibu t ion of typ e power-law (a p erf ec t power-law in alog-log scale is a straight line). The d ist r ibu t ion ob t a ined wi t h RR s t ra t egy is clearly farfrom a power-law. The one ob t a ined wi t h PP s t ra t egy is much closer. Table 3 confirmst h is, even t hough the difference b etwee n t he RR s t ra t egy and the PP s t ra t egy is asst rong as for model graphs (see Table 1 and 2) .

Finally, our m et hod succeeds in deciding that the Sk itt er graph has a degreed ist r ibu t ion close to a p ower-law.

Table 4: KS and SD for R ad a r graph s.

KS S D K S SD KS S DCm J ap on O r t olanRR 0.064 0 .151 RR 0.074 0 .202 RR 0.062 0 .156PP 0.241 0 .521 PP 0.163 0 .363 PP 0.213 0 .476

5.2. Radar gr a ph sA radar graph is a part of the Internet t op olog y observed by periodic runn ing

t racerou t e-li k e m easu rem ent s from one m on it or to a set of t arge t s during s evera l weeks,See [22] for det a ils and the original data. We use here t hree inst ances of radar graphs,from t h ree d iff erent m on it ors: Cm (21430 nodes and 37938 links), Japon (27066nodes and 77545 links) and Ort olan (24725 nodes and 48516 link s) .

In Figure 6 (right), we p lot the ICDD of Radar-Japon graph and t hose o b t a inedby the RR and the PP s t rat egie s. The shape of the original IC DD ind ica t es that thedegree d ist r ibu t ion of the underlying graph is likely to be a m ixt u re of b ot h Poisson andpower-law .

Table 4 shows numerical resu lt s for the t h ree radar graphs. All resu lt s show that thedifference between the RR s t ra t egy and it s corresponding t heoret ica l d ist r ibu t ion issmaller than that of the PP s t ra t egy. Therefore our m et hod decide that the t yp e of the radar graphs is more likely Poisson even t hough t heir d ist r ibu t ions have a long-t a il. Not e that the difference b etween RR and PP are much smaller than for previouscases, t hu s ind ica t ing that the con fidence in the conclusion is poo r .

Finally, our m et hod decides that Radar graphs are more likely to have P ois- son degreed ist r ibu t ion, but wit h poor confidence (which is in accordance wi t h t heir act ua l t yp e) .

6. Deciding wi t h o u t m

The goal of t h is sec t ion is to decide the t yp e of degree d ist r ibu t ion and est im a t e mfor an unknown graph G, using only n and one of it s com p let e BFS t ree T . Weproceed as follows: for each building s t ra t egy, we com pu t e the KS d ist ance b et ween

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 13/17

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 14/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

164

Table 6: R esu lt s for power-law model graph s

n= 1000 K S m m ’ n= 10000 K S m m ’

power-law 2.10 0 .07 4233 3002 power-law 2.10 0 .06 42326 28129power-law 2.15 0 .09 3662 2936 power-law 2.15 0 .04 36622 25043power-law 2.20 0 .07 3220 2550 power-law 2.20 0 .03 32198 25043power-law 2.25 0 .07 2873 2418 power-law 2.25 0 .03 28730 21085power-law 2.30 0 .06 2598 2168 power-law 2.30 0 .03 25982 21377power-law 2.35 0 .05 2378 2200 power-law 2.35 0 .02 23780 20534power-law 2.40 0 .06 2200 2109 power-law 2.40 0 .02 21996 20274

Figure 8: KS for a power-law model graph w it h ex p on en t 2.2. The le f t figure is the es t im a t ion w it h the RR st r a t egy w it h a m in im u m g r ea t e r than 0.1. The r igh t figure is the es t im a t ion w it h the PP st r a t egy w it h a minimum less than 0 . 1

6.2. Power-law gr a ph sIn Table 6 and Figure 8, we give the resu lt of our exp er im ent s for p owe r-law graphs

wit h exponent s b et ween 2.10 and 2.40 .Again, the m et hod succeeds in deciding the a pp ropr ia t e t yp e of the graph : wi t h PP,

the value of the KS test is smaller than wit h RR. The est im a t e m′ is reasonable, but

lower than the a ct ual va lue.

6.3. Real-world gr a ph sFinally, we apply our m et hod on some real world graph, see Table 7 and Figure 9.

Table 7: R esu lt s for rea l-wo r ld graph s

RR PP m m

Sk itt er 0.109 0 .091 12822 7683Rada r-Or lan t o 0 .150 0 .435 48516 49450Rada r-J ap on 0.067 0 .162 77545 79041

Rada r-Cm 0.051 0 .225 37938 49289

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 15/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

165

Figure 9: KS f o r Sk itt e r -A S g r a ph .

As before, our m et hod succeeds in deciding that the Sk itt er graph is very close to apower-law graph, while radar graphs are closer to a Poisson graph, bu t are act ua lly in

between, the b ot h cases, we ob t a in reasonable est im a t es m′ of the number of links m .

7. C o n cl u s io n

In t h is paper, we present ed a new approach to decide on the t yp e of t he degreed ist r ibu t ion of a graph when only a BFS and it s size are known. We use d iff erentst rat egie s to reco nst ruct a graph from a BFS, according to the p resup - posed t yp e of thegraph. We t hen show how t hese s t ra t egie s allow to decide b et ween Poisson and power-law degree d ist r ibu t ed graphs, wi t h exp er im ent s on random graphs as well as real-worldgraph s.

We worked on two set s of hyp ot heses: we first supposed that, in add it ion to a BFST on the n ver t ices of the unknown original graph G, we know it s number of edgesm: in t h is case we formally com pu t e the exp ec t ed d egree d ist r ibu t ion of the graphs

ob t a ined wi t h the d iff erent reco nst ruct ing s t rat egie s; moreover exp er im ent s on randomgraphs confirm that the knowledge of a BF S and m allows to recognize the t yp e of thegraph. Second, we use the sa m e reco nst ruct ion st ra t egie s wi t hou t knowing the numberof edges m, and we can st ill reasonably pred ic t the t yp e of the original graph.

The main lim it a t ion of our cont r ibu t ion is that we suppose the knowledge of acom p let e BFS of the unknown graph. This hyp ot hesis is not realist ic , t hough, sincepract ica l m easu rem en t s rather provide only paths to a sub set of the real graph vert ice s.Our main perspec t ive t heref ore is to reduce requ irem ent s on data, and designst ra t egie s needing truncated BFS (BFS un t il a cert a in level) only, and partial BFS(that do not cont a in all vert ice s of the graph). Assum ing that such m easurem ent s areavailable is much more rea list ic[22 ].

In f u t u re work, we also want to improve our m et hod by invest iga t ing var iousrefinem ent s: (i) design reco nst ruct ion s t ra t egie s t ak ing int o acco un t more s ub t lepropert ies of the BFS, such as node d ist r ibu t ion on each level, and also ot her localpropert ies such as the clust er ing coe fficient ; (ii) ext end the reco nst ruct ion st ra t egie s inthe d irec t ion of ot her t yp es of d ist r ibu t ions: a first st ep would b e to mix RR and PPst ra t egie s for mixed Poisson power-law graphs, but we also aim at invest iga t ing power-laws wi t h exponent ia l cu t -off , and ot her laws.

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 16/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

166

Ackn owledgeme n t sThis work is partly funded by the European Commission t hrough the E U- LER

pro ject (grant 258307), part of the Future Internet Research and Exper i- mentation(FIRE) obj ec t ive of the Sevent h Framework Programme (F P 7) .

R e f e r e n ce s

[1] D. A ch liop t as, D. Kempe, A. C lau set , and C. Moore. On the bias of t racer - o u t e sampling: or,power-law degree d ist r ibu t ions in regular graphs. A CM Symposium on Theory of Computing,pages 694–703, 2005 .

[2] R. A lb er t , H. Jeong, and A.-L. Barabasi. Error and attack t ole ran ce of complex net work s. Nature, 406(6794):378–382, July 2000 .

[3] D. Alderson, L. Li, W. Willinger, and J. C. Doyle. U nd erst and ing int ernet topology:principles, models, and va lida t ion. IEEE / ACM Transactions on Networking, 13(6):1205–1218,December 2005 .

[4] A. L. Barab asi, R. A lb er t , and H. Jeong. Mean-field t heory for s ca le- f ree random net work s. Physica A: Statistical Mechanics and its App lica tions, 272(1-2):173–187, O ct ob er 1999 .

[5] M. Basseville. Dist an ce measures for signal processing and pattern recog- n it ion. SignalProcess., 18:349–369, December 1989 .

[6] N. Berger, C. Borgs, J. T. Chayes, and A. Saberi. On the spread of viruses on the int ernet . Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms, pages301–310, 2005 .

[7] B. Bollobas. Random Graphs. Cambridge U n iversity Press, January 2001 . [8] K. Claffy, Y.

Hyun, K. Keys, M. Fomenkov, and D. Krioukov. Int ernet Mapping: From Art to Science.Cybersecurity Applications and Technology Conference for Homeland Security, pages 205–211,March 2009 .

[9] L. Da ll’ast a, I. Alvarez-Hamelin, A. Barrat, A. Vazquez, and A. V esp ignan i. Exploring net work swit h t racerou t e-li k e probes: Theory and s imu la t ions. Theoretical Computer Science,355(1):6–24, April 2006 .

[10] B. Donn et and T. Friedman. Internet t op olog y discovery: A survey. Com- municationsSurveys & Tutor ia ls, IEEE, 9(4):56–69, 2007 .

[11] W. T. Eadie, D. Drijard, F. E. James, M. Roos, and B. Sad ou let . S ta tistica l methods inexperimental physics. American Elsevier Publishing Co, J un e 2008 .

[12] A. Fabrikant, E. Kou t soup ias, and C. P apad imit r iou. Heu r ist ica lly Op t i- mized Trade-Offs:A New Paradigm for Power Laws in the Internet. Au- tomata, Languages and Programming,2380:781, January 2002 .

[13] M. Fa lou t sos, P. Fa lou t sos, and C. Fa lou t sos. On power-law rela t ionsh ips of the Internett op olog y. SIGCOMM: Conference on Applications, technolo- gies, architectures, andprotocols for computer communication, 29 (4) :251– 262, Oct ob er 1999 .

7/31/2019 Deciding on the Type of the Degree Distribution of a Graph from Traceroute-like Measurements

http://slidepdf.com/reader/full/deciding-on-the-type-of-the-degree-distribution-of-a-graph-from-traceroute-like 17/17

International Journal of Computer Networks & Communications (IJCNC) Vol.4, No.3, May 2012

167

[14] J. Heidemann, Y. Pradkin, R. Govindan, C. Papadopoulos, G. Bar t lett , and J. Bann ist er .Census and Survey of the Visible Internet. 8th A CM SIGCOMM conference on Internetmeasurement, pages 169–182, 2008 .

[15]D. Krioukov, F. Chung, K. Claffy, M. Fomenkov, A. Vespignani, and W. Willinger. TheWorkshop on Internet Topology (WIT) R ep or t . S IG- COMM Computer CommunicationReview, pages 69–73, Dec 2007 .

[16] B. K r ishna mur t hy and W. Willinger. What are our s t anda rds for va li- da t ion of m easurem ent -ba sed n et work ing research? ACM S IGME T RI CS Performance EvaluationReview, 36:64–69, A ugust 2008 .

[17] A. Lakhina, J. W. Byers, M. Crovella, and P. Xie. Sampling biases in IP t op olog ym easurem ent s. INFOCOM: 22th Conference on Computer and Communications Societies,1:332–341, 2003 .

[18] M. L a t ap y and C. Magnien. Complex N et work Measurem ent s: E st im a t ing the Relevance of Observed P rop er t ies. INFOCOM: 27th Conference on Computer Communications, pages1660–1668, May 2008 .

[19] L. Li, D. Alderson, W. Willinger, and J. Doyle. A first -p r incip les ap - proach tound erst and ing the int ernet ’s rou t er-le vel t op olog y. S IGCO MM : Conference on Applications,technologies, architectures, and protocols for computer communications, 34(4):3–14, Oct ob er 2004 .

[20] C. Magnien, M. L a t ap y, and J.-L. Guillaume. Im pa ct of Random Fa ilures and A tt ack s onPoisson and Power-Law Random N et work s. ACM Com- puting Surveys, Aug 2009 .

[21] D. Magoni and J.-J. P an siot . Influence of N et work Topology on P rot ocol Si mu la t ion. In terna t iona l Conference on Networking, 2093:762–770, J un e 2001 .

[22] F. Ou edraogo , C. Magnien, and M. L a t ap y. A radar for the int ernet . 1st In terna tiona l Workshop on Analysis of Dynamic Networks, 2008 .

[23] R. P ast or -sa t orr as and A.Vespignani. Evolution and Structure of the In ter - net: A StatisticalPhysics Approach. Cambridge U n iversity Press, 2004 .

[24] Y. Shavitt and E. Shir. DIMES: let the int ernet measure it self . SIGCO MM ComputerCommunication Review, 35(5):71–74, 2005 .

[25] F. Viger, A. Barrat, L. Da ll’As t a, C. H. Zhang, and E. D. Kolaczyk. W ha t is the real size of a sampled net work? The case of the Internet. P hysica l Review E, 75 (5) :056111 + , May 2007 .

[26] F. Viger and M. L a t ap y. E fficient and simple genera t ion of random s im- ple conn ect edgraphs wit h prescribed degree sequence. 11th In terna tiona l Computing and CombinatoricsConference, 3595:440–449, 2005 .

[27] W. Willinger, D. Alderson, and J. C. Doyle. M a t hem a t ics and the Int er - net : A Source of Enormous Confusion and Great P ot ent ia l. Notices of the American Mathematical Society, 56,May 2009 .