26
Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur Berger * , Geoffrey Xie Naval Postgraduate School * MIT/Akamai February 9, 2011 CAIDA Workshop on Active Internet Measurements R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 1 / 22

Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

  • Upload
    others

  • View
    10

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Primitives for Active Internet Topology Mapping:Toward High-Frequency Characterization

Robert Beverly, Arthur Berger∗, Geoffrey Xie

Naval Postgraduate School∗MIT/Akamai

February 9, 2011

CAIDA Workshop on Active Internet Measurements

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 1 / 22

Page 2: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

The Problem Motivation

Internet Topology

Long-standing question: What is the topology of the Internet?

Difficult to answer – Internet is:

A large, complex distributed system (organism)

Non-stationary (in time)

Difficult to observe, multi-party (information hiding)

Poorly instrumented (not part of original design)

⇒ Poorly understood topology (interface, router, or AS level)

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 2 / 22

Page 3: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

The Problem Motivation

Internet Topology

Long-standing question: What is the topology of the Internet?

Difficult to answer – Internet is:

A large, complex distributed system (organism)

Non-stationary (in time)

Difficult to observe, multi-party (information hiding)

Poorly instrumented (not part of original design)

⇒ Poorly understood topology (interface, router, or AS level)

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 2 / 22

Page 4: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

The Problem Challenges

What is the topology of the Internet?

Why care?

Network Robustness: to failure, to attacks, and how to bestimprove. (antithesis – how to mount attacks)

Impact on Research: network modeling, routing protocolvalidation, new architectures, Internet evolution, etc.

Easy to get wrong (see e.g. “What are our standards for validationof measurement-based networking research?” [KW08])

These challenges and opportunities are well-known. We bring somenovel insights to bear on the problem.

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 3 / 22

Page 5: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

The Problem Challenges

Our Work

Our focus:

Active probing from a fixed set of vantage points

High-frequency, high-fidelity continuous characterizationUse external knowledge and adaptive sampling to solve:

Which destinations to probeHow/where to perform the probe

This Talk:1 Characterize production topology mapping systems2 Develop/analyze new primitives for active topology discovery

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 4 / 22

Page 6: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

The Problem Measurement Techniques

Archipelago/Skitter/iPlane

Production Topology Measurement

Ark/Skitter (CAIDA), iPlane (UW)

Multiple days and significant resources for complete cycle

Ark probing strategy:

IPv4 space divided into /24’s; partitioned across ∼ 41 monitors

From each /24, select a single address at random to probe

Probe == Scamper [L10]; record router interfaces on forward path

A “cycle” == probes to all routed /24’s

Investigate one vantage point (Jan, 2010):

Ark iPlaneTraces 263K 150KProbes 4.4M 2.5MPrefixes 55K 30K

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 5 / 22

Page 7: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

The Problem Measurement Techniques

Path-pair Distance Metric

Q1: How similar are traceroutes to the same destination BGP prefix?

Use Levenshtein “edit” distance DP algorithm

Determine the minimum number of edits (insert, delete, substitute)to transform one string into another

e.g. “robert” → “robber” = 2

We use: Σ = {0,1, . . . ,232 − 1}

Each unsigned 32-bit IP address along traceroute paths ∈ Σ

ED=2129.186.6.251 129.186.254.131 192.245.179.52 4.53.34.13

129.186.6.251 192.245.179.52 4.69.145.12

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 6 / 22

Page 8: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

The Problem Measurement Techniques

Path-pair Distance Metric

Q1: How similar are traceroutes to the same destination BGP prefix?

Use Levenshtein “edit” distance DP algorithm

Determine the minimum number of edits (insert, delete, substitute)to transform one string into another

e.g. “robert” → “robber” = 2

We use: Σ = {0,1, . . . ,232 − 1}

Each unsigned 32-bit IP address along traceroute paths ∈ Σ

ED=2129.186.6.251 129.186.254.131 192.245.179.52 4.53.34.13

129.186.6.251 192.245.179.52 4.69.145.12

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 6 / 22

Page 9: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

The Problem Measurement Techniques

Path-pair Distance Metric

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 5 10 15 20 25

Cum

ulat

ive

Fra

ctio

n of

Pat

h P

airs

Levenshtein Edit Distance

Intra-BGP Prefix (Ark)Intra-BGP Prefix (iPlane)

Random Prefix Pair

Q1: How similar aretraceroutes to thesame destination BGPprefix?

∼60% of traces todestinations insame BGP prefixhave ED ≤ 3

Fewer than 50% ofrandom traceshave ED ≤ 10

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 7 / 22

Page 10: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

The Problem Measurement Techniques

Path-pair Distance Metric

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 5 10 15 20 25

Cum

ulat

ive

Fra

ctio

n of

Pat

h P

airs

Levenshtein Edit Distance

Intra-BGP Prefix (Ark)Intra-BGP Prefix (iPlane)

Random Prefix Pair

Q1: How similar aretraceroutes to thesame destination BGPprefix?

∼60% of traces todestinations insame BGP prefixhave ED ≤ 3

Fewer than 50% ofrandom traceshave ED ≤ 10

Confirms our intuition

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 7 / 22

Page 11: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

The Problem Measurement Techniques

Edit Distance

Q2: How much path variance is due to the last-hop AS?

Intuitively, number of potential paths exponential in the depth

More information gain at the end of the traceroute?

Rtr

Rtr

Rtr

Rtr

InternetMonitor Rtr Rtr

Rtr

Rtr

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 8 / 22

Page 12: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

The Problem Measurement Techniques

Edit Distance

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 5 10 15 20 25

Cum

ulat

ive

Fra

ctio

n of

Pat

h P

airs

Levenshtein Edit Distance (last-hop AS removed)

Intra-BGP Prefix (Ark)Intra-BGP Prefix (iPlane)

Random Prefix Pair

Q2: Variance due tothe last-hop AS?

Lob off last AS

Answer: lots!

For ∼ 70% ofprobes to sameprefix, we get noadditionalinformationbeyond leaf AS

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 9 / 22

Page 13: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

The Problem Measurement Techniques

Edit Distance

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 5 10 15 20 25

Cum

ulat

ive

Fra

ctio

n of

Pat

h P

airs

Levenshtein Edit Distance (last-hop AS removed)

Intra-BGP Prefix (Ark)Intra-BGP Prefix (iPlane)

Random Prefix Pair

Q2: Variance due tothe last-hop AS?

Lob off last AS

Answer: lots!

For ∼ 70% ofprobes to sameprefix, we get noadditionalinformationbeyond leaf AS

Significant packetsavings possible(DoubleTree)

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 9 / 22

Page 14: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Methodology

Adaptive Probing Methodology

Meta-Conclusion: adaptive probing a useful strategy

We develop three primitives:

1 Subnet Centric Probing2 Vantage Point Spreading3 Interface Set Cover

These primitives leverage adaptive sampling, external knowledge(e.g., common subnetting structure, BGP, etc), and data from

prior cycles to maximize efficiency and information gain of each probe.

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 10 / 22

Page 15: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Methodology

Adaptive Probing Methodology

We develop three primitives:

1 Subnet Centric Probing2 Vantage Point Spreading3 Interface Set Cover

Best explained by understanding sources of path diversity:

D2

D3AS Ingress

D1

Vantage Point

Vantage Point

Vantage Point

Vantage Point

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 11 / 22

Page 16: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Methodology

Subnet Centric Probing

Granularity vs. Scaling

∼ 232−1 possible destinations (2.9B from Jan 2010 routeviews)

What granularity? /24’s? Prefixes? AS’s?

Subnet Centric Probing

D2

D3AS Ingress

D1

Vantage Point

From a single vantage point, no path diversity into the AS

Path diversity due to AS-internal structure

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 12 / 22

Page 17: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Methodology

Subnet Centric Probing

D2

D3AS Ingress

D1

Vantage Point

Goal: adapt granularity, discover internal structure

Leverage BGP as coarse structure

Follow least common prefix: iteratively pick destinations withinprefix that are maximally distant (in subnetting sense)

Address “distance” is misleading: e.g. 18.255.255.100 vs.19.0.0.4 vs. 18.0.0.5

Stopping criterion: ED(ti , ti+1) ≤ τ ; τ = 3R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 13 / 22

Page 18: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Methodology

Subnet Centric Probing

1

10

100

1000

10000

100000

1 10 100 1000

Co

un

t

Degree

Prefix DirectedAS Directed

Subnet CentricArk (Ground Truth)

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Verticies EdgesD

iffe

ren

ce

fro

m A

rk G

rou

nd

Tru

th

Subnet-CentricPrefix-Directed

AS-Directed

Inferred degree distribution well-approximates ground-truth

Captures ≥ 90% of the vertex andedge fidelity

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 14 / 22

Page 19: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Methodology

Subnet Centric Probing

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Verticies Edges

Diffe

ren

ce

fro

m A

rk G

rou

nd

Tru

th

Subnet-CentricPrefix-Directed

AS-Directed

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Probes(load)D

iffe

ren

ce

fro

m A

rk G

rou

nd

Tru

th

Subnet-CentricPrefix-Directed

AS-Directed

Captures ≥ 90% of the vertex andedge fidelity

Using ∼ 60% of ground-truth load

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 15 / 22

Page 20: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Methodology

Vantage Point Spreading

Vantage Point Spreading

D2

D3AS Ingress

D1

Vantage Point

Vantage Point

Vantage Point

Vantage Point

Discover AS ingress points and paths to the AS via multiplevantage points

Random assignment of destinations to vantage points is wasteful

E.g. empirically, the 16 /24’s in a /20 prefix are hit on average by12 unique VPs

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 16 / 22

Page 21: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Methodology

Vantage Point Spreading

Vantage Point Spreading

D2

D3AS Ingress

D1

Vantage Point

Vantage Point

Vantage Point

Vantage Point

Using BGP knowledge, maximize the number of distinct VPsper-prefix

Note, this is complimentary to SCP

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 17 / 22

Page 22: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Methodology

Vantage Point Spreading

0

50

100

150

200

250

300

350

400

0 5 10 15 20 25 30 35 40

Un

iqu

e In

terf

ace

s D

isco

ve

red

Number of VPs for Destination

VP Influencey=10x

0

2000

4000

6000

8000

10000

12000

/22/21

/20

Ve

rtic

es

Size of Prefix from which /24s Drawn

VP SelectionSpreading

RandomSingle

Diminishing return of vantagepoint influence

Vertices in resulting graph as com-pared to random: ∼ 6% increase“for free.”

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 18 / 22

Page 23: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Methodology

Interface Set Cover

Interface Set Cover

As shown in preceding analysis, full traces very inefficient

Perform greedy minimum set cover approximation (NP-complete)

Select subset of prior round probe packets for current round

D2

D3

D1

Vantage Point

Vantage Point

Vantage Point

Vantage Point

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 19 / 22

Page 24: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Methodology

Interface Set Cover

Interface Set Cover

Generalizes DoubleTree [DRFC05] without parametrization

Efficient

Inherently multi-round

Additional probing for validation mis-matches (e.g. load balancing,new paths)

D2

D3

D1

Vantage Point

Vantage Point

Vantage Point

Vantage Point

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 20 / 22

Page 25: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Methodology

Interface Set Cover

0

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

0.1

1 2 3 4 5 6 7 8 9 10 11

Fra

ctio

n o

f M

isse

d In

terf

ace

s

Cycle Separation (Days)

Full Trace SetCoverISC

0.1

0.2

0.3

0.4

0.5

0.6

0.7

1 2 3 4 5 6 7 8 9 10 11

Lo

ad

Ra

tio

to

Fu

ll T

race

rou

tes

Cycle Separation (Days)

Full Trace SetCoverISC

20K random IP destinations eachday over a two-week period, frac-tion of missing interface using ISC

Uses ≤ 20% of the full probingload (∼ 30% of full trace set cover)

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 21 / 22

Page 26: Primitives for Active Internet Topology Mapping: Toward ...€¦ · Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization Robert Beverly, Arthur

Summary

Summary

Take-Aways:

Deconstructed Ark/iPlane topology tracing as case studyDeveloped primitives for faster, more efficient probing:

Subnet Centric Probing, Interface Set Cover, Vantage PointSpreadingSignificant load savings without sacrificing fidelity

Future

Combining our primitives on production system

Refine ISC “change-driven” logic

Build a better Internet scope to detect small-scale dynamics

Thanks!

Questions?

R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 22 / 22