Primitives for Active Internet Topology Mapping: Toward High-Frequency Characterization
Robert Beverly, Arthur Berger∗, Geoffrey Xie
Naval Postgraduate School, ∗MIT/Akamai
February 9, 2011
CAIDA Workshop on Active Internet Measurements
R. Beverly, A. Berger, G. Xie (NPS) Primitives for Active Topology AIMS-3 1 / 22
The Problem Motivation
Internet Topology
Long-standing question: What is the topology of the Internet?
Difficult to answer – Internet is:
A large, complex distributed system (organism)
Non-stationary (in time)
Difficult to observe, multi-party (information hiding)
Poorly instrumented (not part of original design)
⇒ Poorly understood topology (interface, router, or AS level)
The Problem Challenges
What is the topology of the Internet?
Why care?
Network Robustness: to failure, to attacks, and how to best improve. (antithesis – how to mount attacks)
Impact on Research: network modeling, routing protocol validation, new architectures, Internet evolution, etc.
Easy to get wrong (see e.g. “What are our standards for validation of measurement-based networking research?” [KW08])
These challenges and opportunities are well-known. We bring some novel insights to bear on the problem.
The Problem Challenges
Our Work
Our focus:
Active probing from a fixed set of vantage points
High-frequency, high-fidelity continuous characterization
Use external knowledge and adaptive sampling to solve:
Which destinations to probe
How/where to perform the probe
This Talk:
1. Characterize production topology mapping systems
2. Develop/analyze new primitives for active topology discovery
The Problem Measurement Techniques
Archipelago/Skitter/iPlane
Production Topology Measurement
Ark/Skitter (CAIDA), iPlane (UW)
Multiple days and significant resources for complete cycle
Ark probing strategy:
IPv4 space divided into /24’s; partitioned across ∼ 41 monitors
From each /24, select a single address at random to probe
Probe == Scamper [L10]; record router interfaces on forward path
A “cycle” == probes to all routed /24’s
Investigate one vantage point (Jan, 2010):
          Ark    iPlane
Traces    263K   150K
Probes    4.4M   2.5M
Prefixes  55K    30K
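The Ark probing strategy listed above (split routed space into /24's, pick one random address per /24, partition the /24's across monitors) can be sketched as follows. This is only an illustration: the function name, the round-robin partitioning, and the handling of prefixes longer than /24 are assumptions, not details taken from the slides.

```python
import ipaddress
import random


def ark_style_targets(routed_prefixes, num_monitors=41, seed=0):
    """Pick one random address per /24 and assign each /24 to a monitor.

    Assumptions (not from the slides): prefixes of length <= 24 are split
    into /24s, longer prefixes are skipped, and /24s are dealt to monitors
    round-robin.
    """
    rng = random.Random(seed)
    targets = []
    index = 0
    for prefix in routed_prefixes:
        network = ipaddress.ip_network(prefix)
        if network.prefixlen > 24:
            continue  # smaller than a /24; ignored in this sketch
        for slash24 in network.subnets(new_prefix=24):
            # One destination chosen uniformly at random inside the /24.
            address = slash24[rng.randrange(slash24.num_addresses)]
            monitor = index % num_monitors
            targets.append((monitor, slash24, address))
            index += 1
    return targets


if __name__ == "__main__":
    for monitor, slash24, address in ark_style_targets(["10.0.0.0/22"]):
        print(monitor, slash24, address)
```

A full "cycle" in this sketch is one pass over every tuple returned for all routed prefixes.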
The Problem Measurement Techniques
Path-pair Distance Metric
Q1: How similar are traceroutes to the same destination BGP prefix?
Use Levenshtein “edit” distance DP algorithm
Determine the minimum number of edits (insert, delete, substitute)to transform one string into another
e.g. “robert” → “robber” = 2
We use: $\Sigma = \{0, 1, \ldots, 2^{32} - 1\}$
Each unsigned 32-bit IP address along traceroute paths ∈ Σ
ED = 2:
129.186.6.251 129.186.254.131 192.245.179.52 4.53.34.13
129.186.6.251 192.245.179.52 4.69.145.12
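The path-pair metric above can be illustrated with a standard two-row dynamic-programming Levenshtein implementation that works on any sequence, so each traceroute hop plays the role of one symbol of Σ. The function name is an assumption made for this sketch. Hops are kept as dotted-quad strings, which is equivalent to comparing their 32-bit integer values.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or hop lists)."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute, or free match
            ))
        previous = current
    return previous[-1]


path_a = ["129.186.6.251", "129.186.254.131", "192.245.179.52", "4.53.34.13"]
path_b = ["129.186.6.251", "192.245.179.52", "4.69.145.12"]

if __name__ == "__main__":
    print(edit_distance("robert", "robber"))  # 2
    print(edit_distance(path_a, path_b))      # 2
```

For the two example paths, one deletion (129.186.254.131) and one substitution (4.53.34.13 to 4.69.145.12) account for the distance of 2.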
The Problem Measurement Techniques
Path-pair Distance Metric
[Figure: cumulative fraction of path pairs (0 to 1) vs. Levenshtein edit distance (0 to 25). Series: Intra-BGP Prefix (Ark), Intra-BGP Prefix (iPlane), Random Prefix Pair.]
Q1: How similar are traceroutes to the same destination BGP prefix?
∼60% of traces to destinations in the same BGP prefix have ED ≤ 3
Fewer than 50% of random traces have ED ≤ 10
Confirms our intuition
The Problem Measurement Techniques
Edit Distance
Q2: How much path variance is due to the last-hop AS?
Intuitively, number of potential paths exponential in the depth
More information gain at the end of the traceroute?
[Figure: a monitor, the Internet, and a set of routers (Rtr).]
The Problem Measurement Techniques
Edit Distance
[Figure: cumulative fraction of path pairs (0.1 to 1) vs. Levenshtein edit distance with the last-hop AS removed (0 to 25). Series: Intra-BGP Prefix (Ark), Intra-BGP Prefix (iPlane), Random Prefix Pair.]
Q2: Variance due to the last-hop AS?
Lob off last AS
Answer: lots!
For ∼70% of probes to same prefix, we get no additional information beyond leaf AS
Significant packet savings possible (DoubleTree)
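"Lob off last AS" can be pictured as trimming the trailing run of hops that belong to the same AS as the final hop, before computing the edit distance. The sketch below is an assumption about how that step might look. `asn_of` is a hypothetical IP-to-AS lookup supplied by the caller, and the addresses and AS numbers are documentation-range toy values, not data from the talk.

```python
def strip_last_hop_as(path, asn_of):
    """Drop the trailing run of hops that map to the same AS as the final hop.

    `asn_of` is a hypothetical callable mapping an interface address to an
    AS number (for example, backed by a BGP table); it is not from the slides.
    """
    if not path:
        return []
    last_as = asn_of(path[-1])
    end = len(path)
    while end > 0 and asn_of(path[end - 1]) == last_as:
        end -= 1
    return path[:end]


# Toy mapping for illustration only.
TOY_ASN = {
    "192.0.2.1": 64496,
    "192.0.2.2": 64496,
    "198.51.100.1": 64497,
    "198.51.100.2": 64497,
}

if __name__ == "__main__":
    trace = ["192.0.2.1", "192.0.2.2", "198.51.100.1", "198.51.100.2"]
    print(strip_last_hop_as(trace, TOY_ASN.get))
```

Comparing the trimmed paths of two traces to the same prefix then isolates the variance that arises before the leaf AS.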
Methodology
Adaptive Probing Methodology
Meta-Conclusion: adaptive probing a useful strategy
We develop three primitives:
1. Subnet Centric Probing
2. Vantage Point Spreading
3. Interface Set Cover
These primitives leverage adaptive sampling, external knowledge (e.g., common subnetting structure, BGP, etc.), and data from
prior cycles to maximize efficiency and information gain of each probe.
Methodology
Adaptive Probing Methodology
We develop three primitives:
1. Subnet Centric Probing
2. Vantage Point Spreading
3. Interface Set Cover
Best explained by understanding sources of path diversity:
[Figure: four vantage points reaching destinations D1, D2, and D3 via an AS ingress.]
Methodology
Subnet Centric Probing
Granularity vs. Scaling
∼ $2^{32} - 1$ possible destinations (2.9B from Jan 2010 routeviews)
What granularity? /24’s? Prefixes? AS’s?
Subnet Centric Probing
[Figure: a single vantage point reaching destinations D1, D2, and D3 through one AS ingress.]
From a single vantage point, no path diversity into the AS
Path diversity due to AS-internal structure
Methodology
Subnet Centric Probing
[Figure: a single vantage point reaching destinations D1, D2, and D3 through one AS ingress.]
Goal: adapt granularity, discover internal structure
Leverage BGP as coarse structure
Follow least common prefix: iteratively pick destinations within prefix that are maximally distant (in subnetting sense)
Address “distance” is misleading: e.g. 18.255.255.100 vs. 19.0.0.4 vs. 18.0.0.5
Stopping criterion: $\mathrm{ED}(t_i, t_{i+1}) \le \tau$; $\tau = 3$
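The address example and the stopping criterion above can be made concrete with a small sketch. The first two helpers contrast numeric distance with shared prefix length for the three addresses on the slide. The rest is one possible reading of "maximally distant (in subnetting sense)": visiting /24 blocks in bit-reversed order, so that successive destinations fall in different halves, then quarters, of the prefix. All function names, the bit-reversal ordering, the /24 granularity, and the `trace` and `distance` callables are assumptions made for illustration, not the authors' implementation.

```python
import ipaddress


def numeric_gap(a, b):
    """Absolute difference between two IPv4 addresses taken as integers."""
    return abs(int(ipaddress.ip_address(a)) - int(ipaddress.ip_address(b)))


def common_prefix_len(a, b):
    """Number of leading bits two IPv4 addresses share."""
    diff = int(ipaddress.ip_address(a)) ^ int(ipaddress.ip_address(b))
    return 32 - diff.bit_length()


def spread_destinations(prefix, granularity=24):
    """Yield one address per block, visiting blocks in bit-reversed order.

    Assumes the prefix length is <= granularity (hypothetical choice).
    """
    network = ipaddress.ip_network(prefix)
    bits = granularity - network.prefixlen
    block = 1 << (32 - granularity)
    base = int(network.network_address)
    for k in range(1 << bits):
        reversed_k = int(format(k, "b").zfill(bits)[::-1], 2) if bits else 0
        yield ipaddress.ip_address(base + reversed_k * block + 1)


def subnet_centric_probe(prefix, trace, distance, tau=3):
    """Probe spread-out destinations until ED(t_i, t_{i+1}) <= tau.

    `trace` (destination -> hop list) and `distance` (edit distance between
    two hop lists) are caller-supplied, hypothetical callables.
    """
    traces = []
    for destination in spread_destinations(prefix):
        traces.append(trace(destination))
        if len(traces) >= 2 and distance(traces[-2], traces[-1]) <= tau:
            break
    return traces


if __name__ == "__main__":
    # Numerically adjacent, yet on opposite sides of a /8 boundary.
    print(numeric_gap("18.255.255.100", "19.0.0.4"),
          common_prefix_len("18.255.255.100", "19.0.0.4"))
    # Numerically far apart, yet inside the same /8.
    print(numeric_gap("18.255.255.100", "18.0.0.5"),
          common_prefix_len("18.255.255.100", "18.0.0.5"))
    print([str(d) for d in spread_destinations("10.0.0.0/22")])
```

In this reading, a prefix whose traces stop changing is abandoned after two probes, while a prefix with rich internal structure keeps receiving probes at progressively finer granularity.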
Methodology
Subnet Centric Probing
[Figure (left): degree distribution, count (1 to 100000, log scale) vs. degree (1 to 1000, log scale). Series: Prefix Directed, AS Directed, Subnet Centric, Ark (Ground Truth).]
[Figure (right): difference from Ark ground truth (0.1 to 1) for vertices and edges. Series: Subnet-Centric, Prefix-Directed, AS-Directed.]
Inferred degree distribution well-approximates ground-truth
Captures ≥ 90% of the vertex and edge fidelity
Methodology
Subnet Centric Probing
[Figure (left): difference from Ark ground truth (0.1 to 1) for vertices and edges. Series: Subnet-Centric, Prefix-Directed, AS-Directed.]
[Figure (right): difference from Ark ground truth (0 to 0.7) for probes (load). Series: Subnet-Centric, Prefix-Directed, AS-Directed.]
Captures ≥ 90% of the vertex and edge fidelity
Using ∼60% of ground-truth load
Methodology
Vantage Point Spreading
Vantage Point Spreading
[Figure: four vantage points reaching destinations D1, D2, and D3 via an AS ingress.]
Discover AS ingress points and paths to the AS via multiple vantage points
Random assignment of destinations to vantage points is wasteful
E.g. empirically, the 16 /24’s in a /20 prefix are hit on average by 12 unique VPs
Methodology
Vantage Point Spreading
Vantage Point Spreading
[Figure: four vantage points reaching destinations D1, D2, and D3 via an AS ingress.]
Using BGP knowledge, maximize the number of distinct VPs per prefix
Note, this is complementary to SCP (Subnet Centric Probing)
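A minimal sketch of the spreading idea, assuming the simplest possible policy: walk the /24's of a BGP prefix in order and hand each one to the next vantage point in a list. The function name, the round-robin policy, and the VP labels are illustrative assumptions, not the scheduler used in the talk.

```python
import ipaddress


def spread_vantage_points(bgp_prefix, vantage_points):
    """Assign each /24 of a BGP prefix to a vantage point, cycling through
    the list so that a prefix with n /24s is probed by min(n, len(VPs))
    distinct vantage points (round-robin is an assumption of this sketch).
    """
    network = ipaddress.ip_network(bgp_prefix)
    assignment = {}
    for i, slash24 in enumerate(network.subnets(new_prefix=24)):
        assignment[slash24] = vantage_points[i % len(vantage_points)]
    return assignment


if __name__ == "__main__":
    vps = [f"vp{n}" for n in range(41)]  # hypothetical monitor names
    plan = spread_vantage_points("10.16.0.0/20", vps)
    print(len(plan), len(set(plan.values())))
```

With 41 vantage points, the 16 /24's of a /20 are each probed from a different vantage point in this sketch, compared with the roughly 12 unique VPs reported above for random assignment.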
Methodology
Vantage Point Spreading
[Figure (left): unique interfaces discovered (0 to 400) vs. number of VPs for destination (0 to 40). Series: VP Influence, y=10x.]
[Figure (right): vertices (0 to 12000) vs. size of prefix from which /24s are drawn (/22, /21, /20). VP selection: Spreading, Random, Single.]
Diminishing return of vantage point influence
Vertices in resulting graph as compared to random: ∼6% increase “for free.”
Methodology
Interface Set Cover
Interface Set Cover
As shown in preceding analysis, full traces very inefficient
Perform greedy minimum set cover approximation (NP-complete)
Select subset of prior round probe packets for current round
[Figure: four vantage points probing destinations D1, D2, and D3.]
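The greedy approximation named above can be sketched as follows, under the assumption that each prior-round trace is modeled as the set of interfaces it revealed and that whole traces are the units being selected. The slides speak of selecting probe packets, so the granularity here is a simplification, and the names and toy data are hypothetical.

```python
def greedy_set_cover(traces):
    """Greedy minimum set cover approximation over prior-round traces.

    `traces` maps a destination label to the set of interfaces seen on the
    path to it. Returns (destination, newly covered interfaces) pairs in
    selection order. Ties go to the lexicographically smallest destination.
    """
    uncovered = set().union(*traces.values())
    chosen = []
    while uncovered:
        # Pick the trace that covers the most still-uncovered interfaces.
        best = max(sorted(traces), key=lambda d: len(traces[d] & uncovered))
        gained = traces[best] & uncovered
        chosen.append((best, gained))
        uncovered -= gained
    return chosen


if __name__ == "__main__":
    prior_round = {
        "D1": {"a", "b", "c"},
        "D2": {"c", "d"},
        "D3": {"b", "c"},
    }
    for destination, gained in greedy_set_cover(prior_round):
        print(destination, sorted(gained))
```

In the toy input, D3 contributes nothing that D1 and D2 do not already cover, so it is not selected for the current round.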
Methodology
Interface Set Cover
Interface Set Cover
Generalizes DoubleTree [DRFC05] without parametrization
Efficient
Inherently multi-round
Additional probing for validation mismatches (e.g. load balancing, new paths)
[Figure: four vantage points probing destinations D1, D2, and D3.]
Methodology
Interface Set Cover
[Figure (left): fraction of missed interfaces (0 to 0.1) vs. cycle separation in days (1 to 11). Series: Full Trace SetCover, ISC.]
[Figure (right): load ratio to full traceroutes (0.1 to 0.7) vs. cycle separation in days (1 to 11). Series: Full Trace SetCover, ISC.]
20K random IP destinations each day over a two-week period; fraction of missing interfaces using ISC
Uses ≤ 20% of the full probingload (∼ 30% of full trace set cover)
Summary
Summary
Take-Aways:
Deconstructed Ark/iPlane topology tracing as case study
Developed primitives for faster, more efficient probing:
Subnet Centric Probing, Interface Set Cover, Vantage Point Spreading
Significant load savings without sacrificing fidelity
Future
Combining our primitives on production system
Refine ISC “change-driven” logic
Build a better Internet scope to detect small-scale dynamics
Thanks!
Questions?