Trends in Internet Measurement
Paul Barford
Assistant Professor
Computer Science
University of Wisconsin
Fall, 2004
wail.cs.wisc.edu 2
Motivation
• The Internet is gigantic, complex, and constantly evolving
– Began as something quite simple
• Infrequent use of “scientific method” in network research
– Historical artifact
– Lack of inherent measurement capability
– Decentralization and privacy concerns
• Recognition of importance of empirically-based research
– Critical trend over past five years (Internet Measurement Conf.)
• Good research hypothesis + good data + good analysis = good research results
– Focus of this talk: “good data” - where we’ve been and where we’re going
In the beginning…
• Measurement was part of the original ARPAnet in ’70
– Kleinrock’s Network Measurement Center at UCLA
– Resources in the network were reserved for measurement
• Formation of Network Measurement Group in ’72
– RFC 323: who is involved and what is important
• First network measurement publication in ’74
– “On Measured Behavior of the ARPA Network,” Kleinrock and Naylor
• No significant difference between operations and research
– Size kept things tractable
From ARPAnet to Internet
• In the 80’s, measurement-based publications increased
– “The Experimental Literature of the Internet: An Annotated Bibliography”, J. Mogul, ’88.
• RFC 1262: Guidelines for Internet Measurement Activities, 1991
– V. Cerf, “Measurement of the Internet is critical for future development, evolution and deployment planning.”
• What happened?
• “On the Self-Similar Nature of Ethernet Traffic”, Leland et al., ’94.
– Novel measurement combined with thorough analysis
– A transition point between operational and research measurement (?)
Gold in the streets in the 90’s
• Lots of juicy problems garnered much attention in the 90’s
– Transport, ATM, QoS, Multicast, Lookup scalability, etc.
• The rise of simulation (aaagggghhhhh!!!!)
• Measurement activity didn’t die…
– Research focus on Internet behavior and structure
  • Self-similarity established as an invariant in a series of studies
  • Paxson’s NPD study from ’93 to ’97
  • Routing (BGP) studies by Labovitz et al.
  • Structural properties (the Internet as a graph) by Govindan et al.
– Organizations focused on measurement
  • National Laboratory for Applied Network Research (’95)
  • Cooperative Association for Internet Data Analysis (’97)
Measurement must be hard…
• Well, not really…lots of folks are measuring the Internet
– See CAIDA or SLAC pages
– Businesses get paid to measure the Internet
• Lots of tools are available for Internet measurement
– See CAIDA and SLAC pages
– Dedicated hardware
– Public infrastructures
So, what’s the problem?
• “Strategies for Sound Internet Measurement,” Paxson ’04.
– Lack of consistent methods for measurement-based experiments
– Problems faced in other sciences years ago
• Issues of scale in every direction
– What is representative?
– HUGE, HIGH-DIMENSION data sets make things break
• Disconnect between measurements for operations and measurements for research
– Operational interests: SLAs, billing, privacy, …
– Research interests: network-wide properties
Current measurement trends
1. Open end host network measurement infrastructures
– Available for a variety of uses
2. Large public data repositories
– Domain specific
– Suitable for longitudinal studies
3. Network telescope monitors
– Malicious traffic
4. Laboratory-based testbeds
– Bench environments
5. Standard anonymization methods
– Address privacy concerns
End host infrastructures
• Paxson’s NPD study: an end-host prototype
– Accounts on 35 systems distributed throughout the Internet
– Active, end-to-end measurement focus
• National Internet Measurement Infrastructure (NIMI) and others evolved from NPD
– Perhaps a bit too ambitious at the time
• Today’s end host infrastructure “success story”: PlanetLab
PlanetLab overview
• Collaboration between Intel, Princeton, Berkeley, Washington, others starting in early ’02
• Began as a distributed, virtualized system project
– Peer-to-peer overlay systems were getting hot
– Applications BOF at SIGCOMM ’02 had only 6/80 people
• Systems were donated to an initial set of sites in ’02
– Most major universities and Abilene POPs
• Available to members who host systems
• Developers have done a fine job creating a management environment
– Isolates individual experiments from each other
PlanetLab sites
[Figure: PlanetLab sites (image not available)]
449 nodes at 209 sites (source: www.planet-lab.org)
End host infrastructures & SP
• End host infrastructures are primarily for active measurement
– Generate probes and measure responses
• Problem domains
– Network structure via tomography
– Network distance estimation
– End-to-end resource estimation
– End-to-end packet dynamics
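As an illustration of "generate probes and measure responses", here is a minimal sketch of an active end-to-end probe. It assumes that timing a TCP handshake is an acceptable stand-in for a round-trip-time probe; the function names `tcp_connect_rtt` and `probe` are hypothetical and are not part of any tool mentioned in these slides.

```python
import socket
import statistics
import time


def tcp_connect_rtt(host, port, timeout=2.0):
    """Time one TCP handshake to (host, port); returns seconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # handshake completed; close the connection right away
    return time.perf_counter() - start


def probe(host, port, count=5, timeout=2.0):
    """Send `count` handshake probes and summarize the responses."""
    rtts = []
    for _ in range(count):
        try:
            rtts.append(tcp_connect_rtt(host, port, timeout))
        except OSError:
            pass  # refused or timed out: counted as a lost probe
    return {
        "sent": count,
        "lost": count - len(rtts),
        "min_rtt": min(rtts) if rtts else None,
        "median_rtt": statistics.median(rtts) if rtts else None,
    }


if __name__ == "__main__":
    # Self-contained demo: probe a listener on the loopback interface.
    server = socket.socket()
    server.bind(("127.0.0.1", 0))
    server.listen(8)
    print(probe("127.0.0.1", server.getsockname()[1], count=3))
    server.close()
```

A real study would probe remote hosts from many vantage points and would need to account for clock resolution, rate limiting, and the load the probes themselves add.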
Large public data repositories
• First data repository - Internet Traffic Archive (LBL)
– Hodgepodge of traces from various projects
• Current projects are more focused
– Passive Measurement and Analysis Project
  • Packet traces from high performance monitors
– Abilene Observatory
  • Flow traces from the Internet2 backbone routers
– Route Views/RIPE
  • BGP routing updates from ~150 networks
– Datasets for network security
  • DHS project focused on making attack and intrusion data available for research
Data repositories & SP
• Most of the data in aforementioned repositories was gathered via passive means
– Counters/logs on devices
– Installed instrumentation
– Configuration to measure specific traffic (BGP)
• Problem domains
– Anomaly detection
– Traffic dynamics
– Routing dynamics
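To make the passive side concrete, the sketch below rolls a list of observed packets up into per-flow counters, roughly the kind of record a flow trace contains. The `Packet` record layout and the `aggregate_flows` helper are assumptions for illustration; real flow export also applies idle and active timeouts, which are left out here.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical packet header record from a passive monitor."""
    ts: float      # capture timestamp in seconds
    src: str
    dst: str
    sport: int
    dport: int
    proto: str
    length: int    # bytes on the wire


FlowKey = Tuple[str, str, int, int, str]


def aggregate_flows(packets: Iterable[Packet]) -> Dict[FlowKey, dict]:
    """Group packets by 5-tuple and keep packet/byte counts and time span."""
    flows: Dict[FlowKey, dict] = {}
    for p in packets:
        key = (p.src, p.dst, p.sport, p.dport, p.proto)
        flow = flows.get(key)
        if flow is None:
            flows[key] = {"packets": 1, "bytes": p.length,
                          "first": p.ts, "last": p.ts}
        else:
            flow["packets"] += 1
            flow["bytes"] += p.length
            flow["first"] = min(flow["first"], p.ts)
            flow["last"] = max(flow["last"], p.ts)
    return flows


trace = [
    Packet(0.00, "192.0.2.1", "198.51.100.5", 40000, 80, "tcp", 60),
    Packet(0.05, "192.0.2.1", "198.51.100.5", 40000, 80, "tcp", 1500),
    Packet(0.10, "192.0.2.2", "198.51.100.5", 40001, 53, "udp", 80),
]
for key, stats in aggregate_flows(trace).items():
    print(key, stats)
```

Per-flow counters like these are a typical starting point for the traffic-dynamics and anomaly-detection problems listed above.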
Network telescopes
• Simple observation 1: number of globally routed IP addresses ≠ number of live hosts
– Network address translation
– Networks (ranges of IP addresses) are routed
• Simple observation 2: traffic to/from standard services should only arrive at live hosts
– Misconfigurations and malicious traffic are the exceptions
• Network telescope = traffic monitor on routed but otherwise unused IP addresses
– This traffic is otherwise usually dropped at the border router
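The telescope definition above can be sketched as a filter: keep only packets whose destination lies inside a routed prefix but is not a known live host. This is a minimal illustration, assuming the monitor already has a list of live addresses; `telescope_view`, `top_ports`, and the documentation-range addresses are hypothetical.

```python
import ipaddress
from collections import Counter


def telescope_view(packets, routed_prefix, live_hosts):
    """Keep packets sent to routed-but-unused addresses in routed_prefix."""
    net = ipaddress.ip_network(routed_prefix)
    live = {ipaddress.ip_address(h) for h in live_hosts}
    seen = []
    for src, dst, dport in packets:
        addr = ipaddress.ip_address(dst)
        if addr in net and addr not in live:
            seen.append((src, dst, dport))
    return seen


def top_ports(packets, n=3):
    """Most frequently targeted destination ports."""
    return Counter(dport for _, _, dport in packets).most_common(n)


packets = [
    ("198.51.100.7", "192.0.2.10", 80),    # live web server: ignored
    ("198.51.100.7", "192.0.2.11", 80),    # unused address
    ("198.51.100.7", "192.0.2.12", 80),    # unused address
    ("203.0.113.9", "192.0.2.200", 139),   # unused address
    ("203.0.113.9", "192.0.2.10", 443),    # live host: ignored
    ("203.0.113.9", "198.51.100.1", 80),   # outside the monitored prefix
]
dark = telescope_view(packets, "192.0.2.0/24", live_hosts=["192.0.2.10"])
print(len(dark), top_ports(dark))
```

Because no legitimate service lives at the unused addresses, whatever survives the filter is scanning, backscatter, or misconfiguration.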
So, what’s the point?
• Bad guys don’t know which IP addresses in a network are live
– Random and systematic scanning commonly used
– Spoofed source addresses are used in DoS attacks
– Misconfigurations are fairly rare
• Ergo, network telescopes can provide important perspective on malicious traffic
– Most importantly, a clean signal
• Implementation is fairly simple
– Honeypots of live systems or honeypot-specific monitors
What do we see?
• “Characteristics of Internet Background Radiation,” Yegneswaran et al., ’04.
– Active monitors (small, medium, large) at 3 sites
• Traffic is dominated by activity on common services
– Worms and probes targeting HTTP and NetBIOS
– The focus of our study
• Traffic is highly variable and diverse
– Perspectives from 3 monitors are quite different
• Traffic mutates rapidly
• Much deeper analysis is necessary
Network telescopes & SP
• An emerging, rich source of data
• Network security is critically important
• Problem domains
– Outbreak and attack detection
– Collaborative monitoring
– Dynamic quarantine
– (Misconfiguration analysis)
Laboratory-based testbeds
• Most scientific disciplines commonly use bench environments to conduct research
– Control
– Instrumentation
– Repeatability
• Network research community has relied on analytic modeling, simulation and empirical measurement
• Openly available bench environments for network research are emerging
– EMULAB at Utah - collection of end hosts
– Wisconsin Advanced Internet Lab - collection of routers and end hosts
Laboratory testbeds & SP
• The effectiveness of bench research hinges on research hypothesis and experimental design
– Aspects of scale (emergent behavior) are difficult to capture
• Problem domains
– Inference tool analysis
– Protocol (implementation) analysis
– Anomaly detection
– Network system evaluation
Data anonymization
• Lots of people measure, most are scared s*!#less about sharing data
– This is a legal issue
– No standards (sure, payloads are off limits, but addresses?)
– Don’t ask, don’t tell?
• The community is developing tools for trace anonymization
– “A High-Level Programming Environment for Packet Trace Anonymization and Transformation,” Pang et al., ’03.
– Prefix preserving address anonymization
– Payload hashing
• Probably no direct SP application
– But, implications vis-à-vis future data availability
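The two techniques named above, prefix-preserving address anonymization and payload hashing, can be sketched in a few lines. This is not the tool from the cited paper: it is an illustrative bit-by-bit scheme in which HMAC-SHA256 is assumed as the keyed pseudorandom function, and the names `anonymize_ipv4`, `hash_payload`, and `common_prefix_len` are hypothetical.

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(addr: str, key: bytes) -> str:
    """Flip each address bit based on a keyed function of the bits before it."""
    bits = int(ipaddress.IPv4Address(addr))
    out = 0
    for i in range(32):
        prefix = bits >> (32 - i)                 # the i most significant bits
        msg = bytes([i]) + prefix.to_bytes(4, "big")
        flip = hmac.new(key, msg, hashlib.sha256).digest()[0] & 1
        bit = (bits >> (31 - i)) & 1
        out = (out << 1) | (bit ^ flip)
    return str(ipaddress.IPv4Address(out))


def hash_payload(payload: bytes, key: bytes) -> str:
    """Replace a payload with a keyed digest; equal payloads stay comparable."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses share."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


key = b"example-key-not-for-real-use"
a, b = "10.1.2.3", "10.1.200.7"
print(common_prefix_len(a, b),
      common_prefix_len(anonymize_ipv4(a, key), anonymize_ipv4(b, key)))
```

Since the flip applied to bit i depends only on the bits before it, two addresses that share a k-bit prefix map to anonymized addresses that share exactly a k-bit prefix, so subnet structure survives while the actual addresses do not.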
Summary
• Trends over past 30 years
– Divergence of research and operations
– Decline of importance of measurement in research
– Empirical study of the Internet as an artifact
• Current trends
– Rise of measurement as a discipline
– Open infrastructures and network testbeds
– Large-scale domain specific data repositories
– Novel measurement methods
• Future trends
– ??
– Embedded measurement systems