MRemu: An Emulation-based Framework for Datacenter Network Experimentation using Realistic MapReduce Traffic
Marcelo Veiga Neves (1), Cesar A. F. De Rose (1), Kostas Katrinis (2)
(1) PUCRS, Porto Alegre, Brazil; (2) IBM Research, Dublin, Ireland
Oct, 2015
Context
• Big Data & MapReduce analytics frameworks
– Scale out to hundreds or even thousands of commodity servers
– Increased network traffic volumes and multiplicity of traffic patterns
• Data center networks for Big Data
– Scale-out topologies (e.g., fat-tree, leaf-spine)
– Network control software (e.g., SDN – IPDPS’15)
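To give a sense of the scale that a scale-out topology implies, the sketch below computes element counts for a standard k-ary fat-tree (k pods, k/2 edge and k/2 aggregation switches per pod, (k/2)^2 core switches). The function name `fat_tree_size` is only illustrative and is not part of MRemu.

```python
def fat_tree_size(k: int) -> dict:
    """Return element counts for a k-ary fat-tree (k must be even)."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even integer >= 2")
    half = k // 2
    edge = k * half          # k pods, k/2 edge switches per pod
    aggregation = k * half   # k pods, k/2 aggregation switches per pod
    core = half * half       # (k/2)^2 core switches
    hosts = edge * half      # each edge switch connects k/2 hosts
    return {
        "hosts": hosts,
        "edge": edge,
        "aggregation": aggregation,
        "core": core,
        "switches": edge + aggregation + core,
    }
```

With 48-port switches (k = 48), this formula yields 27,648 hosts, which illustrates why evaluating such topologies on real hardware is rarely feasible.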
Problem
• Relying on a real hardware infrastructure
– is often not a valid option
– even when datacenter resources are available
  • access is not continuous
  • it is not practical to reconfigure them in order to evaluate different network topologies and characteristics (e.g., bandwidth and latency)
• Alternatives:
– Simulation & Emulation
Problem
• Most research on data center networks does not use realistic Big Data traffic
– synthetic traffic patterns
• Simplified shuffle-like traffic patterns
– e.g., all-to-all
– do not consider transfer scheduling decisions, number of parallel transfers, etc.
– nor the overlap of communication with computation
• How do the reported results translate into performance improvements for actual analytics runtimes?
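The difference between a synthetic all-to-all pattern and a scheduled shuffle can be sketched as follows. This is a simplification under stated assumptions: `parallel_copies` is a hypothetical parameter standing in for the limit on concurrent fetches per reducer, and real frameworks do not fetch in strict rounds as this model does.

```python
from itertools import product


def all_to_all(mappers, reducers):
    """Synthetic pattern: every (mapper, reducer) pair transfers at once."""
    return list(product(mappers, reducers))


def scheduled_shuffle(mappers, reducers, parallel_copies):
    """Group each reducer's fetches into rounds of at most
    `parallel_copies` concurrent transfers (hypothetical model)."""
    if parallel_copies < 1:
        raise ValueError("parallel_copies must be >= 1")
    return {
        r: [mappers[i:i + parallel_copies]
            for i in range(0, len(mappers), parallel_copies)]
        for r in reducers
    }
```

In the all-to-all case the network sees every flow simultaneously; in the scheduled case each reducer has at most `parallel_copies` flows in flight, which changes the load that a network control mechanism actually observes.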
Network Traffic in real Hadoop Applications
[Figure: two panels, each labeled "Network transfers"]
Proposed solution: MRemu
• Emulation-based framework for data center network experimentation
• Highlights:
– Ability to run a complete data center in a single server
– Use of realistic network traffic
  • replay or extrapolate from executions of real applications in production datacenters
– Mimics framework internals (e.g., transfer scheduling, phase overlaps, etc.)
– Unmodified code also runs on real hardware
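The idea of replaying traffic from a recorded job execution can be sketched as below. The JSON trace layout (`start`, `src`, `dst`, `bytes`) and the names `Transfer`, `parse_trace` and `replay_plan` are assumptions made for illustration; they are not MRemu's actual trace format or API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    start: float  # seconds since job start (assumed field)
    src: str      # host that served the map output
    dst: str      # host that ran the reducer
    size: int     # bytes transferred


def parse_trace(text):
    """Parse a hypothetical JSON trace and order transfers by start time."""
    records = json.loads(text)
    transfers = [
        Transfer(float(r["start"]), r["src"], r["dst"], int(r["bytes"]))
        for r in records
    ]
    return sorted(transfers, key=lambda t: t.start)


def replay_plan(transfers, host_map):
    """Map traced hosts onto emulated hosts, keeping the traced order."""
    return [
        (t.start, host_map[t.src], host_map[t.dst], t.size)
        for t in transfers
    ]
```

A traffic generator would then walk the plan in order and start each transfer between the mapped emulated hosts at its recorded offset.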
MRemu Architecture
[Figure: MRemu architecture. Diagram labels:]
• Job Trace
• Hadoop Job Tracing
• Synthetic Job Generator
• Topology Description
• Data center emulator: Mininet-HiFi, Topology Builder, Application Launcher, Network Monitor
• Hadoop MapReduce emulator: JobTracker, TaskTracker, Job Trace Parser, Traffic Generator, Logger
* Mininet can be replaced with real hardware (e.g., to run on legacy clusters)
Evaluation
• Mininet-HiFi has already been validated and is widely used to reproduce networking research experiments
• Accuracy when reproducing MapReduce workloads
– Comparison with traces extracted from real job executions
– Two operation modes: replay mode and hadoop mode
• Execution environment:
– Shamrock datacenter, IBM Research
– HiBench Benchmark Suite: Sort, Nutch, PageRank and Bayes
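One simple way to quantify accuracy against real traces is the relative error between emulated and real completion times. The sketch below is a generic metric and is not claimed to be the exact one used in this evaluation; the function names are illustrative.

```python
def relative_error(measured: float, reference: float) -> float:
    """Relative error of an emulated time against the real (reference) time."""
    if reference <= 0:
        raise ValueError("reference time must be positive")
    return abs(measured - reference) / reference


def mean_relative_error(pairs):
    """Average relative error over (emulated, real) completion-time pairs."""
    errors = [relative_error(m, r) for m, r in pairs]
    if not errors:
        raise ValueError("at least one pair is required")
    return sum(errors) / len(errors)
```

The same functions apply to job completion times and to individual flow completion times, as long as each emulated value is paired with its real counterpart.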
Handigol, N.; Heller, B.; Jeyakumar, V.; Lantz, B.; McKeown, N. “Reproducible Network Experiments Using Container-Based Emulation”. In: Proceedings of the 8th International Conference on Emerging Networking Experiments and Technologies, 2012, pp. 253–264.
Accuracy Evaluation
[Figure panels: Job Completion Time Accuracy; Individual Flow Completion Time Accuracy]
Nutch application with background traffic
Sort application with background traffic
Partition skew problem
Impact of the network topology
Other experiments
Conclusion and Future Work
• MRemu, an emulation-based framework that enables conducting datacenter network research
– without requiring expensive and continuous access to large-scale datacenter hardware resources
• Available as open source:
– https://github.com/mvneves/mremu
• Future work:
– Extend it to other frameworks and traffic patterns
– Integrate it with Mininet-HiFi cluster edition
– Support for migration of “virtual machines”
References
• NEVES, M. V. Application-aware Networking to Accelerate MapReduce Applications (Ph.D. Dissertation), 2015.
• NEVES, M. V.; KATRINIS, M. K.; FRANKE, H.; DE ROSE, C. A. F. Pythia: Faster Big Data in Motion through Predictive Software-Defined Network Optimization at Runtime. In: IPDPS 2014, Phoenix, USA, 2014.
• NEVES, M. V.; DE ROSE, C. A. F.; KATRINIS, K. MRemu: An Emulation-based Framework for Datacenter Network Experimentation using Realistic MapReduce Traffic. MASCOTS 2015, Atlanta, USA, 2015.