The Emulex Advanced Development Organization offers an in-depth analysis of how Emulex OneConnect Adapters quadruple the performance over 1GbE networks for Hadoop cluster environments, addressing the 'Big Data' performance needs of cloud providers and users. Traditional 1GbE networks have not kept pace with the growth of Big Data – Emulex offers an ideal solution.
Hadoop Webcast
Boosting Hadoop Performance with Emulex OneConnect® 10Gb Ethernet Adapters
© 2011 Emulex Corporation
Agenda
Digital Content, today and tomorrow
What is Big Data?
Information as an Asset
A Solution to the Problem
The Moving Bottleneck
Hadoop on 10GbE
Testing Configurations and Objectives
Testing Results
Comparison Analysis – The Tale of the Tape
Q&A
Digital Content – Big Data’s Singularity
[Chart: A Decade of Digital Universe Growth – storage in exabytes, 2005–2015]
Sources of growth:
– Consumer participation
– Photo and video archiving
– eCommerce
– Social media
– Social networking
– Mobile applications
– Search engine indexing
– Web logs
– Medical records
– Financial transactions
– Scientific research
– Surveillance
Source: IDC's Digital Universe Study, sponsored by EMC, June 2011
What is Big Data?
Collections of data exceeding the capabilities of traditional database management tools…
– with dynamic, incremental data created around the data preceding it
– scaling with advances in technology
– from a growing number of sources
Think Big Bang theory…
– but on the order of bytes
Spawning an entire ecosystem of new technologies and services
– Powerful
– Dynamic
– Scalable
Tapping into Information as an Asset
Organizations actively analyze data rather than just store it
BIG DATA – Increased Velocity, Larger Volume, Greater Variety
Actionable Data – Competitive Differentiation – Unlocking Value
A Solution to the Problem – Hadoop
A powerful, fault-tolerant, self-healing open source platform, allowing for distributed computing on commodity clusters
Scaling to thousands of compute nodes, and efficiently managing petabytes of data
Leverages two key pieces of technology:
– Hadoop Distributed File System (HDFS)
– Hadoop MapReduce
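The division of labor between these two pieces can be sketched in miniature. Below is an illustrative pure-Python word count that mimics the map / shuffle / reduce flow; it is a teaching sketch, not Hadoop's actual Java API.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit (word, 1) pairs, as a Hadoop mapper would per input split."""
    for record in records:
        for word in record.split():
            yield (word, 1)

def reduce_phase(pairs):
    """Shuffle + reduce: group pairs by key, then sum each group's counts."""
    groups = defaultdict(int)
    for key, value in pairs:
        groups[key] += value
    return dict(groups)

lines = ["big data big clusters", "big data"]
counts = reduce_phase(map_phase(lines))
print(counts)  # {'big': 3, 'data': 2, 'clusters': 1}
```

In real Hadoop, the map and reduce tasks run on separate DataNodes and the shuffle moves intermediate pairs across the network, which is exactly where network bandwidth starts to matter.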
Capable of being deployed alongside legacy systems
Enabling old and new data to be combined in powerful ways
Accessed by data-intensive applications
Artem Gavrilov
Senior Architect, Advanced Development Organization
The Moving Bottleneck in Hadoop Clusters
Designed to run on 1GbE performance characteristics:
– Ubiquity
– Availability
– Cost
Today’s commodity servers deliver astounding performance gains over their predecessors
Multi-core, multi-threaded processors, fast DDR and expanded memory space, and faster, larger internal system drives have moved the bottleneck to the legacy 1GbE network
Performance characteristics available on today’s servers:
– Processor (4 cores, 8 threads): 25.6GB/s max. memory bandwidth
– PCIe 3.0 bus: 8GT/s bit rate
– DDR4 memory modules: up to 3,200 MT/s
– Storage: SSDs capable of 6Gb/s; SATA drives capable of 600MB/s
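The mismatch is easy to quantify. A quick back-of-the-envelope sketch converts the quoted bandwidths to common units (the ~100 MB/s per-disk figure is taken from the Q&A later in this deck):

```python
# Subsystem bandwidths converted to GB/s.
# Network: divide the Gb/s line rate by 8 bits per byte.
bandwidths_gb_per_s = {
    "memory (quad-core Xeon)": 25.6,
    "storage (6 disks x ~100 MB/s)": 0.6,
    "1GbE NIC": 1 / 8,     # 0.125 GB/s
    "10GbE NIC": 10 / 8,   # 1.25 GB/s
}
for name, bw in sorted(bandwidths_gb_per_s.items(), key=lambda kv: kv[1]):
    print(f"{name:30s} {bw:7.3f} GB/s")

# A 6-disk node can source ~4.8x more data than a 1GbE link can carry,
# while a single 10GbE port exceeds the 6-disk subsystem's bandwidth.
print(round(0.6 / (1 / 8), 1))  # 4.8
```

This arithmetic is the whole argument of the slide: every subsystem except the 1GbE link has grown, so the network is now the slowest component.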
Hadoop Cluster Hardware – Then and Now
4 Processor Generations
DDR2 to DDR3 Transition
Higher Density Drives & SSDs
No Change – 1GbE
Hadoop on 10GbE
Network I/O performance must scale with the increase in…
– Processing power
– Memory capacity
– Storage performance
Network performance is essential to support larger and faster systems
Migrating from a 1GbE to a 10GbE network, leveraging Emulex OneConnect adapters resulted in a massive performance gain
Fine Tuning Hadoop
Hadoop workloads vary greatly
– No “one size fits all” approach
– 200+ cluster-wide and job-specific parameters that can be fine-tuned
With the workload variety comes a disparity in the distribution of resource demands, which can be classed as:
CPU Intensive
– Machine learning
– Complex data/text mining
– Natural language processing
– Feature extraction
I/O Intensive
– Indexing
– Searching
– Grouping
– Decoding/decompressing
– Data importing/exporting
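The deck does not enumerate the 200+ tunables, but to make the idea concrete, a Hadoop 1.x / CDH-era configuration fragment might adjust parameters like these. The keys are standard Hadoop settings of that era; the values are illustrative assumptions, not recommendations from this webcast.

```xml
<!-- hdfs-site.xml (illustrative values only) -->
<property>
  <name>dfs.block.size</name>
  <!-- 128 MB blocks: fewer, larger sequential streams per 'put' -->
  <value>134217728</value>
</property>

<!-- core-site.xml (illustrative values only) -->
<property>
  <name>io.file.buffer.size</name>
  <!-- larger buffers for sequential I/O -->
  <value>65536</value>
</property>
```

Whether settings like these help depends on whether a given workload is CPU- or I/O-bound, which is the point of the classification above.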
The Setup
Servers:
– HP ML350 G6
  • Dual quad-core Xeon 2GHz
  • 16 GB DDR3
  • Broadcom 1GbE BCM5715
  • Emulex OneConnect 10GbE OCe11102 Ethernet Adapter
OS and Software:
– Ubuntu 64-bit
– Hadoop (Cloudera Distribution)
Storage:
– SATA II 500GB 7200rpm disk drives, 6 per node
– HP Smart Array G6 RAID controller (JBOD – no RAID configured)
Cluster Configuration:
– 15 servers with discrete roles
  • 1 NameNode
  • 11 DataNodes
  • 3 Clients
– 1GbE and 10GbE switches
The Setup
[Diagram: Clients 1–3, the NameNode, and DataNodes 1–11, each connected to both a 1Gb switch and a 10Gb switch]
Test Objective
Measure HDFS throughput ingesting data into a Hadoop cluster
– Examining multiple client configurations
– Raising HDFS ‘put’ operations per client
– Transferring a constant 5GB file
– Replication factor set to three
– Duplicated for 1GbE and 10GbE networks
Clients | DataNodes | ‘Put’ Operations per Client | Total Operations
   1    |    11     |       1, 2, 4, 6, 8         |  1, 2, 4, 6, 8
   2    |    11     |       1, 2, 4, 6, 8         |  2, 4, 8, 12, 16
   3    |    11     |       1, 2, 4, 6, 8         |  3, 6, 12, 18, 24
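With the replication factor set to three, every 5GB ‘put’ generates roughly three times that in write traffic across the DataNodes, so aggregate data moved grows quickly with the operation count. A quick sanity check (the 3-client, 6-put case matches the 270GB data size reported in the comparison slides):

```python
# Cluster-side write traffic per test run: each 'put' ships a 5 GB file,
# and HDFS writes 3 replicas of every block (replication factor 3).
FILE_GB = 5
REPLICATION = 3

def traffic_gb(clients, puts_per_client):
    total_ops = clients * puts_per_client
    return total_ops, total_ops * FILE_GB * REPLICATION

ops, gb = traffic_gb(clients=3, puts_per_client=6)
print(ops, gb)  # 18 operations -> 270 GB written across the DataNodes
```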
Data Import – Single Client, Single ‘Put’ Operation
Test Results – Legacy 1GbE
A single client, running a single operation makes maximal use of the network
HDFS efficiently transfers data to DataNodes within the cluster, averaging 108MBps out of the client server
[Chart: throughput (MBps) vs. time (sec), single ‘put’ operation]
Data Import – Single Client, Multiple ‘Put’ Operations
Test Results – Legacy 1GbE
When more than one ‘put’ operation runs on a client, the 1GbE network becomes the bottleneck
Increasing the number of operations did not increase client throughput – restricted by the network connection
[Chart: throughput (MBps) vs. time (sec) for 1, 4, and 8 operations]
Data Import – Multiple Clients, Multiple ‘Put’ Operations
Test Results – Legacy 1GbE
Expected to observe throughput scale with additional clients
Combined In and Out traffic averaged 225MBps
[Chart: combined throughput (MBps) vs. time (sec) for 1, 4, and 8 operations]
Data Import – Multiple Clients, Multiple ‘Put’ Operations
Test Results – Legacy 1GbE
[Chart: throughput (MBps) vs. time (sec) for 1, 4, and 8 operations]
As network load increases, 1GbE quickly reaches saturation and becomes the system bottleneck
Data Import – Single Client, Single ‘Put’ Operation
Immediate performance improvement of 50% compared to the 1GbE network
Data transfer completed in less than three quarters of the time
Test Results – Emulex OneConnect 10GbE
[Chart: throughput (MBps) vs. time (sec), 1GbE vs. 10GbE]
Data Import – Single Client, Multiple ‘Put’ Operations
Increased network load is met with increased throughput
Achieved transfer rates of 800MBps, nearly 8X the observed throughput of the 1GbE configuration
Test Results – Emulex OneConnect 10GbE
[Chart: throughput (MBps) vs. time (sec) for 1, 4, and 8 operations]
Data Import – Multiple Clients, Multiple ‘Put’ Operations
Throughput scales with additional clients being brought on-line
The 10GbE network does not limit transfer rates as the clients and their operations increase
Test Results – Emulex OneConnect 10GbE
[Chart: throughput (MBps) vs. time (sec) for 1, 4, and 8 operations]
Maximum Throughput Achieved
Tale of the Tape – 1GbE vs 10GbE
[Chart: throughput (MBps) vs. time (sec), 1G vs. 10G]
Clients: 3 | DataNodes: 11 | ‘Put’ Operations per Client: 6 | Total Operations: 18 | Data Size: 270GB
1GbE Max MBps: 250
10GbE Max MBps: 1,674 (6.7X faster)
Average Throughput Achieved
Tale of the Tape – 1GbE vs 10GbE
Clients: 3 | DataNodes: 11 | ‘Put’ Operations per Client: 6 | Total Operations: 18 | Data Size: 270GB
1GbE Avg MBps: 216
10GbE Avg MBps: 831 (3.85X faster)
[Chart: average throughput (MBps) vs. number of ‘put’ operations (1–18), 1G vs. 10G]
~4X throughput enables more efficient real-time analysis
Time to Completion (seconds)
Tale of the Tape – 1GbE vs 10GbE
Clients: 3 | DataNodes: 11 | ‘Put’ Operations per Client: 6 | Total Operations: 18 | Data Size: 270GB
1GbE Completion: 453 sec
10GbE Completion: 115 sec (3.94X faster)
[Chart: time to completion (sec) vs. number of ‘put’ operations (1–18), 1G vs. 10G]
Load times reduced by 75%, improving batch analysis
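As a cross-check, the speedup figures quoted across the three comparison slides follow directly from the reported raw numbers:

```python
# Speedups on the "Tale of the Tape" slides, recomputed from the raw values.
max_1g, max_10g = 250, 1674   # peak MBps
avg_1g, avg_10g = 216, 831    # average MBps
t_1g, t_10g = 453, 115        # seconds to completion

print(round(max_10g / max_1g, 1))        # 6.7x peak throughput
print(round(avg_10g / avg_1g, 2))        # 3.85x average throughput
print(round(t_1g / t_10g, 2))            # 3.94x faster completion
print(round(100 * (1 - t_10g / t_1g)))   # ~75% load-time reduction
```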
Key Takeaways
Hadoop runs faster with 10G
– Up to 8 times faster in some scenarios
Fine-tuning parameters is important for performance
– Improvements may not be possible without proper configuration
Future performance gains are possible
– Hadoop was designed for 1GbE, but small changes will enable the full potential of 10GbE
Hadoop is better with Emulex OneConnect Ethernet Adapters
– “It just works” – right out of the box
– Leverage our expertise to configure your Hadoop installation for maximum performance
Questions
Which 1GbE and 10GbE switches were included in our tests? And would we see better performance with a switch that had lower latency?
We used several different models of Cisco switches, each with different latency attributes. We found that latency did not impact throughput in a significant way: in one case, moving to a switch with double the latency performance, we witnessed only a roughly 1% increase in throughput. Within the construct of our tests, latency was not critical to the performance results.
Did we find the network being the bottleneck prior to the disk subsystem becoming the bottleneck?
Yes, and it comes through in our graphs. It’s important to note that at the beginning of our tests we encountered some disk performance bottlenecks due to configuration issues, which proves it is essential to understand your Hadoop cluster’s configuration settings in order to tap the full potential of your disks. A commodity disk typically delivers about 100MBps; typical environments have six disks per node, totaling 600MBps in performance potential. In some cases disk operations are not actually needed – data is moved from memory to memory – but in most cases data is moved from disk to disk on different machines, and in those cases disk performance is important. In our test cases, however, disk performance was not a bottleneck.
How many 1GbE NICs were used? Were multiple 1GbE NICs bridged together, or just a single 10GbE NIC?
Our configuration used a single 1GbE NIC with two ports, which is the typical commodity server configuration. Theoretically, you can install multiple cards and get better performance, but that is a more difficult proposition and would cost more than a single 10GbE NIC – aside from the fact that there likely would not be enough slots on the motherboard to accommodate that many cards.
What is the maximum throughput of 10GbE?
10GbE maximum throughput is 1.25GB/s for single-direction data transfer; aggregated with receive traffic, the maximum is 2.5GB/s. Hadoop is not yet designed to accommodate this speed; hopefully, it will be there soon. It’s important to mention that most 10GbE solutions today come with two ports, which means you can achieve up to 5GB/s. Of course, to leverage that performance you need a disk subsystem that operates close to that level – in cases where two 10GbE ports are used, on the order of 12 high-performance disks. Today that is not necessary, because Hadoop does not use the network efficiently, so even with six disks you will see a significant performance gain.
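The arithmetic behind this answer, as a quick sketch (assuming the ~100 MB/s commodity-disk figure quoted earlier in this Q&A):

```python
# 10GbE line-rate arithmetic: 10 Gb/s divided by 8 bits per byte.
one_port_gbs = 10 / 8              # 1.25 GB/s per direction
bidirectional = 2 * one_port_gbs   # 2.5 GB/s send + receive on one port
dual_port = 2 * bidirectional      # 5.0 GB/s for a two-port adapter

# Disks needed to keep one port busy, at ~100 MB/s per commodity disk:
disks_per_port = one_port_gbs * 1000 / 100   # ~12-13 disks
print(one_port_gbs, dual_port, disks_per_port)  # 1.25 5.0 12.5
```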
Do we have a list of the parameters that need to be tuned within Hadoop in order to maximize the performance of our 10GbE NICs?
The settings will vary depending on the environment. There isn’t a one-size-fits-all approach. Some of these parameters have been published in our white paper, and we will review that paper to ensure that all of those parameters are addressed.
Are these results comparable to other 10GbE NICs or is this something unique to the Emulex technology portfolio?
We included multiple cards from our competitors in this research project. Emulex cards did offer a performance advantage over the competition – approximately 10%. The more important observation was that competitors’ cards were more prone to failures: servers stopped responding, systems needed reboots, and so on. Emulex cards were far more reliable across the board, which we believe is more important than fractional performance gains.
If the tests did not saturate the bandwidth of a 1GbE link, is the performance increase with 10GbE attributable to the “bursty” nature of the transfer itself?
Hadoop is not optimized for networking, which is why there are some odd observations from time to time. Even on 1GbE connections it is sometimes possible not to reach 50% of maximum throughput – a byproduct of its design. Hadoop was designed to run multiple jobs and operations, and in those instances these performance issues do not manifest themselves.
Would a round-robin bonding configuration be possible with 10GbE, and would there be a performance gain from that?
Theoretically, it is possible. Practically, it is unlikely to help, because the underlying disk system becomes the bottleneck (for the moment). If SSDs, or more than six disks, are being used, there is potential for a performance improvement.
Have we run tests with SSDs, higher RPM spindles, or larger spindle configurations?
Yes, we did, and we encountered some interesting results. While we saw improvements of approximately 40%, we had anticipated much better results with SSDs. The biggest issue with SSDs is the way Hadoop interfaces with them – it does not tap into the full potential of the disk. Ultimately, we concluded that throughput is the most important factor for performance, not necessarily I/O.
Thank You…