NoSQL Now! Aug 21, 2013 Ben Wen, Joyent Renat Khasanshyn, Altoros

How to Increase Performance of Your Hadoop Cluster

DESCRIPTION

A presentation made by Altoros and Joyent together at the NoSQL Now! 2013 conference.

Page 1: How to Increase Performance of Your Hadoop Cluster

NoSQL Now!

Aug 21, 2013

Ben Wen, Joyent

Renat Khasanshyn, Altoros

Page 2: How to Increase Performance of Your Hadoop Cluster

About Joyent

The high-performance public cloud infrastructure provider

Cloud IaaS Virtual Machines: Linux, Windows, BSD, SmartOS (fka Solaris) with Zones

Core founding sponsors of Node.js

Four global datacenters

Key markets: Big data, mobile, e-commerce, financial services, SaaS

Open Source contributions: Node.js, KVM, DTrace, ZFS, SmartOS

Page 3: How to Increase Performance of Your Hadoop Cluster

Running on bare metal is only practical for some organizations

Performance varies significantly across various job types

In fact, for many jobs, less = more

Utilization of most clusters in production is low

Optimizing Hadoop/MapReduce performance is hard

Page 4: How to Increase Performance of Your Hadoop Cluster

Get upset when truth comes out!

Biased (to the shiny side of the coin)

Often add controversy and confusion

Page 5: How to Increase Performance of Your Hadoop Cluster

- For Hadoop, what is the impact of container-based virtualization vs. hardware emulation (KVM)?

- What are the Hadoop optimization strategies? Is there a “rule of thumb” when it comes to determining the optimization approach?

- What are the optimal Hadoop cluster settings for the 1TB TeraSort benchmark on 100- and 400-node clusters running Linux and SmartOS on the Joyent Public Cloud?

Page 6: How to Increase Performance of Your Hadoop Cluster

Physical (disks, cpu, network)

OS/Hypervisor (especially for virtualized environments)

Hadoop/MapReduce (tons of settings)

Algorithmic (data structures, join strategies, big-O…)

Implementation (code efficiency, architecture decisions that fit all other factors)

Page 7: How to Increase Performance of Your Hadoop Cluster

SmartOS: an open source Unix operating system for the cloud, based on illumos, the active fork of OpenSolaris technology. It uses containerized OS virtualization called Zones (think of a mature LXC with secure RBAC and auditing).

Ubuntu: an operating system based on the Debian Linux distribution and distributed as free and open source software.

Apache Hadoop is an open-source software framework that supports data-intensive distributed applications, licensed under the Apache v2 license. Derived from Google's MapReduce and Google File System (GFS) papers, Hadoop enables applications to work with thousands of computation-independent computers and petabytes of data.

Page 8: How to Increase Performance of Your Hadoop Cluster

Written by Opscode and released as open source under the Apache License 2.0, Chef is a DevOps tool used for configuring cloud services or for streamlining the task of configuring a company's internal servers. Chef automatically sets up and tweaks the operating systems and programs that run in massive data centers.

Developed by the creators of the Starfish project at Duke University, Unravel provides run-time profiling of Hadoop jobs followed by cost-based database query optimization. Unravel connects to streams of Hadoop and system instrumentation data, and applies statistical machine learning to optimize the cost of Hadoop jobs and increase cluster utilization.

Page 9: How to Increase Performance of Your Hadoop Cluster

Comparing I/O Path on Bare Metal Unix vs. Zones vs. KVM

OS Virtualization (Zones):

• Code path is essentially the same as bare metal

• Zones partition at the OS level

• Performance is higher

Kernel Virtualization (KVM):

• KVM is encapsulated by hypervisor

• Code path is much more circuitous in a KVM process

• Performance is impacted

Diagram columns: Bare-metal, OS Virtualization, Kernel Virtualization

Page 10: How to Increase Performance of Your Hadoop Cluster

No overhead for Zones: stack traces show how a network packet is transmitted from Bare Metal vs Joyent Zone (aka SmartMachine) vs Fedora VM on KVM.

Frames listed only for Fedora VM on KVM (from Start):

1 kernel`start_xmit
2 kernel`dtrace_int3_handler+0xd2
3 kernel`kmem_cache_free+0x2f
4 kernel`dtrace_int3+0x3a
5 kernel`eth_header
6 kernel`__kfree_skb+0x47
7 kernel`start_xmit+0x1
8 kernel`dev_hard_start_xmit+0x322
9 kernel`sch_direct_xmit+0xef
10 kernel`dev_queue_xmit+0x184
11 kernel`eth_header+0x3a
12 kernel`neigh_resolve_output+0x11e
13 kernel`nf_hook_slow+0x75
14 kernel`ip_finish_output
15 kernel`ip_finish_output+0x17e
16 kernel`ip_output+0x98
17 kernel`__ip_local_out+0xa4
18 kernel`ip_local_out+0x29
19 kernel`ip_queue_xmit+0x14f
20 kernel`tcp_transmit_skb+0x3e4
21 kernel`__kmalloc_node_track_caller+0x185
22 kernel`sk_stream_alloc_skb+0x41
23 kernel`tcp_write_xmit+0xf7
24 kernel`__alloc_skb+0x8c
25 kernel`__tcp_push_pending_frames+0x26
26 kernel`tcp_sendmsg+0x895
27 kernel`inet_sendmsg+0x64
28 kernel`sock_aio_write+0x13a
29 kernel`do_sync_write+0xd2
30 kernel`security_file_permission+0x2c
31 kernel`rw_verify_area+0x61
32 kernel`vfs_write+0x16d
33 kernel`sys_write+0x4a
34 kernel`sys_rt_sigprocmask+0x84
35 kernel`system_call_fastpath+0x16
36 igb`igb_tx_ring_send+0x33
37 mac`mac_hwring_tx+0x1d
38 mac`mac_tx_send+0x5dc
39 mac`mac_tx_single_ring_mode+0x6e

Frames listed for all three (columns, left to right: Bare Metal, Joyent Zone, Fedora VM on KVM):

mac`mac_tx+0xda mac`mac_tx+0xda mac`mac_tx+0xda
dld`str_mdata_fastpath_put+0x53 dld`str_mdata_fastpath_put+0x53 dld`str_mdata_fastpath_put+0x53
ip`ip_xmit+0x82d ip`ip_xmit+0x82d ip`ip_xmit+0x82d
ip`ire_send_wire_v4+0x3e9 ip`ire_send_wire_v4+0x3e9 ip`ire_send_wire_v4+0x3e9
ip`conn_ip_output+0x190 ip`conn_ip_output+0x190 ip`conn_ip_output+0x190
ip`tcp_send_data+0x59 ip`tcp_send_data+0x59 ip`tcp_send_data+0x59
ip`tcp_output+0x58c ip`tcp_output+0x58c ip`tcp_output+0x58c
ip`squeue_enter+0x426 ip`squeue_enter+0x426 ip`squeue_enter+0x426
ip`tcp_sendmsg+0x14f ip`tcp_sendmsg+0x14f ip`tcp_sendmsg+0x14f
sockfs`so_sendmsg+0x26b sockfs`so_sendmsg+0x26b sockfs`so_sendmsg+0x26b
sockfs`socket_sendmsg+0x48 sockfs`socket_sendmsg+0x48 sockfs`socket_sendmsg+0x48
sockfs`socket_vop_write+0x6c sockfs`socket_vop_write+0x6c sockfs`socket_vop_write+0x6c
genunix`fop_write+0x8b genunix`fop_write+0x8b genunix`fop_write+0x8b
genunix`write+0x250 genunix`write+0x250 genunix`write+0x250
genunix`write32+0x1e genunix`write32+0x1e genunix`write32+0x1e
unix`_sys_sysenter_post_swapgs+0x14 unix`_sys_sysenter_post_swapgs+0x14 unix`_sys_sysenter_post_swapgs+0x149

Skips stepping through 39 functions required when Fedora is running on KVM/qemu.

Note that a Joyent Zone is exactly the same as “Bare Metal”.

Page 11: How to Increase Performance of Your Hadoop Cluster

Three identical Apache Hadoop 1.0.4 clusters were provisioned on Joyent infrastructure using the Joyent REST API and Opscode Chef.

Each cluster was tweaked for optimal performance following best practices for the TeraSort benchmark.

Page 12: How to Increase Performance of Your Hadoop Cluster

A custom script launches virtual machines using the Joyent API and stores information about them in a JSON file.
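
The provisioning step described above might look roughly like the following Python sketch. The script used in the benchmark is not shown in the slides, so everything here is an assumption: the role layout, the machine naming scheme, and `fake_create_machine`, which stands in for whatever call actually talks to the Joyent API.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List

# Hypothetical role layout; the slides do not list the actual roles or counts.
ROLES = {"namenode": 1, "jobtracker": 1, "worker": 3}


def launch_cluster(create_machine: Callable[[str], Dict],
                   roles: Dict[str, int]) -> List[Dict]:
    """Launch one machine per role slot and collect what the API returned."""
    machines = []
    for role, count in roles.items():
        for index in range(count):
            name = f"hadoop-{role}-{index}"  # hypothetical naming scheme
            info = dict(create_machine(name))
            info["name"] = name
            info["role"] = role
            machines.append(info)
    return machines


def save_inventory(machines: List[Dict], path: str) -> None:
    """Store the machine records in a JSON file for later steps to read."""
    with open(path, "w") as handle:
        json.dump(machines, handle, indent=2)


def fake_create_machine(name: str) -> Dict:
    """Stand-in for the real provisioning call so the sketch runs offline."""
    return {"id": f"id-{name}", "ip": "10.0.0.1"}


if __name__ == "__main__":
    cluster = launch_cluster(fake_create_machine, ROLES)
    path = os.path.join(tempfile.mkdtemp(), "cluster.json")
    save_inventory(cluster, path)
    print(f"{len(cluster)} machines recorded in {path}")
```

Passing the provisioning call in as a function keeps the inventory logic testable without cloud credentials.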

Page 13: How to Increase Performance of Your Hadoop Cluster

Each machine in the cluster is configured according to its role using Chef cookbooks.
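
One way to picture role-based configuration is a mapping from each machine's role to a Chef run list, written out as the JSON attributes that `chef-client -j` or `chef-solo -j` accepts. This is only an illustrative sketch: the cookbook and recipe names below are hypothetical, since the slides do not name the cookbooks that were used.

```python
import json

# Hypothetical cookbook/recipe names; the actual cookbooks are not named in the slides.
RUN_LISTS = {
    "namenode": ["recipe[hadoop::namenode]"],
    "jobtracker": ["recipe[hadoop::jobtracker]"],
    "worker": ["recipe[hadoop::datanode]", "recipe[hadoop::tasktracker]"],
}


def node_attributes(machine: dict) -> dict:
    """Build the JSON attributes (including run_list) for one machine record."""
    role = machine["role"]
    if role not in RUN_LISTS:
        raise ValueError(f"no run list defined for role {role!r}")
    return {"run_list": list(RUN_LISTS[role])}


if __name__ == "__main__":
    machine = {"name": "hadoop-worker-0", "role": "worker"}
    print(json.dumps(node_attributes(machine), indent=2))
```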

Page 14: How to Increase Performance of Your Hadoop Cluster

As part of the TeraSort benchmark, a dataset is generated using the TeraGen utility included in Apache Hadoop.
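
TeraGen takes a row count rather than a byte size, and each row is 100 bytes, so a 1TB dataset corresponds to 10 billion rows. The sketch below works out that number and assembles the command line. The jar file name and the HDFS output path are assumptions; they depend on how Hadoop 1.0.4 was installed on the cluster.

```python
ROW_BYTES = 100  # TeraGen writes fixed-size 100-byte rows


def teragen_rows(dataset_bytes: int) -> int:
    """Number of rows TeraGen must generate for a dataset of the given size."""
    return dataset_bytes // ROW_BYTES


def teragen_command(rows: int, output_dir: str,
                    examples_jar: str = "hadoop-examples-1.0.4.jar") -> list:
    """Command line for TeraGen; examples_jar is an assumed file name."""
    return ["hadoop", "jar", examples_jar, "teragen", str(rows), output_dir]


if __name__ == "__main__":
    rows = teragen_rows(10**12)  # 1TB, decimal
    # "/benchmarks/terasort-input" is a hypothetical HDFS path.
    print(" ".join(teragen_command(rows, "/benchmarks/terasort-input")))
```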

Page 15: How to Increase Performance of Your Hadoop Cluster

A Hadoop TeraSort job using the previously generated dataset is submitted on one of the nodes.
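
Submitting the job can be sketched the same way. TeraSort accepts Hadoop's generic `-D` options, so the number of reduce tasks can be set on the command line through the Hadoop 1.x property `mapred.reduce.tasks`. The jar name, the paths, and the reducer count are assumptions for illustration, not the values used in the benchmark.

```python
from typing import List, Optional


def terasort_command(input_dir: str, output_dir: str,
                     reduce_tasks: Optional[int] = None,
                     examples_jar: str = "hadoop-examples-1.0.4.jar") -> List[str]:
    """Command line for TeraSort; generic -D options go before the paths."""
    cmd = ["hadoop", "jar", examples_jar, "terasort"]
    if reduce_tasks is not None:
        cmd.append(f"-Dmapred.reduce.tasks={reduce_tasks}")
    cmd += [input_dir, output_dir]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and reducer count.
    print(" ".join(terasort_command("/benchmarks/terasort-input",
                                    "/benchmarks/terasort-output",
                                    reduce_tasks=200)))
```

On a node with the Hadoop client installed, the returned list could be passed to `subprocess.run(cmd, check=True)`.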

Page 16: How to Increase Performance of Your Hadoop Cluster

See: Hadoop job_201210261134_0010 on hadoop-smartos-r-1.html

The key difference between the two clusters emerged when monitoring I/O and CPU utilization. The Ubuntu cluster was spending too much time in the OS kernel while performing I/O operations, as shown in Figure 1.
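
The slides do not say which tool produced these measurements. As a rough illustration of what "time spent in the OS kernel" means on a Linux node, the sketch below compares two samples of the aggregate `cpu` line from `/proc/stat` and reports the share of CPU time spent in system, irq, and softirq states. The sample values are made up.

```python
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq"]


def parse_cpu_line(line: str) -> dict:
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def kernel_share(before: dict, after: dict) -> float:
    """Fraction of CPU time between two samples spent in kernel states."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total == 0:
        return 0.0
    kernel = delta["system"] + delta["irq"] + delta["softirq"]
    return kernel / total


if __name__ == "__main__":
    # Made-up samples; on a real node, read the first line of /proc/stat twice.
    first = parse_cpu_line("cpu 100 0 50 800 50 0 0")
    second = parse_cpu_line("cpu 200 0 250 1400 100 25 25")
    print(f"kernel share: {kernel_share(first, second):.2f}")
```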

Page 17: How to Increase Performance of Your Hadoop Cluster

The SmartOS cluster was using CPU much more efficiently and was able to utilize a larger number of Hadoop mappers and reducers, which are key configuration parameters for Hadoop.
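
In Hadoop 1.x, the number of concurrent mappers and reducers per worker node is set in `mapred-site.xml` through `mapred.tasktracker.map.tasks.maximum` and `mapred.tasktracker.reduce.tasks.maximum`. The sketch below renders such a fragment. The slot counts are placeholders, not the values from the benchmark, which are not recoverable from this transcript.

```python
from xml.sax.saxutils import escape


def render_properties(props: dict) -> str:
    """Render name/value pairs as a Hadoop <configuration> XML fragment."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(str(value))}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


# Placeholder slot counts per TaskTracker; tune per node size and workload.
SLOTS = {
    "mapred.tasktracker.map.tasks.maximum": 8,
    "mapred.tasktracker.reduce.tasks.maximum": 4,
}

if __name__ == "__main__":
    print(render_properties(SLOTS))
```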

Page 22: How to Increase Performance of Your Hadoop Cluster

The key difference between the clusters emerged when monitoring I/O and CPU utilization. The Ubuntu cluster was spending too much time in the OS kernel while performing I/O.

(For copies of config files and job reports, email [email protected])

Page 23: How to Increase Performance of Your Hadoop Cluster

1) Basic cluster configuration is key (one time effort for typical workloads)

DATA DISK SCALING

COMPRESSION

JVM REUSE POLICY

HDFS BLOCK SIZE

MAP-SIDE SPILLS

COPY/SHUFFLE PHASE TUNING

REDUCE-SIDE SPILLS

2) Tune the number of map and reduce tasks appropriately

3) Consider GPU for some workloads
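
As a starting point for the checklist above, the sketch below maps each tuning area to Hadoop 1.x properties commonly associated with it, and estimates the number of map tasks for a 1TB input at different HDFS block sizes. The mapping is an assumption about which properties the authors had in mind, and the task estimate is rough, since the real split count also depends on how the input files are laid out.

```python
# Assumed mapping from the slide's tuning areas to Hadoop 1.x property names.
TUNING_AREAS = {
    "data disk scaling": ["dfs.data.dir", "mapred.local.dir"],
    "compression": ["mapred.compress.map.output",
                    "mapred.map.output.compression.codec"],
    "JVM reuse policy": ["mapred.job.reuse.jvm.num.tasks"],
    "HDFS block size": ["dfs.block.size"],
    "map-side spills": ["io.sort.mb", "io.sort.spill.percent", "io.sort.factor"],
    "copy/shuffle phase tuning": ["mapred.reduce.parallel.copies",
                                  "tasktracker.http.threads"],
    "reduce-side spills": ["mapred.job.shuffle.input.buffer.percent",
                           "mapred.job.reduce.input.buffer.percent"],
}


def estimated_map_tasks(input_bytes: int, block_bytes: int) -> int:
    """Rough map task count: one task per HDFS block, rounded up."""
    return (input_bytes + block_bytes - 1) // block_bytes


if __name__ == "__main__":
    one_tb = 10**12
    for block_mb in (64, 128, 256):
        tasks = estimated_map_tasks(one_tb, block_mb * 1024 * 1024)
        print(f"{block_mb} MB blocks -> about {tasks} map tasks")
    for area, props in TUNING_AREAS.items():
        print(f"{area}: {', '.join(props)}")
```

Larger blocks mean fewer, longer-running map tasks, which is one reason block size and task counts are usually tuned together.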

Page 24: How to Increase Performance of Your Hadoop Cluster

• Forthcoming in October

• Includes cloud performance

• Co-author DTrace book

• More here on his techniques:

• http://dtrace.org/blogs/brendan/

Page 25: How to Increase Performance of Your Hadoop Cluster

Thank you!

Ben Wen: [email protected]

Renat Khasanshyn: [email protected]

@renatco (650) 395-7002