Pivotal: Operationalizing 1000 Node Hadoop Cluster - Analytics Workbench

DESCRIPTION

Pivotal has set up and operationalized a 1000-node Hadoop cluster called the Analytics Workbench. It takes special setup and skills to manage such a large deployment. This session shares how we set it up and how we manage it.

© Copyright 2013 EMC Corporation. All rights reserved.

Operationalizing 1000 Node Hadoop Cluster – Analytics Workbench

Clinton Ooi, Bhavin Modi

Agenda

Introduction

Tools
– Kickstart
– Parallel SSH
– Puppet

Q & A

Meet AWB

Introduction to the Analytics Workbench

Vision Statement

Provide a collaborative platform that is:

AGILE: Support platform for proving mixed mode enterprise readiness at scale.

INNOVATIVE: Showcase groundbreaking data science.

ACCESSIBLE: Create a shared environment for rapid innovation of big data and cloud computing technologies.

EDUCATIONAL: Provide a resource for educating developers, partners, and customers on big data and cloud technologies.

Partners

Intel – contributed 2,000 hex-core CPUs

Mellanox – contributed 72 switches, 1000+ network cards, 1400+ cables

Micron – contributed 6,000 memory modules

Seagate – contributed 12,000 2TB drives

Supermicro – contributed 1,000+ servers

Switch – contributed the hosting facility in its state-of-the-art data center

VMware – provided operational support

Quick facts

Largest Hadoop cluster of its kind

Operational since July 2012

Single multi-tenant cluster

Physical cluster (no virtualization)

25 projects - 12 active, 8 in pipeline

Use-case

Pivotal Demonstration

Partner Engagements

Industry and Academia Collaboration

Tools

Scalable Tool Chain & Standardization

AWB Cluster Lifecycle

Kickstart

Generic tool to automate OS install

Requires DHCP, TFTP and HTTP services

TFTP serves the PXELINUX configuration file (named after the node's IP address in hex), the Linux kernel (vmlinuz) and the in-memory file system (initrd)

HTTP serves the kickstart configuration (kickstart.cfg)

Kickstart (continued)

Example of PXELINUX file - /tftpboot/pxelinux.cfg/AC1C0401

    default install
    label install
      kernel centos/6.2/vmlinuz
      append initrd=centos/6.2/initrd.img ramdisk_size=9025 text console=ttyS2,115200,n,1 sshd=1 install=http://10.1.25.51/centos/6.2/os/x86_64 ks=http://10.1.25.51/centos/6.2/kickstart/conf/kickstart.cfg
    implicit 1
    display message
    prompt 1
    timeout 10
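
The file name AC1C0401 in the example above follows the PXELINUX convention of naming a per-host configuration file after the host's IPv4 address written as eight uppercase hex digits. A minimal Python sketch of that mapping is below. The function names are hypothetical, and 172.28.4.1 is simply what AC1C0401 decodes to; the slides do not state the node's address.

```python
import ipaddress


def pxelinux_config_name(ip: str) -> str:
    """Return the hex file name PXELINUX looks up for an IPv4 address."""
    packed = ipaddress.IPv4Address(ip).packed  # 4 bytes, most significant first
    return packed.hex().upper()


def pxelinux_config_ip(name: str) -> str:
    """Decode an 8-digit hex config file name back to dotted-quad form."""
    return str(ipaddress.IPv4Address(bytes.fromhex(name)))


if __name__ == "__main__":
    print(pxelinux_config_name("172.28.4.1"))  # AC1C0401
    print(pxelinux_config_ip("AC1C0401"))      # 172.28.4.1
```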

Kickstart (continued)

Example of kickstart config

    …
    url --url http://10.1.25.51/centos/6.2/os/x86_64
    ...
    %packages
    @core
    @performance
    …
    %post --log=/root/kickstart-post.log
    wget -O /root/post-install.tgz http://10.1.25.51/centos/6.2/post-install.tgz
    …

Kickstart (continued)

Generate PXELINUX and kickstart files

    [cooi@ks ~]$ ./kickstart --generate --os centos --osver 6.2 --restart pxe node0945
    Generating /tftpboot/pxelinux.cfg/AC1C0401
    Setting bootdev on node0945.sp
    Set Boot Device to pxe
    Restarting node0945.sp
    Chassis Power Control: Cycle

    [cooi@ks ~]$ for i in `seq -w 1 200`; do ./kickstart --generate --os centos --osver 6.2 --restart pxe node0$i; done
    …
    Skipping
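
In the loop above, `seq -w 1 200` pads each number to three digits (001 to 200), so `node0$i` expands to node0001 through node0200. The Python sketch below builds the same argument lists without running anything. The `./kickstart` script and its flags are taken from the session above; the helper names are hypothetical.

```python
from typing import List


def node_names(first: int, last: int, width: int = 4) -> List[str]:
    """Zero-padded host names, e.g. node0001 ... node0200."""
    return [f"node{n:0{width}d}" for n in range(first, last + 1)]


def kickstart_commands(nodes: List[str], os_name: str = "centos",
                       os_version: str = "6.2") -> List[List[str]]:
    """One argv list per node, mirroring the command line shown on the slide."""
    return [
        ["./kickstart", "--generate", "--os", os_name,
         "--osver", os_version, "--restart", "pxe", node]
        for node in nodes
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in kickstart_commands(node_names(1, 3)):
        print(" ".join(argv))
```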

Kickstart (continued)

Makes switching or upgrading the OS easy

Kickstart 60 nodes in ~45 minutes:
– 1 kickstart server with software RAID5
– 100Mbps TOR and aggregator switches
– Saturated the 100Mbps network

Kickstart 200 nodes in ~45 minutes:
– 2 kickstart servers with software RAID5
– 100Mbps TOR switches and 1Gbps aggregator switches

Estimate to do >1000 nodes with full 1Gbps network
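
A back-of-the-envelope check of the kickstart numbers above, as a Python sketch. It assumes the saturated 100Mbps link was the only bottleneck for the full ~45 minutes and ignores protocol overhead, so the per-node figure is an upper bound implied by the slide, not a measurement. The function names are made up for illustration.

```python
def transferred_bytes(link_mbps: float, minutes: float) -> float:
    """Bytes a fully saturated link can move in the given time."""
    return link_mbps * 1e6 / 8 * minutes * 60


def per_node_megabytes(link_mbps: float, minutes: float, nodes: int) -> float:
    """Upper bound on install payload per node, in MB (10^6 bytes)."""
    return transferred_bytes(link_mbps, minutes) / nodes / 1e6


def required_mbps(nodes: int, per_node_mb: float, minutes: float) -> float:
    """Aggregate bandwidth needed to push per_node_mb to every node in time."""
    total_bits = nodes * per_node_mb * 1e6 * 8
    return total_bits / (minutes * 60) / 1e6


if __name__ == "__main__":
    payload = per_node_megabytes(100, 45, 60)
    print(f"{payload:.1f} MB per node at most")
    print(f"{required_mbps(200, payload, 45):.0f} Mbps for 200 nodes")
```

Under these assumptions, 60 nodes on a saturated 100Mbps link works out to at most 562.5 MB per node, and installing 200 nodes with the same payload in 45 minutes needs roughly 333 Mbps in aggregate, which fits within a 1Gbps aggregator.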

Parallel SSH

Sys admin’s lightsaber

Parallel SSH (continued)

Start/Stop Hadoop services

Orchestrate cluster deployments

Perform manual cluster administration tasks

Pick one that is user-friendly and scalable, e.g.
– Massh - http://m.a.tt/er/massh/
– ClusterShell - https://github.com/cea-hpc/clustershell
– Parallel Distributed Shell (pdsh) - https://code.google.com/p/pdsh
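
To illustrate the fan-out idea behind these tools, here is a minimal Python sketch that runs one command on many hosts concurrently. It is not how Massh, ClusterShell or pdsh are implemented, and it assumes key-based SSH login already works for every host. All function names are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Tuple

Result = Tuple[int, str]  # (exit code, combined output)


def ssh_run(host: str, command: str, timeout: float = 60.0) -> Result:
    """Run a command on one host through the system ssh client."""
    proc = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.returncode, proc.stdout + proc.stderr


def _guarded(runner: Callable[[str, str], Result], host: str, command: str) -> Result:
    """Turn exceptions (timeout, missing ssh binary) into a failed result."""
    try:
        return runner(host, command)
    except Exception as exc:
        return 255, f"{type(exc).__name__}: {exc}"


def run_everywhere(hosts: Iterable[str], command: str,
                   runner: Callable[[str, str], Result] = ssh_run,
                   max_workers: int = 32) -> Dict[str, Result]:
    """Run the command on all hosts, at most max_workers at a time."""
    host_list = list(hosts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda h: _guarded(runner, h, command), host_list)
        return dict(zip(host_list, results))
```

A call such as `run_everywhere(["node0001", "node0002"], "uptime")` returns a dictionary keyed by host. The `runner` parameter exists so the fan-out logic can be exercised with a stand-in function instead of real SSH connections.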

Puppet

Configuration Management framework

Install and configure all applications on the cluster

Configure monitoring system

Currently running Puppet 2.7.x

Puppet (continued)

Puppet sync 600 nodes in ~15 minutes:
– Use parallel SSH tool to trigger Puppet sync across the cluster
– 1 Puppet master with dual hex-core CPU
– Saturated CPU on the Puppet master

Switch versions of Hadoop in 2 hours

Manifests and modules are version-controlled

Puppet (continued)

It took one quarter to learn, deploy and design our Puppet infrastructure.
– It is an iterative process.

Tasks managed outside of Puppet:
– User account management
– Start/Stop Hadoop services
– Orchestrate deployment
– Rollback/uninstall applications

Cluster Management Tools

Task / Tools matrix
– Tools (columns): Kickstart, Parallel SSH, Puppet, Nagios, Ganglia
– Tasks (rows): Install OS, Install Apps, Configure Apps, Start / Stop Services, Monitoring

Q & A

http://www.analyticsworkbench.com

Pivotal Sessions at EMC World

Session | Presenter | Dates/Times
The Pivotal Platform: A Purpose-Built Platform for Big-Data-Driven Applications | Josh Klahr | Tue 5:30 - 6:30, Palazzo E; Wed 11:30 - 12:30, Delfino 4005
Pivotal: Data Scientists on the Front Line: Examples of Data Science in Action | Noelle Sio | Tue 10:00 - 11:00, Lando 4205; Thu 8:30 - 9:30, Palazzo F
Pivotal: Operationalizing 1000-node Hadoop Cluster – Analytics Workbench | Clinton Ooi, Bhavin Modi | Tue 11:30 - 12:30, Palazzo L; Thu 10:00 - 11:00 am, Delfino 4001A
Pivotal: for Powerful Processing of Unstructured Data For Valuable Insights | SK Krishnamurthy | Mon 4:00 - 5:00, Lando 4201 A; Tue 4:00 - 5:00, Palazzo M
Pivotal: Big & Fast data – merging real-time data and deep analytics | Michael Crutcher | Mon 1:00 - 2:00, Lando 4201 A; Wed 10:00 - 11:00, Palazzo M
Pivotal: Virtualize Big Data to Make The Elephant Dance | June Yang, Dan Baskette | Mon 11:30 - 12:30, Marcello 4401A; Wed 4:00 - 5:00, Palazzo E
Hadoop Design Patterns | Don Miner | Mon 2:30 - 3:30, Palazzo F; Wed 8:30 - 9:30, Delfino 4005
