Top 10 Things to Get the Most Out of Your Hadoop Cluster
Suresh Srinivas | @suresh_m_s
Sanjay Radia | @srr
© Hortonworks Inc. 2013


Description

This talk describes the top ten things that make it easier to run and manage your Hadoop system in production. We start with configuration: best practices in planning and setting up Hadoop clusters for reliability and efficiency. We include typical machine sizing and the tradeoffs of big vs. small servers relative to cluster size. We cover how to implement a cluster for multi-tenancy with an eye on isolation and sharing of cluster resources. Next we describe the tools available for managing the cluster, such as decommissioning, the balancer, and metrics. We include best practices for monitoring a cluster and dealing with different kinds of failures. In particular, we emphasize differences from traditional data center server management, especially when dealing with failures of disks and nodes. We go over how to use the tools available for backup, disaster recovery, and archiving. We conclude with how to cope with the storage and computation growth that Hadoop production clusters typically see. These lessons and tips have been derived from our extensive experience in running production Hadoop clusters and supporting customers over the last six years. We share anecdotes and real-life incidents throughout the talk.



About Me

• Architect & Founder at Hortonworks
• Long-time Apache Hadoop committer and PMC member
• Designed and developed many key Hadoop features
• Experience from supporting many clusters
  – Including some of the world’s largest Hadoop clusters


Agenda

Best practices, tips, and tricks for:

• Building the cluster

• Configuration

• Monitoring

• Reliability

• Multi-tenancy


Hardware and Cluster Sizing

• Considerations
  – Larger clusters heal faster on node or disk failure
  – Machines with huge storage take longer to recover
  – More racks give more failure domains
• Recommendations
  – Get good-quality commodity hardware
  – Buy the sweet spot in pricing: 3 TB disks, 96 GB RAM, 8-12 cores
  – More memory is better – real-time workloads are memory hungry!
  – Before considering fatter machines (1U with 6 disks vs. 2U with 12 disks), get to 30-40 machines or 3-4 racks
  – Use a pilot cluster to learn about load patterns
    – Balance hardware for I/O-, compute-, or memory-bound workloads
  – Rule of thumb: network should be about 20% of compute cost
  – More details: http://tinyurl.com/hwx-hadoop-hw


Configuration is Key

• Avoid JVM issues
  – Use a 64-bit JVM for all daemons
  – Compressed OOPs are enabled by default (Java 6u23 and later)
  – Java heap size
    – Set the starting and max heap sizes the same: -Xms == -Xmx
    – Avoid the Java defaults – configure NewSize and MaxNewSize
    – Use 1/8 to 1/6 of the max heap size for JVMs larger than 4 GB
  – Use a low-latency GC collector
    – -XX:+UseConcMarkSweepGC, -XX:ParallelGCThreads=<N>
    – Use a high <N> on the NameNode and JobTracker
  – Important JVM flags to help debugging
    – -verbose:gc -Xloggc:<file> -XX:+PrintGCDetails
    – -XX:ErrorFile=<file>
    – -XX:+HeapDumpOnOutOfMemoryError
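Putting the flags above together, a minimal hadoop-env.sh sketch for the NameNode; the heap size (8g), NewSize (1g), GC thread count, and log paths are illustrative values for this example, not recommendations for your cluster:

```shell
# Sketch of NameNode JVM options in hadoop-env.sh.
# All sizes and paths below are illustrative; tune to your cluster.
export HADOOP_NAMENODE_OPTS="-Xms8g -Xmx8g \
 -XX:NewSize=1g -XX:MaxNewSize=1g \
 -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=8 \
 -verbose:gc -Xloggc:/var/log/hadoop/nn-gc.log -XX:+PrintGCDetails \
 -XX:ErrorFile=/var/log/hadoop/nn-jvm-error.log \
 -XX:+HeapDumpOnOutOfMemoryError"
```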


Configuration is Key…

• Multiple redundant dirs for NameNode metadata
  – One of the dfs.name.dir entries should be on NFS
  – NFS soft mount options: tcp,soft,intr,timeo=20,retrans=5
• Configure the open file descriptor ulimit
  – The default of 1024 is too low
  – Use 16K for DataNodes, 64K for master nodes
• Set up cluster nodes with time synchronization
• Use version control for configuration!
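On Linux the fd limit is typically raised in /etc/security/limits.conf. A sketch, assuming the daemons run as the hdfs and mapred users (substitute the service accounts your distribution actually uses):

```
# On DataNodes (/etc/security/limits.conf):
hdfs    -   nofile  16384
mapred  -   nofile  16384

# On master nodes (NameNode, JobTracker):
hdfs    -   nofile  65536
mapred  -   nofile  65536
```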


Configuration is Key…

• Use disk fail-in-place for DataNodes
  – A disk failure is no longer a DataNode failure
  – Especially important for high-density nodes
• Set dfs.namenode.name.dir.restore to true
  – Restores failed NN storage directories during checkpointing
• Take periodic backups of NameNode metadata
  – Make copies of the entire storage directory
• The master node OS device should be highly available
  – RAID-1 (mirrored pair)
• Set aside plenty of disk space for NN logs
  – The NN is verbose – set aside multiple GBs
  – Many installs configure this too small
    – NN logs then roll within minutes – hard to debug issues
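A minimal sketch of such a backup, assuming the metadata lives under a single dfs.name.dir: copy the entire storage directory into a timestamped tarball. The paths passed in are up to you; run it from cron against each metadata directory.

```shell
# Sketch: back up an entire NameNode storage directory into a
# timestamped tarball. Arguments: <name_dir> <backup_dir>.
backup_nn_meta() {
    name_dir="$1"
    backup_dir="$2"
    stamp=$(date +%Y%m%d-%H%M%S)
    mkdir -p "$backup_dir"
    # Archive the whole directory, preserving its top-level name
    tar -czf "$backup_dir/nn-meta-$stamp.tar.gz" \
        -C "$(dirname "$name_dir")" "$(basename "$name_dir")"
    echo "$backup_dir/nn-meta-$stamp.tar.gz"
}
```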


Checkpointing

• Secondary NameNode – a confusing name; it performs checkpointing and is not a hot standby


Checkpointing…

• Set up a single secondary NameNode
  – Periodically merges the file system image with the journal
  – Two secondary NameNodes are not supported
    – Many instances of accidentally running two secondary NameNodes
    – Known to cause metadata corruption!
• In an HA setup the standby replaces the secondary
• Ensure periodic checkpoints are happening
  – The checkpoint time can be queried in scripts
    – Shown in the NN web UI as well
  – Real incident
    – A cluster was run for more than a year with no checkpoint!
    – The NameNode stopped when it ran out of disk space
    – The NN had been running for more than a year – no restart!!!
    – Restoring the cluster was not fun!
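One way to script the checkpoint check is against the NameNode's /jmx servlet. A sketch, assuming an FSNamesystem bean with a LastCheckpointTime attribute – verify the bean and attribute names against your Hadoop version's actual /jmx output; the sample response below is fabricated for illustration, and a real script would fetch http://<nn>/jmx instead:

```python
import json

def checkpoint_age_hours(jmx_json, now_ms):
    """Hours since the last checkpoint, from a NameNode /jmx response.
    Bean/attribute names are assumptions; confirm them on your cluster."""
    for bean in json.loads(jmx_json)["beans"]:
        if "FSNamesystem" in bean.get("name", ""):
            return (now_ms - bean["LastCheckpointTime"]) / 3600000.0
    raise ValueError("FSNamesystem bean not found")

# Fabricated sample response: last checkpoint at t=1000 ms
sample = json.dumps({"beans": [
    {"name": "Hadoop:service=NameNode,name=FSNamesystem",
     "LastCheckpointTime": 1000}]})
age = checkpoint_age_hours(sample, now_ms=1000 + 2 * 3600000)
```

Alert when the age exceeds your expected checkpoint period, so a year-without-checkpoint incident like the one above cannot go unnoticed.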


Don’t edit the metadata files!

• Editing can corrupt the cluster state
  – Might result in loss of data
• Real incident
  – An NN was misconfigured to point to another NN’s metadata
  – DNs could not register due to a namespace ID mismatch
    – The system detected the problem correctly
    – The safety net was ignored by the admin!
  – The admin edited the NameNode VERSION file to match the IDs

What happens next?


Guard Against Accidental Deletion

• rm -r deletes data at the speed of Hadoop!
  – Ctrl-C does not stop the deletion!
  – Undeleting files on DataNodes is hard and time consuming
    – Immediately shut down the NN and unmount disks on DataNodes
    – Recover the deleted files
    – Start the NameNode without the delete operation in the edit log
• Enable Trash
• Real incident
  – A customer was running a Hadoop distro with trash not enabled
  – They deleted a large dir (100 TB) and shut down the NN immediately
  – The support person asked for the NN to be restarted to see if trash was enabled!

What happens next?
• Now HDFS has Snapshots!
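Trash is enabled with a nonzero fs.trash.interval (minutes to retain deleted files) in core-site.xml; one day (1440 minutes) is an illustrative choice, not a recommendation:

```
<property>
  <name>fs.trash.interval</name>
  <!-- Keep deleted files in .Trash for one day before purging -->
  <value>1440</value>
</property>
```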


Monitor Usage

• Cluster storage, nodes, files, and blocks all grow
  – Update NN heap, handler count, and number of DN xceivers
  – Tweak other related config periodically
• Monitor hardware usage for your workload
  – Disk I/O, network I/O, CPU, and memory usage
  – Use this information when expanding cluster capacity
• Monitor usage with Hadoop metrics
  – JVM metrics – GC times, memory used, thread status
  – RPC metrics – especially latency, to track slowdowns
  – HDFS metrics
    – Used storage, # of files and blocks, total load on the cluster
    – File system operations
  – MapReduce metrics
    – Slot utilization and job status
• Tweak configurations during upgrades/maintenance
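As files and blocks grow, the NN heap must grow with them. A rough sizing sketch using the commonly cited rule of thumb of roughly 150 bytes of heap per namespace object (file or block); treat that constant as an approximation, not an exact figure, and validate against your own heap-usage metrics:

```python
def estimate_nn_heap_gb(num_files, num_blocks, bytes_per_object=150):
    """Rough NameNode heap estimate in GB.
    150 bytes/object is an approximate rule of thumb, not exact."""
    return (num_files + num_blocks) * bytes_per_object / (1024 ** 3)

# e.g. a namespace with 100M files and 120M blocks
need_gb = estimate_nn_heap_gb(100_000_000, 120_000_000)
```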


Monitoring Simplified With Ambari

Cluster Metrics Summary


Monitoring Simplified With Ambari

HDFS Metrics Summary


Monitoring Simplified With Ambari

MapReduce Metrics Summary


Monitor Failures

• If a large % of DataNodes fail, put the NN in safe mode
  – Avoids unnecessary replication
  – Bring back the DataNodes or rack
• Track dead DataNodes
  – Bring back DataNodes when the number grows
• Ensure cluster storage utilization stays below 85%
  – When the cluster is nearly full, things slow down
• Monitor for corrupt blocks
  – Delete tmp files with replication factor = 1 and missing blocks
• Have a portfolio of cluster validation tests/jobs
  – Run them on restarts, upgrades, and config changes
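The 85% threshold is easy to fold into a monitoring script; a trivial sketch, taking used and total capacity in bytes (e.g. from the NameNode's metrics):

```python
def storage_utilization_ok(capacity_used, capacity_total, threshold=0.85):
    """True while cluster storage utilization is below the threshold
    (85% per the recommendation above)."""
    return capacity_used / capacity_total < threshold

ok = storage_utilization_ok(80 * 10**12, 100 * 10**12)    # 80% used
nearly_full = storage_utilization_ok(90 * 10**12, 100 * 10**12)  # 90% used
```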


Tools To Manage Clusters

• Run the Balancer periodically
  – Distributes data, and hence processing
  – Important to run after expanding the cluster
  – Use an appropriate balancer bandwidth – changing it does not need a restart
    – dfsadmin -setBalancerBandwidth <bandwidth>
• Decommission nodes
  – Before removing/replacing DNs from the cluster
• DistCp for copying data to another cluster
  – Backup, disaster recovery
  – More enhancements to come in the near future
• Tooling can be built around JMX / JMX over HTTP
  – See the list: http://<nn>/jmx?get=Hadoop:service=NameNode
  – All information equivalent to the NN web UI
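A sketch of such tooling: parse the /jmx JSON and pull a few NameNode metrics. The FSNamesystemState bean and the attribute names below are assumptions to verify against your cluster's actual response; the sample is fabricated, and a real script would fetch http://<nn>/jmx instead:

```python
import json

def nn_summary(jmx_json):
    """Extract a few NameNode metrics from a /jmx response.
    Bean/attribute names are assumptions; inspect your cluster's
    real /jmx output to confirm them."""
    out = {}
    for bean in json.loads(jmx_json)["beans"]:
        if bean.get("name", "").endswith("FSNamesystemState"):
            out["capacity_used"] = bean["CapacityUsed"]
            out["live_datanodes"] = bean["NumLiveDataNodes"]
            out["dead_datanodes"] = bean["NumDeadDataNodes"]
    return out

# Fabricated sample response for illustration
sample = json.dumps({"beans": [
    {"name": "Hadoop:service=NameNode,name=FSNamesystemState",
     "CapacityUsed": 42, "NumLiveDataNodes": 98, "NumDeadDataNodes": 2}]})
summary = nn_summary(sample)
```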


Further Simplify Management

• HDFS uses JBODs with replication, not RAID
  – Monitors nodes, disks, and block checksums
  – Automatic recovery – parallel and very fast
    – Recovers an entire 12 TB node in tens of minutes on a 100-node cluster
    – Compare with the cost and urgency of repairing a RAID-5!
• Spare cluster capacity further simplifies management
  – Nodes and clusters continue to run through failures, with lower capacity
    – Nodes and disks can be fixed when convenient (unlike RAID)
    – Configure how many disk failures count as a node failure
  – One operator can manage 3-4K nodes


Design For Multi-tenancy

• Share compute capacity with the Capacity Scheduler
  – Queue(s) and sub-queues with a guaranteed capacity per tenant
    – Almost like dedicated hardware
    – Better than a private cluster – access to unused capacity
  – Resource limits for tasks
    – Memory limits are monitored
    – cgroups support just got into YARN
      – Resource isolation without VM overhead!
• Share HDFS storage
  – Set quotas on per-user and per-project data directories
  – Federation – isolate categories of use into separate namespaces
    – Production vs. experimental, HBase, etc.


Train Users

• Train users on best practices for writing apps
• Reduce storage use
  – Delete unnecessary data periodically
  – Move cold data into a Hadoop archive
• Encourage replication >= 3 for important data
  – Hot data also needs higher replication
• Set up a small test cluster
  – Users test their code before moving to production
  – Avoid debugging in the production cluster
• Set up a user mailing list for information exchange
• Encourage filing JIRAs in Apache
  – Helps the community identify issues, fix bugs, and stabilize quickly


Thank You – Q&A

Summary

1. Choose suitable server hardware and cluster sizes

2. Configuration is key

3. Checkpointing

4. Don’t edit metadata files

5. Guard against accidental deletions

6. Monitor usage and failures

7. Use available tools for managing the cluster

8. Simplify management with spare capacity

9. Design for multi-tenancy

10. Train your users on best practices
