
Scaling Spark Workloads on YARN - Boulder/Denver July 2015


Page 1: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

July 2015

Scaling Spark Workloads on YARN

Boulder/Denver Big Data
Shane Kumpf & Mac Moore, Solutions Engineers, Hortonworks

Page 2: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Agenda

• Introduction – Why we love Spark, Spark Strategy, What's Next
• YARN: The Data Operating System
• Spark: Processing Internals Review
• Spark on YARN
• Demo: Scaling Spark on YARN in the cloud
• Q & A

Page 3: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Why We Love Spark at Hortonworks

• Made for Data Science – all apps need to get predictive at scale and fine granularity
• Democratizes Machine Learning – Spark is doing for ML on Hadoop what Hive did for SQL on Hadoop
• Elegant Developer APIs – DataFrames, Machine Learning, and SQL
• Realizes the value of the Data Operating System – a key tool in the Hadoop toolbox
• Community – broad developer, customer, and partner interest

[Diagram: YARN: Data Operating System spanning Storage, Resource Management, Governance, Security, and Operations]

Page 4: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Data Operating System: Open Enterprise Hadoop

Hadoop/YARN powered data operating system: a 100% open source, multi-tenant data platform for any application, any dataset, anywhere. Built on a centralized architecture of shared enterprise services:
• Scalable tiered storage
• Resource and workload management
• Trusted data governance and metadata management
• Consistent operations
• Comprehensive security
• Developer APIs and tools

Page 5: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Themes for Spark Strategy

Spark is made for Data Science
• Lead in the community for ML optimization
• Data Science theme of Spark Summit / Hadoop Summit

Provide notebooks for data exploration & visualization
• iPython Ambari Stack
• Zeppelin – we're very excited about this project

Process more Hadoop data efficiently in Spark
• Hive/ORC data delivered, HBase work in progress

Innovate at the core
• Security, Spark on YARN improvements, and more

Page 6: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Current State of Security in Spark

• Only Spark on YARN supports Kerberos today – leverage Kerberos for authentication
• Spark reads data from HDFS & ORC – HDFS file permissions (& Ranger integration) apply to Spark jobs
• Spark submits jobs to a YARN queue – YARN queue ACLs (& Ranger integration) apply to Spark jobs
• Wire encryption – Spark has some coverage, but not all channels are covered
• LDAP authentication – no authentication in the Spark UI out of the box; supports a filter for hooking in LDAP

Page 7: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

What about ORC support?

ORC (Optimized Row Columnar) is an Apache TLP providing columnar storage for Hadoop.

Spark ORC Support
• ORC support in HDP/Spark since 1.2.x (Alpha)
• ORC support merged into Apache Spark in 1.4
• Joint blog with Databricks @ hortonworks.com
• Changes between ORC 1.3.1 and Spark 1.4.1
• ORC now uses the standard API to read/write

orc.apache.org  

Page 8: Scaling Spark Workloads on YARN - Boulder/Denver July 2015


Introducing Apache Zeppelin…

Page 9: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Apache Zeppelin

Features
• A web-based notebook for interactive analytics
• Ad-hoc experimentation with Spark, Hive, Shell, Flink, Tajo, Ignite, Lens, etc.
• Deeply integrated with Spark and Hadoop
• Can be managed via Ambari Stacks
• Supports multiple language backends
• Pluggable "Interpreters"
• Incubating at Apache
• 100% open source and open community

Use Cases
• Data exploration & discovery
• Visualization – tables, graphs, charts
• Interactive snippet-at-a-time experience
• Collaboration and publishing
• "Modern Data Science Studio"

Page 10: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Where can I find more?

• Arun Murthy's Keynote at Hadoop Summit & Spark Summit
  – Hadoop Summit (http://bit.ly/1IC1BEG)
  – Spark Summit (http://bit.ly/1M7qw47)
• Data Science with Spark & Zeppelin session at Hadoop Summit – http://bit.ly/1DdKeTs
• Data Science with Spark + Zeppelin blog – http://bit.ly/1HFd545
• ORC Support in Spark blog – http://bit.ly/1OkA1uU

Page 11: Scaling Spark Workloads on YARN - Boulder/Denver July 2015


YARN: The Data Operating System


Page 12: Scaling Spark Workloads on YARN - Boulder/Denver July 2015


YARN Introduction

 

The Architectural Center
• YARN moved Hadoop "beyond batch": run batch, interactive, and real-time applications simultaneously on shared hardware.
• Intelligently places workloads on cluster members based on resource requirements, labels, and data locality.
• Runs user code in containers, providing isolation and lifecycle management.

[Diagram: Hortonworks Data Platform 2.2 – YARN: Data Operating System (cluster resource management) over HDFS (Hadoop Distributed File System); batch, interactive & real-time data access via Apache Pig, Hive, Cascading, HBase, Accumulo, Solr, Spark, Storm, Sqoop, Flume, and Kafka; governance with Apache Falcon; security with Apache Ranger and Knox; operations with Apache Ambari, ZooKeeper, and Oozie]

Page 13: Scaling Spark Workloads on YARN - Boulder/Denver July 2015


YARN Architecture - Overview

 

Resource Manager
• Global resource scheduler

Node Manager
• Per-machine agent
• Manages the life-cycle of containers & resource monitoring

Container
• Basic unit of allocation
• Fine-grained resource allocation across multiple resource types (memory, CPU; future: disk, network, GPU, etc.)

Application Master
• Per-application master that manages application scheduling and task execution
• E.g. the MapReduce Application Master

   

Page 14: Scaling Spark Workloads on YARN - Boulder/Denver July 2015


YARN Concepts

• Application – a job or a long-running service submitted to YARN
  – Job: MapReduce job
  – Service: HBase cluster

• Container – basic unit of allocation
  – MapReduce map or reduce task
  – HBase HMaster or RegionServer
  – Fine-grained resource allocations, e.g. container_0 = 2 GB, 1 CPU; container_1 = 1 GB, 6 CPU
  – Replaces the fixed map/reduce slots from Hadoop 1

Page 15: Scaling Spark Workloads on YARN - Boulder/Denver July 2015


YARN Resource Request

Resource Model
• Ask for a specific amount of resources (memory, CPU, etc.) on a specific machine or rack.
• Capabilities define how much memory and CPU is requested.
• Relax Locality = false forces containers onto subsets of machines, aka YARN node labels.

A ResourceRequest carries: priority, resourceName, capability, numContainers, and relaxLocality (see the sketch below).
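The fields above map directly onto the YARN client API. As a rough sketch (not from the deck – the sizes and counts are arbitrary examples):

import org.apache.hadoop.yarn.api.records.{Priority, Resource, ResourceRequest}

// Illustrative values: ask for 4 containers of 2 GB / 1 vcore anywhere on the cluster
val capability = Resource.newInstance(2048, 1)   // capability: memory (MB) + vcores
val request = ResourceRequest.newInstance(
  Priority.newInstance(0),                       // priority
  ResourceRequest.ANY,                           // resourceName: "*" = any host or rack
  capability,
  4,                                             // numContainers
  true)                                          // relaxLocality: false would pin to the named host/rack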

Page 16: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

YARN Capacity Scheduler

Capacity Sharing (function)
• Elasticity
• Queues to subdivide resources
• Job submission Access Control Lists

Capacity Enforcement (function)
• Max capacity per queue
• User limits within queue
• Preemption

Administration (function)
• Ambari Capacity Scheduler View

Page 17: Scaling Spark Workloads on YARN - Boulder/Denver July 2015


Hierarchical Queues

[Diagram: hierarchical queues – the root queue divided into Adhoc 10%, DW 70%, and Marketing 20%; sub-queues such as Dev, Prod, Reserved, P0 70%, and P1 30% illustrate parent and leaf queues]

Page 18: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

YARN capacity scheduler helps manage resources across the cluster

Page 19: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

YARN Application Submission – Walkthrough

[Diagram: clients submit to the ResourceManager (Scheduler), which starts an Application Master (AM 1, AM 2) on a NodeManager; each AM then receives containers (e.g. Container 1.1–1.3, Container 2.1–2.4) spread across the NodeManagers in the cluster]

Page 20: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Spark: Processing Internals Review

Page 21: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

First, a bit of review – What is Spark?

• Distributed runtime engine for fast, large-scale data processing.
• Designed for iterative computations and interactive data mining.
• Provides an API framework to support in-memory cluster computing.
• Multi-language support – Scala, Java, Python

Page 22: Scaling Spark Workloads on YARN - Boulder/Denver July 2015


So what makes Spark fast? Data access methods are not equal!

Page 23: Scaling Spark Workloads on YARN - Boulder/Denver July 2015


MapReduce vs Spark

• MapReduce – On disk

• Spark – In memory

Page 24: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

RDD – The main programming abstraction

Resilient Distributed Datasets
• Collections of objects spread across a cluster, cached or stored in RAM or on disk
• Built through parallel transformations
• Automatically rebuilt on failure
• Immutable – each transformation creates a new RDD

Operations
• Lazy transformations (e.g. map, filter, groupBy)
• Actions (e.g. count, collect, save)

Page 25: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

RDD In Action

[Diagram: a chain of RDDs linked by transformations, ending in an action that returns a value]

textFile = sc.textFile("SomeFile.txt")
linesWithSpark = textFile.filter(lambda line: "Spark" in line)

linesWithSpark.count()   # returns 74
linesWithSpark.first()   # returns '# Apache Spark'

Page 26: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

RDD Graph

textFile                              // RDD[String]
  .flatMap(line => line.split(" "))   // RDD[String] – one element per word
  .map(word => (word, 1))             // RDD[(String, Int)]
  .reduceByKey(_ + _, 3)              // RDD[(String, Int)]
  .collect()                          // Array[(String, Int)]

Page 27: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

DAG Scheduler

[Diagram: the textFile → flatMap → map → reduceByKey → collect graph split into Stage 1 (textFile and the two maps) and Stage 2 (reduceByKey, collect)]

Goals
• Split the graph into stages based on the types of transformations.
• Pipeline narrow transformations (transformations without data movement) into a single stage.

Page 28: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

DAG Scheduler – Double Click

Stage 1
1. Read HDFS split
2. Apply both maps
3. Write shuffle data

Stage 2
1. Read shuffle data
2. Final reduce
3. Send result to driver

Page 29: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Tasks – How work gets done

[Diagram: fetch input → execute task → write output]

The fundamental unit of work in Spark:
1. Fetch input based on the InputFormat or a shuffle.
2. Execute the task.
3. Materialize task output via a shuffle, a write, or a result sent to the driver.

Page 30: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Input Formats control task input

• Hadoop InputFormats control how data on HDFS is read into each task.
  – Controls splits – how data is split up; each task (by default) gets one split, which is typically a single HDFS block.
  – Controls the concept of a record – is a record a whole line? A single word? An XML element?
• Spark can use both the old and new API InputFormats for creating RDDs.
  – newAPIHadoopRDD and hadoopRDD
  – Save time: use Hadoop InputFormats rather than writing a custom RDD (see the sketch below).
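As a hedged illustration (not from the deck), creating an RDD from a new-API Hadoop InputFormat in the Spark shell might look like this; the HDFS path is a made-up example.

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// Read a text file through a new-API Hadoop InputFormat; each record is (byte offset, line)
val rdd = sc.newAPIHadoopFile[LongWritable, Text, TextInputFormat]("hdfs:///tmp/myfile.txt")
val lines = rdd.map { case (_, text) => text.toString }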

Page 31: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Executor – The Spark Worker

Isolation for tasks
1. Each application gets its own executors.
2. Executors run tasks in threads and cache data.
3. Executors run in separate processes for isolation.
4. An executor lives for the duration of the application.

Page 32: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Executor – The Spark Worker

[Diagram: a single executor process with multiple cores (Core 1–3), each running tasks that fetch input, execute, and write output, with further tasks queued within the executor]

Page 33: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

The gang's all here

[Diagram: the Application Master hosts the Spark Driver, which coordinates an executor on each worker node; every executor runs tasks over RDD partitions and holds a cache]

Page 34: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Spark: on YARN

Page 35: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Spark on YARN

Modus Operandi
• 1 executor = 1 YARN container
• 2 modes: yarn-client or yarn-cluster
• yarn-client = driver on the client side – good for the REPL
• yarn-cluster = driver inside the YARN Application Master – good for batch and automated jobs (see the example below)

[Diagram: the YARN ResourceManager launches the App Master; a monitoring UI tracks the application]
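As an illustration not taken from the deck, launching in each mode might look like the following; the application class and jar path are hypothetical placeholders.

# yarn-client: driver runs on the client machine – handy for the spark-shell REPL
spark-shell --master yarn-client

# yarn-cluster: driver runs inside the YARN Application Master – suited to batch and automated jobs
spark-submit --master yarn-cluster --class com.example.MyApp /path/to/myapp.jar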

Page 36: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Why Spark on YARN

Core Features
• Run other workloads along with Spark
• Leverage Spark Dynamic Resource Allocation
• Currently the only way to run in a Kerberized environment
• Ability to provide capacity guarantees via the Capacity Scheduler

[Diagram: Hortonworks Data Platform 2.2 architecture, as shown earlier – YARN over HDFS with the data access engines, governance, security, and operations components]

Page 37: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Executor Allocations on YARN

Static Allocation
• A static number of executors is started on the cluster.
• Executors live for the duration of the application, even when idle.

Dynamic Allocation
• A minimal number of executors is started initially.
• Executors are added exponentially based on pending tasks.
• After an idle period, executors are stopped and resources are returned to the resource pool.

Page 38: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Static Allocation Details

Static Allocation
• Traditional means of starting executors on nodes.

spark-shell --master yarn-client \
  --driver-memory 3686m \
  --executor-memory 17g \
  --executor-cores 7 \
  --num-executors 7

• A static number of executors is specified by the submitter.
• Size and count of executors is key for good performance.

Page 39: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Dynamic Allocation Details

Dynamic Allocation
• Scale executor count based on pending tasks.

spark-shell --master yarn-client \
  --driver-memory 3686m \
  --executor-memory 3686m \
  --executor-cores 1 \
  --conf "spark.dynamicAllocation.enabled=true" \
  --conf "spark.dynamicAllocation.minExecutors=1" \
  --conf "spark.dynamicAllocation.maxExecutors=100" \
  --conf "spark.shuffle.service.enabled=true"

• Minimum and maximum number of executors are specified.
• Exclusive to running Spark on YARN.

Page 40: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Enabling Dynamic Allocation

Dynamic allocation requires the spark_shuffle YARN aux service and is not enabled out of the box.

--conf "spark.dynamicAllocation.enabled=true" \
--conf "spark.shuffle.service.enabled=true"

1. Copy the spark-shuffle jar onto the NodeManager classpath.
2. Configure the YARN aux service for spark_shuffle:
   Add spark_shuffle to yarn.nodemanager.aux-services
   Set yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
3. Restart the NodeManagers to pick up the spark-shuffle jar.
4. Run the Spark job with the dynamic allocation configs.

Page 41: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Dynamic Allocation Configuration Options

spark.dynamicAllocation.minExecutors
Minimum number of executors; also the initial number spawned at job submission (the initial count can be overridden with initialExecutors).
--conf "spark.dynamicAllocation.minExecutors=1"

spark.dynamicAllocation.maxExecutors
Maximum number of executors; executors are added based on pending tasks up to this maximum.
--conf "spark.dynamicAllocation.maxExecutors=100"
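These settings can also be applied programmatically rather than via --conf flags; a minimal sketch (the values are illustrative, not from the deck):

import org.apache.spark.{SparkConf, SparkContext}

// Enable dynamic allocation with illustrative bounds; the external shuffle
// service must already be configured on the NodeManagers (previous slide).
val conf = new SparkConf()
  .setAppName("dynamic-allocation-example")
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")
  .set("spark.dynamicAllocation.minExecutors", "1")
  .set("spark.dynamicAllocation.maxExecutors", "100")
val sc = new SparkContext(conf)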

Page 42: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Dynamic Allocation Configuration Options

spark.dynamicAllocation.schedulerBacklogTimeout
Initial delay to wait before allocating additional executors. Default: 5 seconds.
--conf "spark.dynamicAllocation.schedulerBacklogTimeout=10"

spark.dynamicAllocation.sustainedSchedulerBacklogTimeout
After the initial round of executors is scheduled, how long until the next round of scheduling? Default: 5 seconds.
--conf "spark.dynamicAllocation.sustainedSchedulerBacklogTimeout=10"

[Chart: executors started over time]

Page 43: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Dynamic Allocation – Good citizenship in a shared environment

spark.dynamicAllocation.executorIdleTimeout
Amount of idle time in seconds before an executor container is killed and its resources returned to YARN. Default: 10 minutes.
--conf "spark.dynamicAllocation.executorIdleTimeout=60"

spark.dynamicAllocation.cachedExecutorIdleTimeout
Because caching RDDs is key to performance, this setting keeps executors holding cached data around longer.
--conf "spark.dynamicAllocation.cachedExecutorIdleTimeout=1800"

Page 44: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Sizing your Spark job

Difficult Landscape
• Conflicting recommendations are often found online.
• Requires knowledge of the data set, task distribution, cluster topology, RDD cache churn, hardware profile…

1 executor per core? 1 executor per node? 3–5 executors if I/O bound? yarn.nodemanager.resource.memory-mb? 18 GB max heap?

It depends.

Page 45: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Common suggestions to improve performance

Do these things:
1. Cache RDDs in memory* … or scale elastically
2. Don't spill to disk if possible
3. Use a better serializer
4. Consider compression
5. Limit GC activity
6. Get parallelism right*

* New considerations with Spark on YARN (see the sketch below)
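For items 3 and 4, a hedged example of what those settings might look like; the property names are standard Spark configs, and the choice of values is illustrative rather than a recommendation from the deck.

import org.apache.spark.SparkConf

// Kryo is generally faster and more compact than Java serialization;
// compressing serialized RDD partitions trades CPU for memory.
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.rdd.compress", "true")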

Page 46: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Sizing Spark Executors on YARN

Relationship
1. Setting the executor memory size sets the JVM heap, NOT the container.
2. Executor memory + the greater of (10% or 384 MB) = container size (see the worked example below).
3. To avoid wasted resources, ensure executor memory + memoryOverhead < yarn.scheduler.minimum-allocation-mb.
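As a worked example (the numbers are assumed, not from the deck): with --executor-memory 17g, the overhead is max(0.10 × 17408 MB, 384 MB) ≈ 1741 MB, so YARN must find a container of roughly 19149 MB, which the scheduler then rounds up to the next multiple of yarn.scheduler.minimum-allocation-mb.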

Page 47: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Sizing Spark Executors on YARN

Relevant YARN Container Settings
• yarn.nodemanager.resource.cpu-vcores
  – Number of vcores available for YARN containers per NodeManager
• yarn.nodemanager.resource.memory-mb
  – Total memory available for YARN containers per NodeManager
• yarn.scheduler.minimum-allocation-mb
  – Minimum resource request allowed per allocation, in megabytes
  – Smallest container available for an executor
• yarn.scheduler.maximum-allocation-mb
  – Maximum resource request allowed per allocation, in megabytes
  – Largest container available for an executor
  – Typically equal to yarn.nodemanager.resource.memory-mb

Page 48: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Tuning Advice

How do we get it right?
• Test, gather, and test some more
• Define an SLA!
• Tune the job, not the cluster
• Tune the job to meet the SLA!
• Don't tune prematurely – it's the root of all evil

Starting Points
• Keep your heap reasonable, but large enough to handle your dataset.
  – Recall that we only get about 60% of the heap for RDD caching.
  – Measure GC and ensure the percentage of time spent there is low.
• For jobs that depend heavily on cached RDDs, limit executors per machine to one where possible.
  – Per the first point, if RDD cache churn or GC are a problem, make smaller executors and run multiple per machine.
• On high-memory hardware, run multiple executors per machine.
  – Keep the heap reasonable.
• For CPU-bound tasks with limited data needs, more executors can be better.
  – Run with 2–4 GB executors with a single vcore and measure performance.
• Tune task parallelism (see the sketch below).
  – As a rule of thumb, increase the task count by 1.5x each round of testing and measure the results.
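A hedged illustration of the last point; the data and partition counts are arbitrary examples, not recommendations from the deck.

// In the Spark shell (sc is already defined)
val words = sc.parallelize(Seq("spark", "yarn", "spark"))

// Pass an explicit partition count to a shuffle operation, then raise it ~1.5x per test round
val counts32 = words.map(w => (w, 1)).reduceByKey(_ + _, 32)   // first round: 32 reduce tasks
val counts48 = words.map(w => (w, 1)).reduceByKey(_ + _, 48)   // next round: ~1.5x more tasks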

Page 49: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Avoid spilling or caching to disk

Caching strategies
• Use the default .cache() or .persist(), which stores data as deserialized Java objects (MEMORY_ONLY).
  – Trade-off: lower CPU usage versus the size of the data in memory.
• Don't use disk persistence.
  – It's typically faster to recompute the partition, and there is a good chance many of the blocks are still in the operating system page cache.
• If the default strategy results in the data not fitting in memory, use MEMORY_ONLY_SER, which stores the data as serialized objects.
  – Trade-off: higher CPU usage, but the data set is typically around 50% smaller in memory.
  – Can significantly impact job run time for larger data sets; use with caution.

import org.apache.spark.storage.StorageLevel._
theRdd.persist(MEMORY_ONLY_SER)

Page 50: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Data Access with Spark on YARN

Gotchas
• Don't cache base RDDs – poor distribution.
  – Do cache intermediate data sets – good distribution across dynamically allocated executors.
• Ensure executors remain running until you are done with the cached data.
  – Cached data goes away when the executors do, and is costly to recompute.
• Data locality is getting better, but isn't great.
  – SPARK-1767 introduced locality waits for cached data.
• computePreferredLocations is pretty broken.
  – Only use it if necessary; it gets overwritten in some scenarios, and better approaches are in the works.

val locData = InputFormatInfo.computePreferredLocations(
  Seq(new InputFormatInfo(conf, classOf[TextInputFormat], new Path("myfile.txt"))))
val sc = new SparkContext(conf, locData)

Page 51: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Future Improvements for Spark on YARN

RDD Sharing
– Short term: keep executors with RDD cache around longer
– HDFS memory tier for RDD caching
– Experimental off-heap caching in Tachyon (lower overhead than persist())
– Cache rebalancing

Data Locality for Dynamic Allocation
– No more preferredLocations; discover locality from RDD lineage.

Container/Executor Sizing
– Make it easier… automatically determine the appropriate size.
– Long term: specify task size only, and memory, cores, and overhead are determined automatically.

Secure All The Things!
– SASL for shuffle data
– SSL for the HTTP endpoints
– Encrypted shuffle – SPARK-5682

Page 52: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

DEMO: Scaling Spark workloads on YARN

Page 53: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Scaling compute independent of storage

[Diagram: an HDP 2.3 Hadoop cluster with management/master nodes (Ambari, masters), an edge node (clients), storage nodes (NodeManager + HDFS), and compute-only nodes (NodeManager only)]

Overview
1. A pattern that is gaining popularity in the cloud.
2. Save costs and leverage the elasticity of the cloud.
3. Scale NodeManagers (compute only) independently of traditional NodeManager/DataNode (compute + storage) workers.

Page 54: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

How it works

Overview
1. Leverage Spark Dynamic Allocation on YARN to scale the number of executors based on pending work.
2. If additional capacity is still needed, provision additional compute nodes, add them to the cluster, and continue to scale executors onto the new nodes.

[Diagram: the same HDP 2.3 cluster as before, now with additional compute-only NodeManager nodes added alongside the storage nodes, edge node, and management/master nodes]

Page 55: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Process Overview

[Diagram: Cloudbreak orchestrates the HDP/Spark cluster via Ambari's REST API and metrics; the Spark client submits work to compute nodes running executors in containers, and more compute nodes are added as needed]

1. Deploy cluster
2. Set alerts
3. Submit job
4. Executors increase
5. Capacity reached, alerts trigger
6. Scaling policy adds compute nodes

Page 56: Scaling Spark Workloads on YARN - Boulder/Denver July 2015


DEMO – Leveraging Dynamic Allocation

Page 57: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Scenarios

Promising Use Cases
1. CPU-bound workloads
2. Bursty usage
3. Zeppelin / ad-hoc data exploration
4. Multi-tenant, multi-use, centralized cluster
5. Dev/QA clusters

Page 58: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Cloudbreak

"Cloud agnostic Hadoop As-A-Service API"

• Developed by SequenceIQ
• Open source, with options to extend with a custom UI
• Launches Ambari and deploys the selected distribution via Blueprints in Docker containers
• Customer registers, delegates access to cloud credentials, and runs Hadoop on their own cloud account (Azure, AWS, etc.)
• Elastic – spin up any number of nodes, scale up/down on the fly

Page 59: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Launch HDP on Any Cloud for Any Application

Cloudbreak:
1. Pick a Blueprint
2. Choose a Cloud
3. Launch HDP!

Example Ambari Blueprints: IoT Apps (Storm, HBase, Hive), BI / Analytics (Hive), Data Science (Spark), Dev / Test (all HDP services)

Page 60: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Step 1: Sign up for a free Cloudbreak account

URL to sign up for a free account: https://accounts.sequenceiq.com/

General Cloudbreak documentation: http://sequenceiq.com/cloudbreak/#cloudbreak

Page 61: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Step 2: Create or add credentials

• Varies by cloud, but typically only a couple of steps.

Page 62: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Step 3: Note the blueprint for your use case

• An Ambari blueprint describes the components of the HDP stack to include in the cloud deployment
• Cloudbreak comes with some default blueprints, such as a Spark cluster or a streaming architecture
• Pick the appropriate blueprint, or create your own!

Page 63: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Step 4: Create Cluster

• Ensure your credential is selected by clicking on "select a credential"
• Click Create cluster, give it a name, choose a region, choose a network
• Choose the desired blueprint
• Set the instance type and number of nodes
• Click create and start cluster

Page 64: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Step 5: Wait for cluster install to complete

• Depending on the instance types and blueprint chosen, cluster install should complete in 10–35 minutes
• Once the cluster install is complete, click on the Ambari server address link (highlighted on the screenshot) and log in to Ambari with admin/admin
• Your HDP cluster is ready to use

Page 65: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Periscope: Auto up and down scaling

• Define alerts for the number of pending YARN containers.

Page 66: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Periscope: Auto up and down scaling

• Define scaling policies for how Periscope should react to the defined alerts.

Page 67: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Periscope: Auto up and down scaling

• Define the min/max cluster size and "cooldown" period (how long to wait between scaling events).
• The number of compute nodes will automatically scale when out of capacity for containers.

Page 68: Scaling Spark Workloads on YARN - Boulder/Denver July 2015

Benefits

Why do I care?
• Less contention between jobs
  – Less waiting for your neighbor's job to finish; elastic scale gives us all compute time.
• Improved job run times
  – Testing has shown a 30%+ decrease in job run times for moderate-duration, CPU-bound jobs.
• Decreased costs over persistent IaaS clusters
  – Spin down resources not in use.
  – If time = money, improved job run times will decrease costs.
• Capacity planning hack!
  – Scaling up a lot? You should probably add more capacity…
  – Never scaling up? You probably overbuilt…

Page 69: Scaling Spark Workloads on YARN - Boulder/Denver July 2015


DEMO – Auto Scaling IaaS

Page 70: Scaling Spark Workloads on YARN - Boulder/Denver July 2015


Q & A