Maintaining Low Latency While Maximizing Throughput on a Single Cluster


Maintaining Low Latency while Maximizing Throughput

Yuliya Feldman

February 19, 2015


Top-Ranked NoSQL

Top-Ranked Hadoop Distribution

Top-Ranked SQL-on-Hadoop Solution


What We Have – Cluster per Use Case

[Diagram: isolated silos – a YARN cluster, a web-server farm, and another YARN cluster]

Too much isolation and poor resource utilization


Need Datacenter-wide Resource Manager

What choices do we have?

•  YARN (capacity/fair scheduler)

•  Omega

•  Mesos

•  Others (e.g. Quasar)


YARN

•  Motivated by Mesos, but is a Hadoop resource manager

•  Manages Hadoop resources well – “retail”

•  Pluggable schedulers for Hadoop

•  Has started handling long-lived tasks

•  Can preempt tasks

•  YARN-1051 - YARN Admission Control/Planner: enhancing the resource allocation model with time
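To make the “retail” point concrete: a YARN ApplicationMaster negotiates with the ResourceManager container by container. A minimal sketch using the stock AMRMClient API (the 2 GB / 2-vcore sizing is illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class RequestContainers {
  public static void main(String[] args) throws Exception {
    // ApplicationMaster-side client that talks to the YARN ResourceManager.
    AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
    rmClient.init(new Configuration());
    rmClient.start();

    // Register, then ask for one 2 GB / 2-vcore container anywhere in the cluster.
    rmClient.registerApplicationMaster("", 0, "");
    Resource capability = Resource.newInstance(2048, 2);
    rmClient.addContainerRequest(
        new ContainerRequest(capability, null, null, Priority.newInstance(0)));

    // allocate() both sends pending requests and polls for granted containers.
    System.out.println(rmClient.allocate(0.1f).getAllocatedContainers());
  }
}
```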


Mesos

•  Datacenter-wide resource manager – a negotiator between frameworks

•  Manages resources well across all frameworks, rather than within one particular framework (e.g., Hadoop) – “wholesale”

•  Uses two-level scheduling

•  Excellent Docker support

•  Schedules, allocates, and isolates CPU, memory, disk, network, and arbitrary custom resource types
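Two-level scheduling in practice: Mesos offers resources, and each framework decides what to do with them. A toy framework sketch against the Mesos Java bindings – the task command and sizes are illustrative, and args[0] is assumed to be the Mesos master address:

```java
import java.util.Arrays;
import java.util.List;

import org.apache.mesos.MesosSchedulerDriver;
import org.apache.mesos.Protos.*;
import org.apache.mesos.Scheduler;
import org.apache.mesos.SchedulerDriver;

public class HelloFramework implements Scheduler {

  // Level two of two-level scheduling: Mesos sends offers,
  // the framework picks which to use and what to run on them.
  @Override
  public void resourceOffers(SchedulerDriver driver, List<Offer> offers) {
    for (Offer offer : offers) {
      TaskInfo task = TaskInfo.newBuilder()
          .setName("hello")
          .setTaskId(TaskID.newBuilder().setValue("hello-" + offer.getId().getValue()))
          .setSlaveId(offer.getSlaveId())
          .addResources(scalar("cpus", 0.5))
          .addResources(scalar("mem", 128))
          .setCommand(CommandInfo.newBuilder().setValue("echo hello"))
          .build();
      driver.launchTasks(Arrays.asList(offer.getId()), Arrays.asList(task));
    }
  }

  private static Resource scalar(String name, double value) {
    return Resource.newBuilder()
        .setName(name)
        .setType(Value.Type.SCALAR)
        .setScalar(Value.Scalar.newBuilder().setValue(value))
        .build();
  }

  // Remaining Scheduler callbacks left as no-ops for brevity.
  public void registered(SchedulerDriver d, FrameworkID id, MasterInfo m) {}
  public void reregistered(SchedulerDriver d, MasterInfo m) {}
  public void offerRescinded(SchedulerDriver d, OfferID id) {}
  public void statusUpdate(SchedulerDriver d, TaskStatus s) {}
  public void frameworkMessage(SchedulerDriver d, ExecutorID e, SlaveID s, byte[] b) {}
  public void disconnected(SchedulerDriver d) {}
  public void slaveLost(SchedulerDriver d, SlaveID s) {}
  public void executorLost(SchedulerDriver d, ExecutorID e, SlaveID s, int i) {}
  public void error(SchedulerDriver d, String msg) {}

  public static void main(String[] args) {
    FrameworkInfo framework = FrameworkInfo.newBuilder()
        .setUser("")  // empty string = run as the current user
        .setName("hello-framework")
        .build();
    new MesosSchedulerDriver(new HelloFramework(), framework, args[0]).run();
  }
}
```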


Can we…

–  Continue leveraging YARN's resource-scheduling capabilities for YARN-based applications?

–  Treat YARN as “yet another” framework within Mesos?

–  Keep YARN from having to worry about coexisting non-YARN applications?


Introducing Myriad


Apache Myriad: True Multi-tenancy

•  Open-source project launched Oct ’14
–  MapR, eBay, Mesosphere, and others participating

•  Allows Mesos and YARN to cooperate with each other

•  Mesos acts as the datacenter-wide resource manager
–  Dockerized containers and/or cgroups used for isolation

•  Hadoop is launched inside cgroup containers

•  Myriad manages the conversation between the RM and the Mesos master, and between NMs and Mesos slaves


Why Myriad

•  Run many types of compute frameworks side by side
–  Hadoop family, etc. (YARN, Spark, Kafka, Storm)
–  Web-server farms
–  MPP databases (e.g., Vertica)
–  Other services: SOA web services, Jenkins/build farms, cron jobs, shell scripts, Kubernetes, Cassandra, ElasticSearch, etc.
–  Each compute framework is a cluster in itself

•  Need to break up a physical cluster into many virtual clusters
–  Using Docker (containers) for good isolation
–  But most schedulers can only manage individual nodes inside a cluster

•  Move resources between virtual clusters on demand


Utilize Excess Capacity for Analytics

[Chart: utilization over time for the DC server farm vs. Hadoop analytics, highlighting long-lived excess-capacity situations]

•  “Scale up” Hadoop during long periods of low utilization

•  “Scale down” Hadoop ahead of anticipated high utilization


Myriad Again

•  Mesos creates virtual clusters

•  YARN uses resources provided by Mesos

•  Myriad can ask YARN to release some resources

•  Or give it more

[Diagram: Mesos hosting two YARN clusters and a web-server farm side by side]
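Those grow/shrink requests go through Myriad's REST API. A hedged sketch in plain Java of what a flex-up call could look like – the /api/cluster/flexup path, port 8192, and the JSON payload shape follow the Myriad project's early documentation and should be verified against the version you run:

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class FlexUp {
  public static void main(String[] args) throws Exception {
    // Assumption: Myriad's REST endpoint; verify host/port/path for your install.
    URL url = new URL("http://myriad-host:8192/api/cluster/flexup");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);

    // Ask Myriad to launch one more NodeManager of a named resource profile.
    String body = "{\"instances\": 1, \"profile\": \"small\"}";
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body.getBytes(StandardCharsets.UTF_8));
    }
    System.out.println("flexup -> HTTP " + conn.getResponseCode());
  }
}
```

A matching flexdown request releases NodeManager capacity the same way.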


Myriad Services Architecture

[Diagram: the Myriad scheduler sits alongside the YARN scheduler (fairshare) inside the ResourceManager and receives offers from the Mesos master; accepted offers launch tasks on Mesos slaves, where a Myriad executor starts the NodeManager; applications are submitted to the RM, containers are launched via NodeManager heartbeats, task status flows back to Mesos, and node capacities are tracked as Map<Node, Capacity>]
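The executor box in that diagram is a standard Mesos executor whose job is to start the NodeManager. A minimal sketch of that idea using the Mesos Java bindings – the yarn nodemanager command and the simplified status handling are illustrative, not Myriad's actual implementation:

```java
import org.apache.mesos.Executor;
import org.apache.mesos.ExecutorDriver;
import org.apache.mesos.MesosExecutorDriver;
import org.apache.mesos.Protos.*;

public class NodeManagerExecutor implements Executor {

  @Override
  public void launchTask(ExecutorDriver driver, TaskInfo task) {
    try {
      // Illustrative: start the NodeManager as a child of the executor so it
      // lives inside the cgroup/container Mesos set up for this task.
      // (A production executor would run this off the callback thread.)
      Process nm = new ProcessBuilder("yarn", "nodemanager").inheritIO().start();
      driver.sendStatusUpdate(TaskStatus.newBuilder()
          .setTaskId(task.getTaskId())
          .setState(TaskState.TASK_RUNNING)
          .build());
      int exit = nm.waitFor();
      driver.sendStatusUpdate(TaskStatus.newBuilder()
          .setTaskId(task.getTaskId())
          .setState(exit == 0 ? TaskState.TASK_FINISHED : TaskState.TASK_FAILED)
          .build());
    } catch (Exception e) {
      driver.sendStatusUpdate(TaskStatus.newBuilder()
          .setTaskId(task.getTaskId())
          .setState(TaskState.TASK_FAILED)
          .build());
    }
  }

  // Remaining Executor callbacks left as no-ops for brevity.
  public void registered(ExecutorDriver d, ExecutorInfo e, FrameworkInfo f, SlaveInfo s) {}
  public void reregistered(ExecutorDriver d, SlaveInfo s) {}
  public void disconnected(ExecutorDriver d) {}
  public void killTask(ExecutorDriver d, TaskID id) {}
  public void frameworkMessage(ExecutorDriver d, byte[] data) {}
  public void shutdown(ExecutorDriver d) {}
  public void error(ExecutorDriver d, String message) {}

  public static void main(String[] args) {
    new MesosExecutorDriver(new NodeManagerExecutor()).run();
  }
}
```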


How It Works

[Diagram: through Myriad's REST API, the Mesos scheduler (framework + master) directs a Mesos slave to launch a NodeManager task sized at 2.5 CPU / 2.5 GB; the NodeManager then advertises 2 CPU / 2 GB to the YARN ResourceManager, leaving headroom for the NodeManager process itself]
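The “advertise resources” step is plain YARN configuration: the NodeManager reports whatever capacity its config names, which is exactly the knob Myriad turns. A small sketch with the diagram's illustrative values:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AdvertisedCapacity {
  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    // The NM advertises configured capacity, not what the machine physically
    // has: here 2 GB / 2 vcores, even though the Mesos task was sized 2.5/2.5.
    conf.setInt(YarnConfiguration.NM_PMEM_MB, 2048); // yarn.nodemanager.resource.memory-mb
    conf.setInt(YarnConfiguration.NM_VCORES, 2);     // yarn.nodemanager.resource.cpu-vcores
    System.out.println("advertised: " + conf.get(YarnConfiguration.NM_PMEM_MB)
        + " MB, " + conf.get(YarnConfiguration.NM_VCORES) + " vcores");
  }
}
```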


How It Works (continued)

[Diagram: the YARN ResourceManager launches containers C1 and C2 through that NodeManager, all inside the Mesos slave on the node]


Use Case – Web Traffic Spikes

[Diagram: Node1 and Node2 each run a Mesos slave with a NodeManager sized at 8 CPU / 8 GB; when web traffic spikes, the Mesos scheduler resizes each NodeManager down to 6 CPU / 6 GB via the REST API, freeing 2 CPU / 2 GB per node for WebService instances]


Use Case – Web Traffic Spike Over

[Diagram: once the spike subsides, the NodeManagers are resized back toward 8 CPU / 8 GB and the 2 CPU / 2 GB per node used by the WebService instances is returned to YARN]


Myriad Demo

At MapR booth 1009


Maintaining Low Latency while Maximizing Throughput on a Single Cluster


Batch and Real-time Analytics Together

[Diagram: a compute cluster in which every node runs both a YARN NodeManager (NM) and a Drill DrillBit, coordinated by a cluster/DC scheduler]


Sharing Resources between Batch and Real-Time

•  Real-time services' resource usage patterns can be unpredictable

–  Analysts use services during the day

–  Analysts on the other side of the globe work during the night

–  There are steady states, spikes, and dips in the workloads

•  Batch resource usage is more or less predictable

–  The same jobs run over and over, with occasional spikes and dips


Real-time Services Resource Utilization/Provisioning

•  Aggressive resource provisioning: < 10% utilization

•  Moderate resource provisioning: < 60% utilization

•  Conservative resource provisioning: > 80% utilization


What Can We Do To Provision Conservatively?

[Diagram: a compute cluster of NM + DrillBit nodes under the cluster/DC ResourceManager; a Drill Service Watcher monitors Drill performance – on a latency increase it accepts offers (Mesos) or requests additional containers (YARN, held as dummy containers C1–C3, preempting if needed), and on a latency decrease it releases resources back]
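A hypothetical sketch of that watcher's feedback loop in Java – DrillMetrics, ResourceBroker, and the thresholds are invented names for illustration, not Drill or Myriad APIs:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DrillServiceWatcher {

  /** Hypothetical source of Drill query-latency metrics. */
  interface DrillMetrics {
    double p99LatencyMillis();
  }

  /** Hypothetical hook into the resource manager (Mesos offers / YARN containers). */
  interface ResourceBroker {
    void acquireContainer();   // e.g., accept an offer or request a (dummy) container
    void releaseContainer();   // hand resources back for batch workloads
  }

  // Illustrative latency thresholds; real values would come from SLAs.
  private static final double HIGH_WATERMARK_MS = 500;
  private static final double LOW_WATERMARK_MS = 100;

  public static void watch(DrillMetrics metrics, ResourceBroker broker) {
    ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    timer.scheduleAtFixedRate(() -> {
      double latency = metrics.p99LatencyMillis();
      if (latency > HIGH_WATERMARK_MS) {
        broker.acquireContainer();   // latency rising: grab more resources for Drill
      } else if (latency < LOW_WATERMARK_MS) {
        broker.releaseContainer();   // latency comfortably low: give resources back
      }                              // in between: hold steady
    }, 0, 10, TimeUnit.SECONDS);
  }
}
```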


SHOWTIME


Q & A

@mapr maprtech

yfeldman@mapr.com

Engage with us!

MapR

maprtech

mapr-technologies
