Hadoop YARN - Under the Hoodarchive.apachecon.com/eu2012/presentations/06... · Node Manager...

Hadoop YARN - Under the Hood

Sharad Agarwal sharad@apache.org

About me

Recap: Hadoop 1.0 Map-Reduce JobTracker

Manages cluster resources and job scheduling

TaskTracker Per-node agent Manage tasks

YARN Architecture

ResourceManager

MapReduce Status

Job Submission

Client

NodeManager

Container

NodeManager

App Mstr

Node Status

Resource Request

ResourceManager

Client

MapReduce Status

Job Submission

Client

NodeManager

App Mstr Container

NodeManager

App Mstr

Node Status

Resource Request

ResourceManager

Client

MapReduce Status

Job Submission

Client

NodeManager

Container Container

NodeManager

App Mstr Container

NodeManager

Container App Mstr

Node Status

Resource Request

What the new Architecture gets us?

Scale Compute Platform

Scale for a compute platform •  Application Size

•  No of sub-tasks •  Application level state

•  eg. Counters

•  Number of Concurrent Tasks in a single cluster

Application size scaling in Hadoop 1.0

JTHeap!TotalTasks,Nodes, JobCounters

Application size scaling in YARN is by

Architecture

Why a limitation on cluster size ?

Cluster Utilization

Cluster Size

Hadoop 1.0

JobTracker JIP TIP Scheduler

Heartbeat Request

Heartbeat Response

•  Synchronous Heartbeat Processing

•  JobTracker Global Lock

JT transaction rate limit: 200 heartbeats/sec

Highly Concurrent Systems •  scales much better (if done

right) •  makes effective use of multi-

core hardware •  managing eventual

consistency of states hard •  need for a systemic framework

to manage this

Event Queue

Component A

Component B

Component N

Event Model

Event Dispatcher

•  Mutations only via events •  Components only expose Read APIs •  Use Re-entrant locks •  Components follow clear lifecycle

Heartbeat Listener Event Q

Heartbeat Request

Heartbeat Response

Asynchronous Heartbeat Handling

NodeManager Meta

Get commands

YARN: Better utilization bigger cluster

Cluster Utilization

Cluster Size

Hadoop 1.0

State Management

State management in JT Very Hard to Maintain Debugging even harder

Complex State Management •  Light weight State Machines Library •  Declarative way of specifying the state

Transitions •  Invalid transitions are handled automatically •  Fits nicely with the event model •  Debug-ability is drastically improved.

Lineage of object states can easily be determined

•  Handy while recovering the state

Declarative State Machine

High Availability

MR Application Master Recovery •  Hadoop 1.0

•  Application need to resubmit Job •  All completed tasks are lost

•  YARN •  Application execution state check pointed in

HDFS •  Rebuilds the state by replaying the events

Resource Manager HA •  Based on Zookeeper •  Coming Soon

•  YARN-128

YARN: New Possibilities

•  Open MPI - MR-2911 •  Master-Worker – MR-3315 •  Distributed Shell •  Graph processing – Giraph-13 •  BSP – HAMA-431 •  CEP

•  S4 – S4-25 •  Storm -

https://github.com/nathanmarz/storm/issues/74 •  Iterative processing - Spark

https://github.com/mesos/spark-yarn/

YARN - a solid foundation to take Hadoop to next level

Scale, High Availability, Utilization And

Alternate Compute Paradigms

Thank You

@twitter: sharad_ag

Hadoop YARN - Under the Hoodarchive.apachecon.com/eu2012/presentations/06... · Node Manager...

Documents

시스코코리아 최우형이사 · 2018. 6. 21. · [ DC Summit 2018 ] Cisco Container Platform 기술아키텍쳐 K8s Master K8s Master n K8s Node K8s Node K8s Node Persistent

OpenShift Container Platform 4 - access.redhat.com...unavailable without affecting the applications running on the cluster. 1.3. MASTER NODE SIZING The master node resource requirements

Intoduction to MSTR

rajavel mstr linkedin

Continuous Kubernetes Security - Alfresco...Continuous Delivery Hardening the Kubernetes Control Plane Nodes Master Node 3 OS Container Runtime Kubelet Networking Node 2 OS Container

Oracle application container cloud back end integration using node final

NSX Container Plug-in for OpenShift - Installation and … · 2019-01-18 · Install NSX Node Agent 24 Configmap for ncp.ini in nsx-node-agent-ds.yml 25 Install NSX-T Container Plug-in

Mstr. Carmita Coronado*

Red Hat OpenShift Container Platform · Red Hat OpenShift Container Platform ... sosreport -k docker.all=on -k docker.logs=on … on both masters and nodes Master, Node & Pod Logs:

ContainerCon 2016 - Jimenez, Arya @ijimene isabel ... · Process Migration Move a running process from one node to another Container Migration Move a running container from one node

Containerplattform - Lego für DevOps · 3Master(odermehr) Synchronisationder Konguration Node-Ausfälleautomatisch abfangen Container-Ausfälledurch Healthchecksabfangen API-Server

Cat MSTR SET-UP · Title: Cat MSTR SET-UP Author: Michael Swan Created Date: 5/25/2006 12:26:18 PM

Mstr Notes & Faq's-1

MSTR Project Design Guide

cos math mstr 22 - Virginia Tech

THE STORAGE LAYER IN THE NVIDIA DGX SUPERPOD€¦ · Deep Learning Model: Model hyperparameters tuned for multi-node scaling Multi-node launcher scripts Deep Learning Container: Deep

MSTR Connectivity

SEPTEMBER FSM.HI-RES MSTR

Maximizing Network Throughput for Container Based Storage · Maximizing Network Throughput for Container Based Storage ... kube-system calico -node-thwmw 2/2 Running 0 41m 172.17.4.101

A Nodal Network Can Be Constructed Graphically Using …A Nodal Network Can Be Constructed Graphically Using The ICPR “Network Builder”. Link Container Node Container Basins Assigned