49
CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph http://inst.eecs.berkeley.edu/~cs162

CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

Embed Size (px)

Citation preview

Page 1: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

CS162Operating Systems andSystems Programming

Lecture 24

Capstone: Cloud Computing

April 30, 2014

Anthony D. Joseph

http://inst.eecs.berkeley.edu/~cs162

Page 2: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.24/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Goals for Today

• Big data

• Cloud Computing programming paradigms

• Cloud Computing OS

Note: Some slides and/or pictures in the following areadapted from slides Ali Ghodsi.

Page 3: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.34/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Background of Cloud Computing

• 1980’s and 1990’s: 52% growth in performance per year!

• 2002: The thermal wall– Speed (frequency) peaks,

but transistors keepshrinking

• 2000’s: Multicore revolution– 15-20 years later than

predicted, we have hit the performance wall

• 2010’s: Rise of Big Data

Page 4: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.44/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Sources Driving Big DataIt’s All Happening On-

line

Every:ClickAd impressionBilling eventFast Forward, pause,…Friend RequestTransactionNetwork messageFault…

User Generated (Web & Mobile)

…..

Internet of Things / M2M Scientific Computing

Page 5: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.54/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Data Deluge

• Billions of users connected through the net– WWW, FB, twitter, cell phones, …

– 80% of the data on FB was produced last year

• Storage getting cheaper– Store more data!

Page 6: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.64/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Data Grows Faster than Moore’s Law

Projected Growth

Incr

ea

se

ove

r 2

010

Page 7: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.74/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Solving the Impedance Mismatch

• Computers not getting faster, and we are drowning in data

– How to resolve the dilemma?

• Solution adopted by web-scale companies

– Go massively distributed and parallel

Page 8: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.84/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Enter the World of Distributed Systems

• Distributed Systems/Computing– Loosely coupled set of computers, communicating through

message passing, solving a common goal

– Tools: Msg passing, Distributed shared memory, RPC

• Distributed computing is challenging– Dealing with partial failures (examples?)

– Dealing with asynchrony (examples?)

– Dealing with scale (examples?)

– Dealing with consistency (examples?)

• Distributed Computing versus Parallel Computing?– distributed computing=parallel computing + partial failures

Page 9: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.94/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

The Datacenter is the new Computer

• “The datacenter as a computer” still in its infancy– Special purpose clusters, e.g., Hadoop cluster

– Built from less reliable components

– Highly variable performance

– Complex concepts are hard to program (low-level primitives)

=?

Page 10: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.104/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Datacenter/Cloud Computing OS

• If the datacenter/cloud is the new computer– What is its Operating System?

– Note that we are not talking about a host OS

• Could be equivalent in benefit as the LAMP stack was to the .com boom – every startup secretly implementing the same functionality!

• Open source stack for a Web 2.0 company: – Linux OS

– Apache web server

– MySQL, MariaDB or MongoDB DBMS

– PHP, Perl, or Python languages for dynamic web pages

Page 11: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.114/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Classical Operating Systems

• Data sharing– Inter-Process Communication, RPC, files, pipes, …

• Programming Abstractions– Libraries (libc), system calls, …

• Multiplexing of resources– Scheduling, virtual memory, file allocation/protection, …

Page 12: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.124/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Datacenter/Cloud Operating System

• Data sharing– Google File System, key/value stores

– Apache project: Hadoop Distributed File System

• Programming Abstractions– Google MapReduce

– Apache projects: Hadoop, Pig, Hive, Spark

• Multiplexing of resources– Apache projects: Mesos, YARN (MapReduce v2),

ZooKeeper, BookKeeper, …

Page 13: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.134/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Google Cloud Infrastructure

• Google File System (GFS), 2003– Distributed File System for entire

cluster– Single namespace

• Google MapReduce (MR), 2004– Runs queries/jobs on data– Manages work distribution & fault-

tolerance– Colocated with file system

• Apache open source versions: Hadoop DFS and Hadoop MR

Page 14: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.144/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

GFS/HDFS Insights

• Petabyte storage– Files split into large blocks (128 MB) and replicated across

several nodes

– Big blocks allow high throughput sequential reads/writes

• Data striped on hundreds/thousands of servers– Scan 100 TB on 1 node @ 50 MB/s = 24 days

– Scan on 1000-node cluster = 35 minutes

Page 15: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.154/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

GFS/HDFS Insights (2)

• Failures will be the norm– Mean time between failures for 1 node = 3 years

– Mean time between failures for 1000 nodes = 1 day

• Use commodity hardware– Failures are the norm anyway, buy cheaper hardware

• No complicated consistency models– Single writer, append-only data

Page 16: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.164/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

MapReduce Insights

• Restricted key-value model– Same fine-grained operation (Map & Reduce) repeated

on big data

– Operations must be deterministic

– Operations must be idempotent/no side effects

– Only communication is through the shuffle

– Operation (Map & Reduce) output saved (on disk)

Page 17: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.174/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

What is MapReduce Used For?

• At Google:– Index building for Google Search– Article clustering for Google News– Statistical machine translation

• At Yahoo!:– Index building for Yahoo! Search– Spam detection for Yahoo! Mail

• At Facebook:– Data mining– Ad optimization– Spam detection

Page 18: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.184/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

MapReduce Pros• Distribution is completely transparent

– Not a single line of distributed programming (ease, correctness)

• Automatic fault-tolerance– Determinism enables running failed tasks somewhere else again

– Saved intermediate data enables just re-running failed reducers

• Automatic scaling– As operations as side-effect free, they can be distributed to any

number of machines dynamically

• Automatic load-balancing– Move tasks and speculatively execute duplicate copies of slow

tasks (stragglers)

Page 19: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.194/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

MapReduce Cons

• Restricted programming model– Not always natural to express problems in this model

– Low-level coding necessary

– Little support for iterative jobs (lots of disk access)

– High-latency (batch processing)

• Addressed by follow-up research and Apache projects– Pig and Hive for high-level coding

– Spark for iterative and low-latency jobs

Page 20: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.204/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Administrivia• Project 4 code due next week Thu April 8 by 11:59pm

• MIDTERM #2 results TBA– Exam and solutions posted

• RRR week office hours: E-mail for an appointment

Page 21: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.214/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

2min Break

Page 22: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.224/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Apache Pig

• High-level language:– Expresses sequences of MapReduce jobs

– Provides relational (SQL) operators(JOIN, GROUP BY, etc)

– Easy to plug in Java functions

• Started at Yahoo! Research– Runs about 50% of Yahoo!’s jobs

• https://pig.apache.org/

• Similar to Google’s (internal) Sawzall project

Page 23: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.234/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Example Problem

Given user data in one file, and website data in another, find the top 5 most visited pages by users aged 18-25

Load Users Load Pages

Filter by age

Join on name

Group on url

Count clicks

Order by clicks

Take top 5Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt

Page 24: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.244/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

In MapReduce

Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt

Page 25: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.254/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

In Pig Latin

Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt

Users = load ‘users’ as (name, age);Filtered = filter Users by age >= 18 and age <= 25; Pages = load ‘pages’ as (user, url);Joined = join Filtered by name, Pages by user;Grouped = group Joined by url;Summed = foreach Grouped generate group, count(Joined) as clicks;Sorted = order Summed by clicks desc;Top5 = limit Sorted 5;

store Top5 into ‘top5sites’;

Page 26: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.264/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Translation to MapReduce

Notice how naturally the components of the job translate into Pig Latin

Users = load …Filtered = filter …

Pages = load …Joined = join …Grouped = group …Summed = … count()…Sorted = order …Top5 = limit …

Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt

Load Users Load Pages

Filter by age

Join on name

Group on url

Count clicks

Order by clicks

Take top 5

Job 1

Job 2

Job 3

Page 27: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.274/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Apache Hive• Relational database built on Hadoop

– Maintains table schemas

– SQL-like query language (which can also call Hadoop Streaming scripts)

– Supports table partitioning, complex data types, sampling, some query optimization

• Developed at Facebook– Used for many Facebook jobs

• Now used by many others– Netfix, Amazon, …

• http://hive.apache.org/

Page 28: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.284/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Apache Spark Motivation

Complex jobs, interactive queries and online processing all need one thing that MR lacks:

Efficient primitives for data sharing

Iterative job

Query 1Query 1

Query 2Query 2

Query 3Query 3

Interactive mining

Stream processing

Page 29: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.294/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Spark Motivation

Complex jobs, interactive queries and online processing all need one thing that MR lacks:

Efficient primitives for data sharing

Iterative job

Query 1Query 1

Query 2Query 2

Query 3Query 3

Interactive mining

Stream processing

Problem: in MR, the only way to share data across jobs is using stable storage

(e.g. file system) slow!

Page 30: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.304/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Examples

iter. 1iter. 1 iter. 2iter. 2 . . .

Input

HDFSread

HDFSwrite

HDFSread

HDFSwrite

Input

query 1query 1

query 2query 2

query 3query 3

result 1

result 2

result 3

. . .

HDFSread

Opportunity: DRAM is getting cheaper use main memory for intermediate

results instead of disks

Page 31: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.314/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

iter. 1iter. 1 iter. 2iter. 2 . . .

Input

Goal: In-Memory Data Sharing

Distributedmemory

Input

query 1query 1

query 2query 2

query 3query 3

. . .

one-timeprocessing

10-100× faster than network and disk

Page 32: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.324/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Solution: Resilient Distributed Datasets (RDDs)

• Partitioned collections of records that can be stored in memory across the cluster

• Manipulated through a diverse set of transformations (map, filter, join, etc)

• Fault recovery without costly replication– Remember the series of transformations that built an

RDD (its lineage) to recompute lost data

• http://spark.apache.org/

Page 33: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.334/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Page 34: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.344/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Security: Old Meets New

• Heard of vishing?

• A Voice over IP phishing attack

• Scenario– Victim receives a text message:

– Victim calls number

– Interactive Voice Response system prompts user to enter debit card number and PIN

– Criminals produce duplicate cards or use online and siphon $300/card/day

• April 2014: At least 2,500 cards stolen (up to $75K/day)

Urgent message from your bank! We’ve deactivated your debit card due to detected fraudulent activity. Please call 1-800-PHISHME to reactivate your card

Page 35: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.354/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

How Hackers Perform Vishing

• Compromise a vulnerable server (anywhere in the world)– Install Interactive Voice Response (IVR) software

• Compromise a vulnerable VoIP server– Hijack the Direct Inward Dialing (DID) function to assign a

phone number to their IVR system

• Use free text-to-speech tools to generate recordings– Load onto IVR system

• Send spam texts using email-to-SMS gateways

• VoIP server redirects incoming calls to IVR system– IVR system prompts callers for card data and PIN

– Data saved locally or in a drop site

• Data encoded onto new cards for ATM or purchasing use– Also used for online/phone “card not present” transactions

http://blog.phishlabs.com/

Page 36: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.364/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

2min Break

Page 37: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.374/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

• Rapid innovation in datacenter computing frameworks

• No single framework optimal for all applications

• Want to run multiple frameworks in a single datacenter– …to maximize utilization

– …to share data between frameworks

Pig

Datacenter Scheduling Problem

Dryad

Pregel

Percolator

Ciel

Page 38: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.384/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Hadoop

Hadoop

PregelPregel

MPIMPIShared cluster

Today: static partitioning

Dynamic sharing

Where We Want to Go

Page 39: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.394/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Solution: Apache Mesos

MesosMesos

Node

Node

Node

Node

Node

Node

Node

Node

HadoopHadoop PregelPregel…

Node

Node

Node

Node

HadoopHadoop

Node

Node

Node

Node

PregelPregel

• Mesos is a common resource sharing layer over which diverse frameworks can run

• Run multiple instances of the same framework– Isolate production and experimental jobs

– Run multiple versions of the framework concurrently

• Build specialized frameworks targeting particular problem domains

– Better performance than general-purpose abstractions

Page 40: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.404/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Mesos Goals

• High utilization of resources

• Support diverse frameworks (current & future)

• Scalability to 10,000’s of nodes

• Reliability in face of failures

http://mesos.apache.org/

Resulting design: Small microkernel-like core that pushes scheduling

logic to frameworks

Page 41: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.414/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Mesos Design Elements

•Fine-grained sharing:– Allocation at the level of tasks within a job

– Improves utilization, latency, and data locality

•Resource offers:– Simple, scalable application-controlled scheduling

mechanism

Page 42: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.424/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Element 1: Fine-Grained Sharing

Framework 1Framework 1

Framework 2Framework 2

Framework 3Framework 3

Coarse-Grained Sharing (HPC): Fine-Grained Sharing (Mesos):

+ Improved utilization, responsiveness, data locality

Storage System (e.g. HDFS) Storage System (e.g. HDFS)

Fw. 1Fw. 1

Fw. 1Fw. 1Fw. 3Fw. 3

Fw. 3Fw. 3 Fw. 2Fw. 2Fw. 2Fw. 2

Fw. 2Fw. 2

Fw. 1Fw. 1

Fw. 3Fw. 3

Fw. 2Fw. 2Fw. 3Fw. 3

Fw. 1Fw. 1

Fw. 1Fw. 1 Fw. 2Fw. 2Fw. 2Fw. 2

Fw. 1Fw. 1

Fw. 3Fw. 3 Fw. 3Fw. 3

Fw. 3Fw. 3

Fw. 2Fw. 2

Fw. 2Fw. 2

Page 43: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.434/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Element 2: Resource Offers•Option: Global scheduler

– Frameworks express needs in a specification language, global scheduler matches them to resources

+ Can make optimal decisions– Complex: language must support all framework needs

– Difficult to scale and to make robust– Future frameworks may have unanticipated needs

Page 44: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.444/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Element 2: Resource Offers• Mesos: Resource offers

– Offer available resources to frameworks, let them pick which resources to use and which tasks to launch

+ Keeps Mesos simple, lets it support future frameworks

- Decentralized decisions might not be optimal

Page 45: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.454/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Mesos Architecture

MPI jobMPI job

MPI scheduler

MPI scheduler

Hadoop jobHadoop job

Hadoop schedulerHadoop

scheduler

Allocation module

Mesosmaster

Mesos slaveMesos slave

MPI executor

Mesos slaveMesos slave

MPI executor

tasktask

Resource offer

Resource offer

Pick framework to offer resources toPick framework to offer resources to

Page 46: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.464/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Mesos Architecture

MPI jobMPI job

MPI scheduler

MPI scheduler

Hadoop jobHadoop job

Hadoop schedulerHadoop

scheduler

Allocation module

Mesosmaster

Mesos slaveMesos slave

MPI executor

Mesos slaveMesos slave

MPI executor

tasktask

Resource offer

Resource offer

Pick framework to offer resources toPick framework to offer resources to

Resource offer = list of (node, availableResources)

E.g. { (node1, <2 CPUs, 4 GB>), (node2, <3 CPUs, 2 GB>) }

Resource offer = list of (node, availableResources)

E.g. { (node1, <2 CPUs, 4 GB>), (node2, <3 CPUs, 2 GB>) }

Page 47: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.474/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Mesos Architecture

MPI jobMPI job

MPI scheduler

MPI scheduler

Hadoop jobHadoop job

Hadoop schedulerHadoop

scheduler

Allocation module

Mesosmaster

Mesos slaveMesos slave

MPI executor

Hadoop executorHadoop executor

Mesos slaveMesos slave

MPI executor

tasktask

Pick framework to offer resources toPick framework to offer resources to

taskFramework-

specific scheduling

Framework-specific

scheduling

Resource offer

Resource offer

Launches and isolates executors

Launches and isolates executors

Page 48: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.484/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Deployments

Many 1,000’s of nodes running many production services

Genomics researchers using Hadoop and Spark on Mesos

Spark in use by Yahoo! Research

Spark for analytics

Hadoop and Spark used by machine learning researchers

Page 49: CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 30, 2014 Anthony D. Joseph cs162

24.494/30/2014 Anthony D. Joseph CS162 ©UCB Spring 2014

Summary

• Cloud computing/datacenters are the new computer– Emerging “Datacenter/Cloud Operating System” appearing

• Many pieces of the DC/Cloud OS “LAMP” stack are available today:

– High-throughput filesystems (GFS/Apache HDFS)

– Job frameworks (MapReduce, Apache Hadoop, Apache Spark, Pregel)

– High-level query languages (Apache Pig, Apache Hive)

– Cluster scheduling (Apache Mesos)