Meson: Building a Machine Learning Orchestration Framework on Mesos

Antony Arokiasamy | Kedar Sadekar | Personalization Infrastructure




Help members find content to watch and enjoy to maximize member satisfaction and retention

Everything is a Recommendation

Recommendations are driven by Machine Learning

[Figure labels: Ranking, Rows]

Machine Learning Pipeline

User Selection → Feature Generation → Model Training → Model Validation → Publish Model
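
As a rough illustration of how such a pipeline can be expressed as a dependency graph, the sketch below orders the five stages with Python's standard `graphlib` module. The stage names come from the slide; the linear dependency order and the `run_stage` helper are assumptions for illustration, not Meson's actual API.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on.
# The linear ordering is an assumption; the slide only names the stages.
PIPELINE = {
    "user_selection": set(),
    "feature_generation": {"user_selection"},
    "model_training": {"feature_generation"},
    "model_validation": {"model_training"},
    "publish_model": {"model_validation"},
}


def run_stage(name: str) -> str:
    """Hypothetical placeholder: a real system would hand the stage to an executor."""
    return f"ran {name}"


def run_pipeline(dag: dict[str, set[str]]) -> list[str]:
    """Run every stage after all of its dependencies and return the execution order."""
    order = list(TopologicalSorter(dag).static_order())
    for stage in order:
        run_stage(stage)
    return order


execution_order = run_pipeline(PIPELINE)
```

Keeping the graph as plain data separates the description of the workflow from whatever system ends up executing each stage.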

Machine Learning Pipeline Challenges

• Innovation
  • Heterogeneous Environments

• Spark
  • Native Support

• Separate Orchestration and Execution

• Multi Tenancy

• ML Constructs
  • Parameter Sweep – 30k Dockers
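
A parameter sweep expands a grid of hyperparameter values into one independent task per combination, which is how a single sweep can fan out into thousands of containers. A minimal sketch, assuming hypothetical parameter names and a hypothetical image name; it only builds task descriptions and launches nothing:

```python
from itertools import product


def expand_sweep(grid: dict[str, list]) -> list[dict]:
    """Return one parameter dict per combination of values in the grid."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


# Hypothetical hyperparameters, for illustration only.
GRID = {
    "learning_rate": [0.01, 0.1, 1.0],
    "regularization": [0.0, 0.5],
    "num_factors": [10, 50, 100, 200],
}

# Each combination becomes one containerized task description.
tasks = [
    {"image": "example/trainer:latest", "params": params}  # hypothetical image name
    for params in expand_sweep(GRID)
]
```

The task count is the product of the list lengths (here 3 × 2 × 4), so adding one more dimension multiplies the number of containers.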

Meson Workflow System in 30 seconds

• General Purpose Workflow Orchestration and Scheduling framework
  • Delegates execution to resource managers like Mesos

• Optimized for Machine Learning Pipelines and Visualization

• Check out the Blog
  • bit.ly/mesonws or techblog.netflix.com

Meson Architecture

Mesos Usage

• Executors
  • Custom Executor
  • Executor Caching
  • Executor Cleanup

• Framework Messages

• Resource Attributes
  • Multi Tenancy
  • Cluster Management

Custom Executors

• Reuse Executor Process
  • e.g. Spark
  • Executor Id = <unique id>

• Two Way Communication

Executor Caching

• Executor Id = hash(<something unique for the class of executors>)
  • E.g. Executor Id = hash(classpath)

• Match with Executor Id in Offer

[Figure labels: offers, accept]
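
The matching logic above can be sketched as follows. A Mesos offer lists the IDs of executors already running on that agent, so a scheduler that derives the executor ID deterministically from the executor's class can prefer offers where a matching executor is already warm. The `Offer` dataclass, the SHA-1 choice, and `pick_offer` are simplified stand-ins for illustration, not the Mesos API or Meson's code.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


def executor_id_for(classpath: str) -> str:
    """Derive a stable executor ID from the classpath (the hash choice is an assumption)."""
    return hashlib.sha1(classpath.encode("utf-8")).hexdigest()[:16]


@dataclass
class Offer:
    """Simplified stand-in for a Mesos resource offer."""
    agent: str
    cpus: float
    executor_ids: list[str] = field(default_factory=list)  # executors already on the agent


def pick_offer(offers: list[Offer], classpath: str, cpus_needed: float) -> Optional[Offer]:
    """Prefer an offer whose agent already runs a matching executor; else take any that fits."""
    wanted = executor_id_for(classpath)
    fitting = [o for o in offers if o.cpus >= cpus_needed]
    for offer in fitting:
        if wanted in offer.executor_ids:
            return offer  # cache hit: reuse the running executor
    return fitting[0] if fitting else None  # cache miss: a new executor would start
```

Because the ID is a pure function of the classpath, two tasks that need the same class of executor compute the same ID without any shared state in the scheduler.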

Executor Cleanup

• Expiration

• Explicitly keep track of Executors
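
One way to combine the two ideas above is a registry that records when each executor was last used and reports those that have been idle longer than a time-to-live. This is a sketch under assumptions: the `ExecutorRegistry` class and the TTL value are hypothetical, and actually shutting an executor down is left to the caller.

```python
import time
from typing import Callable


class ExecutorRegistry:
    """Tracks known executors and the last time each one was used."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._last_used: dict[str, float] = {}

    def touch(self, executor_id: str) -> None:
        """Record that an executor was just launched or reused."""
        self._last_used[executor_id] = self._clock()

    def expire(self) -> list[str]:
        """Forget and return the executors that have been idle longer than the TTL."""
        now = self._clock()
        expired = [eid for eid, ts in self._last_used.items() if now - ts > self.ttl_seconds]
        for eid in expired:
            del self._last_used[eid]
        return expired
```

A scheduler could call `touch` whenever it assigns a task to an executor and call `expire` periodically, shutting down whatever IDs come back.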

Framework Messages

Multi Tenancy

• Resource Attributes
  • spark.mesos.constraints
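
Spark on Mesos accepts `spark.mesos.constraints` as semicolon-separated `attribute:value` pairs and only takes offers from agents whose attributes match. The sketch below mimics that filtering for text attributes only, using equality. The `tenant` and `zone` attribute names are hypothetical, and Spark's real matcher also handles scalar, range, and set attributes.

```python
def parse_constraints(spec: str) -> dict[str, str]:
    """Parse 'key:value;key:value' into a dict; a bare key maps to an empty string."""
    constraints = {}
    for part in filter(None, spec.split(";")):
        key, _, value = part.partition(":")
        constraints[key.strip()] = value.strip()
    return constraints


def offer_matches(attributes: dict[str, str], constraints: dict[str, str]) -> bool:
    """True if every constrained attribute is present and, when a value is given, equal."""
    for key, wanted in constraints.items():
        if key not in attributes:
            return False
        if wanted and attributes[key] != wanted:
            return False
    return True


# Hypothetical attribute names, for illustration only.
constraints = parse_constraints("tenant:personalization;zone:us-east-1a")
```

Tagging agents with a tenant attribute and giving each tenant's jobs a matching constraint keeps workloads on their own slice of a shared cluster.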

Cluster Management

• Red-Black software updates

• Scale up/Scale down

Mesos Cluster

• 100s of Concurrent Jobs

• 700 Nodes

• 5000 Cores

• 25 TB Memory

• Apps: Meson Workflow System, Spark and Dockers

• A few smaller clusters

What's Next

• Fenzo Scheduler - https://github.com/Netflix/Fenzo
  • Bin Packing, Auto Scaling, Host Attributes/Constraints, Groups, etc.

• Cook Scheduler - https://github.com/twosigma/Cook
  • Multi-tenant Spark Scheduler

• Open Source Meson Workflow System

Antony Arokiasamy
@aasamy
/aasamy

Kedar Sadekar
@kedar_sadekar
/kedar-sadekar