
Push jobs: an orchestration building block for private Chef


DESCRIPTION

Push Jobs is a new feature in Opscode Private Chef that will allow a user to run commands across hundreds of Chef-managed servers. Push Jobs leverages Erlang/OTP and ZeroMQ to provide scalable and fault-tolerant execution. In this talk I’ll cover the general motivation behind the design and give an architectural overview of the system, including details of how we used Erlang and ZeroMQ to build a robust, scalable system. I’ll also demo the push jobs feature in action, covering the push jobs server, execution client, and knife command line interface.


Page 1: Push jobs: an orchestration building block for private Chef
Page 2: Push jobs: an orchestration building block for private Chef

The Opscode Push Jobs Service
Mark Anderson
April 28, 2013

Page 3: Push jobs: an orchestration building block for private Chef

Push jobs in a command line

• knife job start --quorum 90% 'chef-client' --search 'role:webapp'
  • Finds all nodes with role webapp
  • Submits a job with a quorum of 90% to the pushy server
  • Checks quorum
  • Starts job on available nodes
  • Gathers success and failures
• And will do this for ten nodes... or a thousand
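The quorum check in the steps above can be sketched in a few lines of Python. This is only an illustration, not the pushy server's code: the function name `quorum_met` and the choice to round the required node count up are assumptions.

```python
import math


def quorum_met(available: int, targeted: int, quorum_percent: float) -> bool:
    """Return True if enough of the targeted nodes are available.

    Assumption: the required count is rounded up, so a 90% quorum
    over ten nodes needs nine of them.
    """
    if targeted == 0:
        return False  # nothing matched the search, so nothing to run
    required = math.ceil(targeted * quorum_percent / 100)
    return available >= required
```

With ten targeted nodes and a 90% quorum, nine available nodes are enough and eight are not; the same check applies unchanged to a thousand nodes.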

Page 4: Push jobs: an orchestration building block for private Chef

Push jobs: Why not use X?

• We wanted to build a tool that could be deeply integrated into Chef.
  • Integrated with the authentication model
    • Clients use their client key to authenticate to the server
    • Users use their keys to send commands to the API
  • Integrated with the authorization model
    • Groups control access now
    • Eventually there will be fine-grained ACLs
  • Integrated with search and other Chef features
• Scalability

Page 5: Push jobs: an orchestration building block for private Chef

Push jobs Server

• Erlang service
• Extends the Chef REST API
  • Job creation and tracking
  • Push client configuration
• Controls the clients via ZeroMQ
  • Heartbeating to track node availability
  • Command execution

Page 6: Push jobs: an orchestration building block for private Chef

Push jobs Client

• Simple Ruby client
  • Receives heartbeats from the server
  • Sends back heartbeats to the server
  • Executes commands
• Configuration requirements are minimal
• The client initiates all connections to the server
  • Most configuration is via Chef API call using the client key
  • Opens ZeroMQ connections to server for all other communication

Page 7: Push jobs: an orchestration building block for private Chef

Push jobs The lifecycle of a job

[Diagram: job lifecycle between Server and Client: Job Accepted → Send Command → Clients ACK → Wait for Quorum → Start Exec → Clients Exec → Collect Results]
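The lifecycle steps on this slide can be read as a small state machine. The Python sketch below is one hypothetical way to model them; the state names, the `Job` class, and the rule that quorum is a minimum ACK count are assumptions for illustration, not the server's actual Erlang design.

```python
from enum import Enum, auto


class JobState(Enum):
    ACCEPTED = auto()       # job accepted by the server
    VOTING = auto()         # command sent, waiting for client ACKs
    RUNNING = auto()        # quorum reached, clients executing
    COMPLETE = auto()       # all results collected
    QUORUM_FAILED = auto()  # too few clients acknowledged


class Job:
    def __init__(self, command, nodes, quorum):
        self.command = command
        self.nodes = set(nodes)
        self.quorum = quorum  # assumed: minimum number of ACKs needed
        self.acked = set()
        self.results = {}
        self.state = JobState.ACCEPTED

    def send_command(self):
        self.state = JobState.VOTING

    def ack(self, node):
        # Only targeted nodes may acknowledge, and only while voting.
        if self.state is JobState.VOTING and node in self.nodes:
            self.acked.add(node)

    def start_exec(self):
        if self.state is not JobState.VOTING:
            return
        if len(self.acked) >= self.quorum:
            self.state = JobState.RUNNING
        else:
            self.state = JobState.QUORUM_FAILED

    def report(self, node, succeeded):
        # Collect one result per acknowledging node.
        if self.state is JobState.RUNNING and node in self.acked:
            self.results[node] = succeeded
            if len(self.results) == len(self.acked):
                self.state = JobState.COMPLETE
```

A job over three nodes with a quorum of two would move from ACCEPTED to VOTING, then to RUNNING once two nodes have acknowledged, and to COMPLETE after both report back.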

Page 8: Push jobs: an orchestration building block for private Chef

Push jobs Knife extension

• All control for pushy jobs is via extensions to the Chef API
• Node status
• Job control
  • start
  • stop
  • status
• Job listing

Page 9: Push jobs: an orchestration building block for private Chef

Pushy Demo

Page 10: Push jobs: an orchestration building block for private Chef

Push Jobs Demo

[Diagram: Chef/Pushy Server and five demo nodes: Chico, Harpo, Groucho, Gummo, Zeppo]

Page 11: Push jobs: an orchestration building block for private Chef

The nitty gritty

Page 12: Push jobs: an orchestration building block for private Chef

Internals: Client-server interaction

• The client initiates all connections to the server
• The client authenticates to the server and receives
  • A session key and TTL
  • ZeroMQ connection information (ports, heartbeat rate, etc.)
• Subscribes via ZeroMQ to server heartbeats (1 to many)
• Connects via ZeroMQ to the server (1-1)
• Sends heartbeats to the server as long as it receives server heartbeats
• Awaits commands from the server
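The rule that a client keeps heartbeating only while it hears the server can be sketched as follows. This Python fragment leaves out the ZeroMQ sockets entirely and takes timestamps as arguments; the class name `HeartbeatGate` and the tolerance of three missed intervals are assumptions, not values from the talk.

```python
class HeartbeatGate:
    """Decides whether a client should keep sending heartbeats."""

    def __init__(self, interval_seconds: float, max_missed: int = 3):
        self.interval_seconds = interval_seconds
        self.max_missed = max_missed   # hypothetical tolerance
        self.last_server_beat = None   # no server heartbeat seen yet

    def on_server_heartbeat(self, now: float) -> None:
        # Record the arrival time of a server heartbeat.
        self.last_server_beat = now

    def server_alive(self, now: float) -> bool:
        # The client sends its own heartbeat only while this is True.
        if self.last_server_beat is None:
            return False
        silence = now - self.last_server_beat
        return silence <= self.interval_seconds * self.max_missed
```

In a real client the interval would come from the connection information returned at authentication time, and `server_alive` would gate the send on the 1-1 connection.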

Page 13: Push jobs: an orchestration building block for private Chef

Security

• Protocol security
  • We leverage the existing API signing mechanism to exchange session keys
  • All ZeroMQ messages are signed
    • HMAC SHA256 signing protocol protects point-to-point messages
    • RSA 2048/SHA1 protects broadcast messages (just like the Chef API)
  • Relies on the SSL chain of trust to the server.
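As an illustration of the point-to-point case, the sketch below signs and verifies a message body with HMAC SHA256 using Python's standard `hmac` and `hashlib` modules. The function names and the hex encoding of the signature are assumptions; the actual pushy message format and headers are not described on the slide.

```python
import hashlib
import hmac


def sign(session_key: bytes, body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the message body."""
    return hmac.new(session_key, body, hashlib.sha256).hexdigest()


def verify(session_key: bytes, body: bytes, signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    expected = sign(session_key, body)
    return hmac.compare_digest(expected, signature)
```

Both sides need the same session key, which is why the key exchange rides on the existing API signing mechanism; a message altered in transit, or signed with a different key, fails `verify`.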

Page 14: Push jobs: an orchestration building block for private Chef

Access control

• Access rights controlled by groups
  • ‘push_job_writers’ group controls job creation and deletion
  • ‘push_job_readers’ group controls read access to job status and results
• Whitelist for commands
  • The client rejects commands that aren’t on the whitelist
• In the future we’d like to do finer-grained access control
  • Perhaps persistent job templates with their own access rights and commands
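The client-side whitelist check amounts to a membership test before anything is executed. The Python sketch below assumes exact string matching and a hypothetical whitelist containing only `chef-client`; how the real client stores and matches its whitelist is not specified here.

```python
# Hypothetical whitelist; a real client would load this from its configuration.
ALLOWED_COMMANDS = frozenset({"chef-client"})


def accept_command(command: str, whitelist=ALLOWED_COMMANDS) -> bool:
    """Return True only if the exact command string is whitelisted."""
    return command in whitelist
```

Exact matching is the conservative choice: a command with extra arguments appended is rejected unless that full string is itself on the whitelist.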

Page 15: Push jobs: an orchestration building block for private Chef

Implementation

Page 16: Push jobs: an orchestration building block for private Chef

Why Erlang?

• Chef 11 work has been very successful
• Easier integration
• The process (think threads) model allows a great deal of parallelism
  • Every node has an Erlang process to track its state
  • Every job has a process to track its state

Page 17: Push jobs: an orchestration building block for private Chef

Server process structure

[Diagram: server process structure: REST API, message switch, heartbeat generator, multiple Job Monitor processes, multiple Client Monitor processes, and the clients]

Page 18: Push jobs: an orchestration building block for private Chef

Why ZeroMQ?

• Abstracts away much of the pain of socket libraries
• Reliable delivery
• Portable
• Broad language support
• Proven scalability

Page 19: Push jobs: an orchestration building block for private Chef

Some impedance mismatch

• ZeroMQ provides a lot of goodness for asynchronous execution
  • That is really helpful in many languages
  • Erlang doesn’t need that so much, and encourages finer-grained tasks
• ZeroMQ hides a lot of interesting state
  • It turns out we care about whether a node dies and comes back
• Need a better ZeroMQ/Erlang glue library
• ZeroMQ 3 offers some very interesting options for future work

Page 20: Push jobs: an orchestration building block for private Chef

Performance and scalability results

• We can run a job over 2000 nodes
  • 15 sec heartbeats
  • c1.medium
• Bottlenecks
  • Heartbeats consume a lot of resources
  • Everything goes through router process for ZeroMQ messages
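To put the heartbeat bottleneck in numbers, a back-of-the-envelope calculation helps. The Python sketch below assumes each node sends exactly one heartbeat per interval and counts only client-to-server traffic; the function name is hypothetical.

```python
def client_heartbeats_per_second(nodes: int, interval_seconds: float) -> float:
    """Average inbound heartbeat rate, assuming one heartbeat per node per interval."""
    return nodes / interval_seconds


rate = client_heartbeats_per_second(2000, 15)
print(f"{rate:.1f} heartbeats/sec")
```

For 2000 nodes at 15 second intervals this comes to roughly 133 inbound heartbeats per second, each of which has to be verified and routed before any job traffic is handled.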

Page 21: Push jobs: an orchestration building block for private Chef

Moving towards a federated architecture

[Diagram: federated architecture showing the Chef API, Erchef, Push, Reporting, and Auth services with Solr and SQL data stores]

Page 22: Push jobs: an orchestration building block for private Chef

Where are we now?

Page 23: Push jobs: an orchestration building block for private Chef

Availability: Limited release

• Time to get people using it
• Private Chef only for now
• Hosted Chef deferred
  • Scalability: hosted has more than 2k nodes
  • Security: ZeroMQ messages aren’t encrypted
• Open Source: Eventually

Page 24: Push jobs: an orchestration building block for private Chef

Future directions

• Scalability improvements
  • Our biggest customers will want more
• Jobs as first-class, persistent objects
• More control of execution of jobs
  • Inside a job: (push jobs makes an excellent DDoS tool)
  • Between jobs: chaining results between jobs
• Push jobs as a building block
  • Conducting experiments using it as a building block for continuous delivery

Page 25: Push jobs: an orchestration building block for private Chef

Questions?