LEVERAGING AUTOMATION TO SCALE JIM HOPP, LOOKOUT, INC.

DESCRIPTION

Leveraging Automation to Scale was originally presented at Lookout's Scaling for Mobile event on July 25, 2013. Jim Hopp is an Operations Architect at Lookout, Inc. Jim's talk focused on moving quickly with a small organization, and on using automation to do so. Lookout has grown immensely in the last year. We've doubled the size of the company and added more than 80 engineers to the team. We support 45+ million users, have over 1,000 machines in production, and see over 125,000 QPS and more than 2.6 billion requests/month. Our analysts use Hadoop, Hive, and MySQL to interactively manipulate multibillion-row tables. With that growth, there are bound to be some growing pains and lessons learned.

Page 1: Leveraging Automation to Scale

LEVERAGING AUTOMATION TO SCALE

JIM HOPP, LOOKOUT, INC.

Page 2: Leveraging Automation to Scale

HOW TO KEEP UP?

• Increasing number of services

• And desire for faster cycle time

• Increasing number of machines

• Frequent deploys (and getting more frequent)

• Engineering growing much faster than Ops

Page 3: Leveraging Automation to Scale

(TESTING?!)

• Important that provisioning changes follow a modern dev pipeline

• rspec (chefspec) for chef

• minitest for chef’d nodes

• code reviews

• deploys from jenkins

• continually re-provision bare-metal test nodes
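As a rough illustration of the "minitest for chef'd nodes" bullet, the sketch below checks a file of the kind a Chef run would render. Everything specific here is an assumption for illustration: the `render_service_config` helper stands in for a Chef template resource, and the `my_service` name, port, and config keys are hypothetical. On a real node the test would read the file Chef wrote rather than create one itself.

```ruby
require 'minitest/autorun'
require 'tmpdir'
require 'fileutils'

# Stand-in for what a Chef template resource would render onto the node.
# Hypothetical: the real content would come from the service's cookbook.
def render_service_config(path, name:, port:)
  File.write(path, "service=#{name}\nport=#{port}\n")
end

# Post-converge checks in the style of minitest tests for chef'd nodes.
class ServiceConfigTest < Minitest::Test
  def setup
    # Use a scratch directory so the sketch does not touch real system paths.
    @dir = Dir.mktmpdir
    @config = File.join(@dir, 'my_service.conf')
    render_service_config(@config, name: 'my_service', port: 8080)
  end

  def teardown
    FileUtils.remove_entry(@dir)
  end

  def test_config_file_exists
    assert File.exist?(@config), 'expected the config file to be rendered'
  end

  def test_config_lists_expected_port
    assert_includes File.read(@config), 'port=8080'
  end
end
```

Running the file with `ruby` executes both tests. Because the checks are ordinary Ruby, the same test file can go through code review and a Jenkins job like any other change.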

Page 4: Leveraging Automation to Scale

GET OTHERS TO DO THE WORK

• We want engineers to write the chef cookbooks and cap config files for their service

• We want them to use the production tools during the development cycle

• Originally built Vagrant-based tools. They worked well but were fragile and didn’t scale past 2-3 VMs

• Developed TestLab for managing Linux containers. Works great! https://github.com/zpatten/testlab

Page 5: Leveraging Automation to Scale

MAKE IT SIMPLE TO USE

• Built Chef LWRPs to simplify configuration

• This is all that’s necessary for a normal service:

    lookout_passenger_app 'my_service' do
      rvm_gemset 'ruby-1.9.3-p448@my_service'
    end

• LWRPs for resque, firewall, SSL certs, etc.

• Libraries for finding IPs, data bags within environments, etc.

• Office hours, pairing
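The "libraries for finding IPs" bullet might look roughly like the following in plain Ruby. This is only a sketch under assumptions: `ips_for_role`, the node records, and the role and environment names are hypothetical, and a real Chef library would get its nodes from a Chef search rather than an in-memory array.

```ruby
# Hypothetical node records; in Chef these would come from a node search.
NODES = [
  { name: 'web1', environment: 'production', roles: ['my_service'], ip: '10.0.0.11' },
  { name: 'web2', environment: 'production', roles: ['my_service'], ip: '10.0.0.12' },
  { name: 'web3', environment: 'staging',    roles: ['my_service'], ip: '10.1.0.11' },
  { name: 'db1',  environment: 'production', roles: ['mysql'],      ip: '10.0.0.21' }
].freeze

# Return the sorted IPs of nodes that carry `role` within `environment`.
def ips_for_role(nodes, role:, environment:)
  nodes
    .select { |n| n[:environment] == environment && n[:roles].include?(role) }
    .map { |n| n[:ip] }
    .sort
end

# prints ["10.0.0.11", "10.0.0.12"]
puts ips_for_role(NODES, role: 'my_service', environment: 'production').inspect
```

Scoping the lookup to an environment keeps a cookbook from pointing a staging service at production machines, which is the sort of detail a shared library can handle once for every service team.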

Page 6: Leveraging Automation to Scale

DEPLOYS

• chef-repo deployed to prod on merge to master by jenkins

• Deploy HEAD

• For services, moving to developer deploy

• They know their service better than Ops

• Still figuring out migrations

• Tooling for roll-back
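One common way to approach roll-back tooling is to keep several timestamped releases and repoint to the previous one, in the style of Capistrano's releases directory. The sketch below shows only that selection logic. It is not described in the talk, and `previous_release` and the release names are hypothetical.

```ruby
# Pick the release to roll back to: the newest one older than `current`.
# Release names are timestamps (YYYYMMDDHHMMSS), so string order is time order.
def previous_release(releases, current)
  older = releases.select { |r| r < current }
  raise ArgumentError, "no release older than #{current}" if older.empty?

  older.max
end

releases = %w[20130720101500 20130723164200 20130725090000]
puts previous_release(releases, '20130725090000') # prints 20130723164200
```

Raising when no older release exists makes the tool fail loudly instead of silently leaving the current release in place.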