REA-Shipper: Luke Carter-Key (@lukecarterkey)

Rea-Shipper Infracoders Jan 2015

Current approach

• Use Packer to bake an Amazon Machine Image (AMI) for each build

• Copy AMI to each region we want to use it in

• Deploy EC2 instances that use this AMI

Good

• Immutable servers are easy to work with

• Fail at build time, instead of deploy time

• Fast boot time

• Record of every build

Not so good

• Building AMI is slow (5-10 mins each build)

• Copying an AMI between regions is slow (5ish minutes)

• Managing different AMI IDs for each region sucks

• The EBS snapshot associated with an AMI can be big. Adds up over time

• Lots of stuff that supports the app but isn’t core

• Supporting stuff done differently by different people over time, becoming hard to maintain

• Patching vulnerabilities really sucks

Use case

• Web apps / microservices only

• Exposed on port 80

• Alerting and monitoring requirements don’t vary that much (either we alert on a set of things or we don’t)

Goals

• Make deployment of a new microservice REALLY easy

• Fast build

• Zero downtime deploy

• Fast deploy

• Sensible default config

• Consistent, maintainable supporting infrastructure

• Consistent tagging of infrastructure

• Roll out improvements without making changes to multiple build pipelines and deploy scripts

REA-Shipper

• “Shipper” takes a web application packaged as a Docker app and deploys it to AWS with zero downtime, along with containers for things we want for every app, like Nginx and log forwarding

• Separates app from the things that exist only to support it

• Deploys to AWS with sensible infrastructure

App container

• Runs something on port 80

• Logs to stdout and stderr

• No local state

Support containers

• Nginx

    • Toggle caching on/off (will respect response cache control headers)

    • Set cache size

• Log forwarder

    • Specify Splunk index and host
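
As a rough illustration of the Nginx options above, the sketch below renders a minimal reverse-proxy config with caching toggled on or off and a configurable cache size. The function name, the upstream address `http://app:80`, the cache path and the zone name `app_cache` are all assumptions made for this example; the talk does not show Shipper's real template. With `proxy_cache` enabled, Nginx honours the upstream's `Cache-Control` and `Expires` headers by default.

```python
def render_nginx_conf(upstream="http://app:80", cache_enabled=True, cache_size="1g"):
    """Render a minimal reverse-proxy config (hypothetical, not Shipper's template)."""
    lines = []
    if cache_enabled:
        # proxy_cache_path lives at the http level, e.g. in a conf.d include.
        lines.append(
            "proxy_cache_path /var/cache/nginx levels=1:2 "
            f"keys_zone=app_cache:10m max_size={cache_size};"
        )
    lines += [
        "server {",
        "    listen 80;",
        "    location / {",
        f"        proxy_pass {upstream};",
    ]
    if cache_enabled:
        # Cache lifetime is taken from the app's Cache-Control / Expires headers.
        lines.append("        proxy_cache app_cache;")
    lines += ["    }", "}"]
    return "\n".join(lines) + "\n"


print(render_nginx_conf(cache_size="512m"))
```

Keeping the toggle and the size as plain parameters means the same container image can serve every app, with only the rendered config differing per deployment.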

EC2 Instance

EC2 Host AMI

• Minimal Ubuntu 14.04 by default

• Can substitute a different AMI

• Runs Docker

• Runs New Relic system monitoring

• Base Ubuntu Docker image used by most of our other containers

AWS Infrastructure

• Autoscaling group to host Docker containers

• Elastic Load Balancer

• Security groups (ports 80, 443 and 22)

AWS Infrastructure

Optional extras

• CloudWatch alarms, SNS topic and subscriber to send alerts to PagerDuty

• IAM role and instance profile to access other AWS resources

• Route53 DNS record pointing at ELB

• Scheduled scaling actions

• Nginx caching

YAML-based Config

• App name, docker image, env variables

• AWS VPC, subnets, instance type and numbers, tags, etc.

• Splunk, nginx and New Relic options

• Can include other config files to share config between deployments (e.g. VPC, subnets)
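
One way the config sharing described above could work is a recursive merge in which the including file overrides anything it pulls in. The sketch below uses plain Python dicts to stand in for parsed YAML files; the `include` key, the file names and every field name are hypothetical, since the talk does not show Shipper's actual schema.

```python
import copy


def deep_merge(base, override):
    """Return a new dict where override wins and nested dicts merge recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_config(name, files):
    """Merge a config with the files listed under its (hypothetical) 'include' key."""
    config = dict(files[name])  # shallow copy so the pop below leaves files intact
    result = {}
    for included in config.pop("include", []):
        result = deep_merge(result, resolve_config(included, files))
    return deep_merge(result, config)


# Dicts standing in for parsed YAML files; all names and values are made up.
files = {
    "network.yml": {"aws": {"vpc": "vpc-example", "subnets": ["subnet-a", "subnet-b"]}},
    "app.yml": {
        "include": ["network.yml"],
        "app": {"name": "example-app", "image": "example/app:1", "env": {"LOG_LEVEL": "info"}},
        "aws": {"instance_type": "t2.small", "min": 2, "max": 4},
    },
}

config = resolve_config("app.yml", files)
print(config["aws"])
```

Here the shared VPC and subnet settings live in one file, and each app's config only states what differs.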

Deployment process

• CloudFormation to create base infrastructure

    • ELB

    • Security Groups

    • IAM role and instance profile

    • DNS record

Deployment process

• CloudFormation ELB healthcheck bug prevented use of rolling updates

• Use AWS API calls to update existing ASG for changes to min, max, desired capacity

• Use AWS API calls to create new ASG if anything else is changed

    • Create ASG

    • Add it to ELB

    • Wait until new ASG is healthy

    • Remove old ASG from ELB and delete it

• Notify New Relic that deployment happened
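
The decision and the ordering of steps above can be sketched as follows. This is only an illustration of the flow, not Shipper's code: `FakeCloud` is an in-memory stand-in for the AWS API, all method and key names are hypothetical, and raising an error when the new ASG never becomes healthy is an assumption (the talk does not say what happens in that case).

```python
CAPACITY_KEYS = {"min", "max", "desired"}


def plan_deploy(current, wanted):
    """Choose between an in-place capacity update and a full ASG replacement."""
    changed = {k for k in set(current) | set(wanted) if current.get(k) != wanted.get(k)}
    if not changed:
        return "noop"
    if changed <= CAPACITY_KEYS:
        return "update_capacity"
    return "replace_asg"


class FakeCloud:
    """Records calls in order; stands in for real AWS API calls (hypothetical)."""

    def __init__(self, healthy=True):
        self.healthy = healthy
        self.calls = []

    def __getattr__(self, name):
        # Any unknown method name is recorded as a call with its arguments.
        def record(*args):
            self.calls.append((name,) + args)
            return self.healthy
        return record


def replace_asg(cloud, elb, old_asg, new_asg, notify):
    """Bring up a new ASG behind the ELB before retiring the old one."""
    cloud.create_asg(new_asg)
    cloud.attach_to_elb(new_asg, elb)
    if not cloud.wait_until_healthy(new_asg, elb):
        # Assumption: on failure the old ASG is left serving traffic.
        raise RuntimeError(f"{new_asg} never became healthy")
    cloud.detach_from_elb(old_asg, elb)
    cloud.delete_asg(old_asg)
    notify(new_asg)  # e.g. record the deployment in New Relic
    return True


current = {"min": 2, "max": 4, "desired": 2, "image": "example/app:1"}
print(plan_deploy(current, dict(current, desired=3)))
print(plan_deploy(current, dict(current, image="example/app:2")))
```

Because the old ASG is only removed after the new one passes ELB health checks, traffic always has a healthy target, which is what gives the zero-downtime property.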

Deployment process

• Instance startup

    • docker pull Nginx, Splunk and app containers

    • docker run app

    • docker run nginx

    • docker run splunk-forwarder

    • docker logs -f so Splunk will catch up with running app
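
The startup sequence above might translate into a command list like the one below. This is a guess at the shape only: the container names, the `--link` and `-p` flags, and the image names are assumptions, and how the `docker logs -f` output reaches the Splunk forwarder is not described in the talk, so it is left out.

```python
import shlex


def startup_commands(app_image, nginx_image, splunk_image, env=None):
    """Build the docker commands an instance might run at boot (hypothetical)."""
    env_flags = []
    for key, value in sorted((env or {}).items()):
        env_flags += ["-e", f"{key}={value}"]
    return [
        ["docker", "pull", nginx_image],
        ["docker", "pull", splunk_image],
        ["docker", "pull", app_image],
        # The app starts first so the proxy has something to link to.
        ["docker", "run", "-d", "--name", "app", *env_flags, app_image],
        ["docker", "run", "-d", "--name", "nginx", "--link", "app:app",
         "-p", "80:80", nginx_image],
        ["docker", "run", "-d", "--name", "splunk-forwarder", splunk_image],
        # Follow the app's stdout/stderr from the start of the container's log.
        ["docker", "logs", "-f", "app"],
    ]


commands = startup_commands(
    "example/app:1", "example/nginx", "example/splunk-forwarder",
    env={"LOG_LEVEL": "info"},
)
for command in commands:
    print(shlex.join(command))
```

Building the commands as argument lists rather than strings keeps environment values with spaces or quotes safe if they are later passed to `subprocess.run`.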

Current performance

• Building a Docker container is much quicker than building an AMI

• Deployment typically ~3 minutes with room to go well below this (we think)

    • Biggest delay is creating new instances on every deploy

Adoption

• Initial use was for dashboards for a couple of teams

• Now used in production for several high-profile services

• Organic usage growth

What’s next?

• Single CloudFormation stack

• Autoscaling triggered from CloudWatch alarms

• LDAP auth in nginx container

• SSL (client certs)

• Fig for linking containers together

• Open source

• PaaS?

• Other use cases?

Other things we want to do, maybe

• A-B deploys using HAProxy or Nginx to switch out which containers get traffic would bring deploy times down further

• Multi-region support

• Dashboards

Questions?

REA is hiring: @lukecarterkey on Twitter or careers.realestate.com.au to find out more