1 Dev/QA/Ops Friendly Docker Pipeline Chris Mague / Shokunin 12/13/2016

Docker pipelines


Page 1: Docker pipelines

1

Dev/QA/Ops Friendly Docker Pipeline

Chris Mague / Shokunin

12/13/2016

Page 2: Docker pipelines

2

Today's Talk

- The Goal
- The Problem
- The Stack
- The Process
- The Conclusion

Page 3: Docker pipelines

3

Quote

“a problem well put is half solved.” ― John Dewey

Page 4: Docker pipelines

4

The Goal

“We want to release more frequently”

Page 5: Docker pipelines

5

The Goal – Restated as Solvable

Build a continuous delivery pipeline for the Trulia Mobile API that is usable for all stakeholders.

Page 6: Docker pipelines

6

The Problem(s) – Dev Version

- my code works on the shared dev host, but not on prod
- no real visibility into what is happening in prod
- troubleshooting is difficult
- the Ops team is not helpful

Page 7: Docker pipelines

7

The Problem(s) – QA Version

- code tested in QA doesn’t work in prod
- inability to test multiple builds at the same time
- no shared language to bridge the Dev/Ops teams
- the Ops team is not helpful

Page 8: Docker pipelines

8

The Problem(s) – Ops Version

- Dev/Stage environments are inconsistent
- Prod environment is unreproducible
- Files are copied around in prod
- Incoming requests are difficult to parse

Page 9: Docker pipelines

9

The Problems – Stated as Solvable

- Need to build a common language (culture)
- Need to build a reproducible platform in all environments (tech)
- Need to provide automation and visibility tools (tech/culture)

Page 10: Docker pipelines

10

The Stack

Page 11: Docker pipelines

11

Docker

- Build a reproducible/immutable(ish) platform
- Control application dependencies
- Automated build capabilities
- Low overhead compared to virtualization
- Stateless application

Page 12: Docker pipelines

12

Step 1 / Base Image

- Packer instead of Dockerfiles
- Puppet to build container
- Build on Jenkins
- Vagrant option available
- Tagged with latest
- Pushed to our Docker registry

Page 13: Docker pipelines

13

Step 2 / Develop Locally

- create separate run directories per environment
- modules per environment
- consul_shared

Page 14: Docker pipelines

14

Local Terraform

- Sets up the Docker container
- Sources variables
- Calls the shared keys
- Uses the run_location

Page 15: Docker pipelines

15

Run Location

- list of containers
- only mobileapi-base is not cached

Page 16: Docker pipelines

16

Run Location

- Run supervisor
- expose port 80 as 8080
- link to dependencies
- set env vars
- mount volumes
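
The Terraform run location itself is not reproduced in this transcript. As a rough equivalent, the same five settings map onto a plain `docker run` invocation. The sketch below only assembles the argument list; the image name, link, environment variable, and volume used in the usage note are hypothetical, and `supervisord -n` is an assumed way of keeping supervisor in the foreground.

```python
from typing import Dict, List, Optional, Sequence


def docker_run_command(
    image: str,
    name: str,
    links: Sequence[str] = (),
    env: Optional[Dict[str, str]] = None,
    volumes: Optional[Dict[str, str]] = None,
    command: Sequence[str] = ("supervisord", "-n"),  # assumed entry command
) -> List[str]:
    """Assemble a `docker run` argument list mirroring the slide's settings."""
    cmd = ["docker", "run", "-d", "--name", name]
    # Expose container port 80 on host port 8080.
    cmd += ["-p", "8080:80"]
    # Link to dependency containers (format: container_name:alias).
    for link in links:
        cmd += ["--link", link]
    # Set environment variables.
    for key, value in sorted((env or {}).items()):
        cmd += ["-e", f"{key}={value}"]
    # Mount volumes (host_path -> container_path).
    for host_path, container_path in sorted((volumes or {}).items()):
        cmd += ["-v", f"{host_path}:{container_path}"]
    cmd.append(image)
    cmd += list(command)
    return cmd
```

For example, `docker_run_command("mobileapi:12", "mobileapi", links=["redis:redis"])` yields a list that can be passed to `subprocess.run`.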

Page 17: Docker pipelines

17

Configuration

- done in consul
- consul template to json
- creates /etc/trulia/<APPNAME>.json
- separated by environment
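
To illustrate the idea of turning per-environment key/value pairs into one JSON file, here is a minimal Python stand-in for what consul-template does in the real pipeline. It is not the consul-template syntax. The key layout `<environment>/<app>/...` and the sample keys are assumptions made for the example.

```python
import json
from typing import Dict


def nest(kv: Dict[str, str], prefix: str) -> dict:
    """Turn flat 'a/b/c' keys under `prefix` into a nested dict.

    Assumes no key is both a leaf and a parent of other keys.
    """
    tree: dict = {}
    for key, value in sorted(kv.items()):
        if not key.startswith(prefix):
            continue  # belongs to another environment or app
        parts = [p for p in key[len(prefix):].split("/") if p]
        if not parts:
            continue
        node = tree
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return tree


def render_config(kv: Dict[str, str], environment: str, app: str) -> str:
    """Render the JSON body destined for /etc/trulia/<APPNAME>.json."""
    return json.dumps(nest(kv, f"{environment}/{app}/"), indent=2, sort_keys=True)
```

Because the prefix carries the environment name, the same application reads the same file path everywhere while the contents differ per environment.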

Page 18: Docker pipelines

18

Running

Page 19: Docker pipelines

19

Step 3 / Kickoff

Page 20: Docker pipelines

20

An aside on Jenkins

- Configure with Puppet
- Install SCM Sync Plugin
- As vanilla as possible

Page 21: Docker pipelines

21

${BUILD_NUMBER}

Jenkins provides several environment variables, and the build number of the software packaging now becomes our shared key.
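
A minimal sketch of how a build step might read that shared key. `BUILD_NUMBER` is the variable Jenkins sets for each build; the `BuildRef` type and the label format are hypothetical names introduced here for illustration.

```python
import os
from typing import Mapping, NamedTuple


class BuildRef(NamedTuple):
    """Hypothetical value object pairing an application with its build number."""

    app: str
    build_number: int

    def label(self) -> str:
        # Human-readable form that Dev, QA, and Ops can all quote.
        return f"{self.app} build {self.build_number}"


def build_ref_from_env(app: str, env: Mapping[str, str] = os.environ) -> BuildRef:
    """Read BUILD_NUMBER from the environment and fail loudly if it is absent."""
    raw = env.get("BUILD_NUMBER", "")
    if not raw.isdigit():
        raise ValueError("BUILD_NUMBER is missing or not a number")
    return BuildRef(app, int(raw))
```

Every later stage (package, container tag, hostname, log line) can then carry the same number.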

Page 22: Docker pipelines

22

Communication

QA to Dev - “tcd-mobileapi(container) build 12 failed to pass smoke tests, can you please look at class foo”

QA to Ops - “tcd-mobileapi(container) build 12 is having trouble connecting to the user database”

Ops to Dev - “after we rolled out tcd-mobileapi(container) build 12 we noticed the app_v1_userlookup(KPI) time doubled”

Page 23: Docker pipelines

23

Pipeline - Package Software

- Spin up a build container
- Mount the current directory
- Pull in dependencies
- Build a .deb with FPM
- Push to aptly
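
The FPM step can be sketched as an argument list built from the build number. The flags used (`-s`, `-t`, `-n`, `-v`, `-C`) are standard FPM options; using the bare build number as the package version and the source directory name are assumptions, since the talk does not state the versioning scheme.

```python
from typing import List


def fpm_deb_command(name: str, build_number: int, source_dir: str) -> List[str]:
    """Assemble an FPM command that packages a directory as a .deb."""
    return [
        "fpm",
        "-s", "dir",               # input type: a plain directory
        "-t", "deb",               # output type: Debian package
        "-n", name,                # package name
        "-v", str(build_number),   # assumed: version equals the build number
        "-C", source_dir,          # change into this directory before packaging
        ".",                       # package everything found there
    ]
```

Run inside the build container, the resulting list can be handed to `subprocess.run(cmd, check=True)`.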

Page 24: Docker pipelines

24

Page 25: Docker pipelines

25

Pipeline – Build Deployable Container

- Take base container
- Install packaged software
- Tag with build number
- Upload to registry
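
One way to express the tag and upload steps is a pair of Docker CLI commands. This sketch assumes the install step is driven by a Dockerfile in the build context, which the talk does not confirm; the registry host in the usage note is a placeholder.

```python
from typing import List


def build_and_push_commands(
    registry: str, app: str, build_number: int, context: str = "."
) -> List[List[str]]:
    """Return the docker commands that build, tag, and upload one image."""
    image = f"{registry}/{app}:{build_number}"  # tag is the build number
    return [
        ["docker", "build", "-t", image, context],
        ["docker", "push", image],
    ]
```

For example, `build_and_push_commands("registry.example.com", "tcd-mobileapi", 12)` produces commands for the image `registry.example.com/tcd-mobileapi:12`.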

Page 26: Docker pipelines

26

Docker tags

Be SUPER careful with latest

When in doubt do not use
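
A pipeline can enforce this advice with a small guard that rejects image references resolving to `latest`. This is a sketch of one possible check, not something shown in the talk; the function name is hypothetical.

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if the reference names an explicit tag other than 'latest'."""
    if "@" in image_ref:
        return True  # digest references are always pinned
    # Only look at the last path component so a registry port is not
    # mistaken for a tag (e.g. registry.example.com:5000/app).
    last = image_ref.rsplit("/", 1)[-1]
    if ":" not in last:
        return False  # no tag given, Docker falls back to 'latest'
    tag = last.rsplit(":", 1)[1]
    return tag not in ("", "latest")
```

A deploy job could call this before scheduling a container and abort when it returns `False`.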

Page 27: Docker pipelines

27

Pipeline – Run in QATCD

- Spin up container in our QATCD Nomad cluster
- Run Terraform to update all of the configurations in Consul
- Set up credentials using Vault
- Container is now available at http://tcd-mobileapi-10.qatcd.example.com
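
The example URL suggests a hostname of the form `<app>-<build number>.<environment domain>`. Assuming that pattern holds, a helper like the following (a hypothetical name) lets any stage derive the address of a given build.

```python
def build_url(app: str, build_number: int, env_domain: str) -> str:
    """Derive the per-build URL, following the pattern in the slide's example."""
    return f"http://{app}-{build_number}.{env_domain}"
```

Because each build gets its own hostname, several builds can be tested side by side in the same cluster.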

Page 28: Docker pipelines

28

Pipeline – Deploy Test

- health checks are crucial
  - needed for monitoring
  - needed for LB
  - needed for consul
  - get hit about 20 times/second
- engineer came up with the idea of deploy tests
  - only hit occasionally
  - more detailed
  - more resource heavy
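
The split between the two kinds of check can be sketched as follows. The health check does no I/O because it is called constantly, while the deploy test runs a set of slower probes once per deployment. The probe names in the usage note are hypothetical; the talk does not list what the deploy tests cover.

```python
from typing import Callable, Dict


def health_check(app_ready: bool) -> int:
    """Cheap check: return an HTTP status code from in-process state only."""
    return 200 if app_ready else 503


def deploy_test(probes: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Expensive check: run every probe and record pass/fail per dependency."""
    results = {}
    for name, probe in probes.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False  # a crashing probe counts as a failure
    return results


def deploy_ok(results: Dict[str, bool]) -> bool:
    """The deployment passes only if every probe passed."""
    return all(results.values())
```

A caller might pass `{"user_db": check_user_db, "cache": check_cache}`, where each function opens a real connection.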

Page 29: Docker pipelines

29

Pipeline – Smoke test

- Calls another Jenkins server
- Managed by the QA team
- Detailed application level test

Page 30: Docker pipelines

30

Pipeline - Repointer

- allows for static hostnames for applications or external testers

- does some checking
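
The talk does not describe how the repointer works internally. As a loose illustration only, the idea of moving a static hostname to a specific build after a check can be sketched like this; the routing table, the function name, and the health callback are all assumptions.

```python
from typing import Callable, Dict


def repoint(
    routes: Dict[str, str],
    static_host: str,
    build_host: str,
    is_healthy: Callable[[str], bool],
) -> bool:
    """Point `static_host` at `build_host` only if the target passes a check."""
    if not is_healthy(build_host):
        return False  # leave the existing route untouched
    routes[static_host] = build_host
    return True
```

External testers keep using the static name while the build behind it changes.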

Page 31: Docker pipelines

31

Pipeline – Next Steps

1) Preprod environment
  - Push configuration LIVE
  - Run a single container with the newer version
  - Other tests run
  - Build number is entered in a Jenkins form and deployed with a push of a button
2) Release to Production
  - Put a build number in a Jenkins form
  - Only allowed if the build is on preprod
  - Containers are rolled out with sleep and concurrency set
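
The production rules above (a preprod gate, then a rollout throttled by sleep and concurrency) can be sketched as a batching loop. The function and parameter names are hypothetical, and the `deploy` callback stands in for whatever actually replaces a container.

```python
import time
from typing import Callable, List, Sequence, Set


def rollout(
    containers: Sequence[str],
    build_number: int,
    preprod_builds: Set[int],
    deploy: Callable[[str, int], None],
    concurrency: int = 2,
    sleep_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[str]]:
    """Deploy in batches of `concurrency`, pausing between batches."""
    if build_number not in preprod_builds:
        raise PermissionError(f"build {build_number} has not run on preprod")
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    batches = []
    for start in range(0, len(containers), concurrency):
        batch = list(containers[start:start + concurrency])
        # Sequential here for simplicity; a real rollout would likely
        # update the members of one batch in parallel.
        for container in batch:
            deploy(container, build_number)
        batches.append(batch)
        if start + concurrency < len(containers):
            sleep(sleep_seconds)  # give dashboards time to show regressions
    return batches
```

The pause between batches gives the teams a window to watch KPIs before the next group of containers is replaced.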

Page 32: Docker pipelines

32

Page 33: Docker pipelines

33

Pipeline

Dev, QA and Ops teams keep an eye on KPIs and various dashboards

QED

Page 34: Docker pipelines

34

Internals

Page 35: Docker pipelines

35

Nomad

- Job scheduler
- Not limited to Docker
- Integrates with Consul
- Easy setup
- Sane configuration

Page 36: Docker pipelines

36

Nomad Config

Page 37: Docker pipelines

37

Traefik

- HAProxy restart issue
- Performant
- Easily templatable configuration
- Nice quick front end

Page 38: Docker pipelines

38

Page 39: Docker pipelines

39

Vault / Consul Template

- Easily generate config files from key/value store
- Feature flags are easily implemented
- Store and filter database credentials

Page 40: Docker pipelines

40

Logging

- Big challenge
- All Apache/Nginx logs include APPNAME/BUILD_NUMBER information and are in JSON format
- Application logs are in JSON format and often include unique IDs
- Stacktraces are fingerprinted
- Logstash picks up from the Nomad alloc dirs
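
A minimal sketch of a JSON log line carrying the application name and build number, plus one possible way to fingerprint a stack trace so that repeated errors group together. The field names and the normalization rules (masking line numbers and memory addresses before hashing) are assumptions; the talk does not say how fingerprints are computed.

```python
import hashlib
import json
import re
from typing import Optional


def fingerprint(stacktrace: str) -> str:
    """Hash a stack trace after masking details that vary between occurrences."""
    normalized = re.sub(r"line \d+", "line N", stacktrace)
    normalized = re.sub(r"0x[0-9a-fA-F]+", "0xADDR", normalized)
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()[:12]


def log_line(
    app: str, build_number: int, message: str, stacktrace: Optional[str] = None
) -> str:
    """Render one JSON log record tagged with app name and build number."""
    record = {"app": app, "build_number": build_number, "message": message}
    if stacktrace:
        record["stacktrace"] = stacktrace
        record["fingerprint"] = fingerprint(stacktrace)
    return json.dumps(record, sort_keys=True)
```

With the build number in every record, a log query can compare error fingerprints before and after a specific build rolled out.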

Page 41: Docker pipelines

41

Page 42: Docker pipelines

42

Page 43: Docker pipelines

43

Page 44: Docker pipelines

44

Stats / KPIs

- Data is pulled from the logs and sent to statsd→influxdb with a Grafana front end

- Host and container level stats are picked up via cAdvisor
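
To show what sending a KPI to statsd involves, the sketch below formats a timing in the statsd line protocol (`name:value|ms`) and sends it over UDP. The metric naming scheme that embeds the build number is an assumption; 8125 is the conventional statsd port.

```python
import socket


def statsd_timing(app: str, build_number: int, kpi: str, millis: int) -> str:
    """Format a timing metric; the name layout is a hypothetical convention."""
    return f"{app}.build{build_number}.{kpi}:{millis}|ms"


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send, as statsd clients typically do."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))
```

Keeping the build number in the metric name makes a change such as a doubled `app_v1_userlookup` time attributable to a specific build on the dashboard.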

Page 45: Docker pipelines

45

Page 46: Docker pipelines

46

Page 47: Docker pipelines

47

Troubleshooting

- Devs have exec access to all containers through Vault SSH
- This is audited
- After completion of any activities the container is terminated

Page 48: Docker pipelines

48

No silver bullets...

- Unit tests are slow
- Initial learning curve
- Docker on anything other than Linux is painful
- Apps need to be modified
- Less control for devs compared to old method

Page 49: Docker pipelines

49

Improvements

- Better troubleshooting tools
- Shared Docker host for apps with heavy upstream dependencies
- More local services to make development easier
- Better training/support for desktop Docker issues
- More code libraries to handle common app issues

Page 50: Docker pipelines

50

Thanks

Kevin - AppDynamics

Sonal Joshi – Trulia Sr. Automation Engineer

Vincent Lam – Trulia Sr. Application Developer