Transcript
Page 1: Scaling your Jenkins CI pipeline

Scale yourJenkins pipeline

FOSDEM 2013Anders Nickelsen, PhD

QA engineer @ Tradeshift@anickelsen, [email protected]

Page 2: Scaling your Jenkins CI pipeline
Page 3: Scaling your Jenkins CI pipeline
Page 4: Scaling your Jenkins CI pipeline

Tradeshift

Page 5: Scaling your Jenkins CI pipeline

FOSDEM 2013

Tradeshift

Platform for your business interactionsCore services are free

~3 years old

One product – one production environment

~20 developersCopenhagen, DK and San Francisco, CA

Feb 2, 2013

Page 6: Scaling your Jenkins CI pipeline
Page 7: Scaling your Jenkins CI pipeline

Slow test is slow

Page 8: Scaling your Jenkins CI pipeline

FOSDEM 2013

Why scale? Fast feedback

Integration tests (IT): Selenium 2 from Geb (Java/Groovy)Run by Jenkins on every change

Takes 2 hours to run all in sequence

10+ team branches, each trigger ITVerified merges to master, also trigger IT

Pipeline is congestion pointFeedback not fast enough

Feb 2, 2013

Page 9: Scaling your Jenkins CI pipeline

FOSDEM 2013

Release pipeline

Feb 2, 2013

Master

10 team branches

Integration tests on changes

Any branch(unit tested)

= 10 projects

Production

Page 10: Scaling your Jenkins CI pipeline

The swarm

Page 11: Scaling your Jenkins CI pipeline

© Blizzard 2013

Page 12: Scaling your Jenkins CI pipeline

FOSDEM 2013

What?

12 Jenkins slaves– New slaves join swarm on boot– Orchestrated by ec2-collective– Configured by Puppet at boot

Tests distributed in setsLongest processing timeUses test run-time from last stable build

1 set per online slave (dynamic)Feb 2, 2013

Page 13: Scaling your Jenkins CI pipeline
Page 14: Scaling your Jenkins CI pipeline

FOSDEM 2013

Fast tests!

All sets are put into Jenkins build queuePicked up by any slave– throttled to one per slave

12 slaves => 20 minutes– Tests: 10 min– Overhead: 10 min

24 slaves => 15 minutes

Feb 2, 2013

Page 15: Scaling your Jenkins CI pipeline

FOSDEM 2013

Post processing

Join when all sets complete– Triggered builds are blocking

Collect test results– JUnit, HTML, screenshots, videos, console logs =

500 MB/run– Curl request to Jenkins API

Process test results– JVM outputs, slice console logs, archive artifacts– Custom groovy script

Feb 2, 2013

Page 16: Scaling your Jenkins CI pipeline

Optimizations

Page 17: Scaling your Jenkins CI pipeline

FOSDEM 2013

Optimizations

Only on ITCase file-level, file scan

Only on spec level–min time = longest running spec

Only scale test processing time– 10 minutes today

Feb 2, 2013

Page 18: Scaling your Jenkins CI pipeline

Lessons learned

Page 19: Scaling your Jenkins CI pipeline

FOSDEM 2013

Parallelization overhead

Swarm setup and tear down– node initialization and test result

collection

Tests break a lot when run in parallelFixing tests and code hurts

Feb 2, 2013

Page 20: Scaling your Jenkins CI pipeline

FOSDEM 2013

‘Random failures’

Failures appear probabilistic / random– Order dependencies– Timing issues– Rebuild ‘fixes’ the tests

Slave failures–Weakest link breaks the pipeline– 24 slaves => 1 filled up due to slow

mount => pipeline brokenFeb 2, 2013

Page 21: Scaling your Jenkins CI pipeline
Page 22: Scaling your Jenkins CI pipeline

FOSDEM 2013

Cost optimization

AWS EC2 instances:– 1/2 price gave 3/4 performance– 4/3 swarm size gave 1/4 price reduction– Equivalent performance

Swarm is offline when not used– Kept online for 1 hour after use

Feb 2, 2013

Page 23: Scaling your Jenkins CI pipeline
Page 24: Scaling your Jenkins CI pipeline

FOSDEM 2013

Credits

puppetlabs.comgithub.com/andersdyekjaerhansen/ec2_collective

Jenkins and plugins– Swarm plugin, parameterized trigger, envinject, rebuild,

join, throttle concurrent build, conditional build step

More details of our setup– tradeshift.com/blog/just-add-servers/

We’re hiring!– tradeshift.com/jobs

Feb 2, 2013


Recommended