archie-cowan
DESCRIPTION
Practices that enabled ITHAKA's engineering team to increase its change velocity in production from 12 releases per year to over 70 per week on average.
FROM 12 TO 3500 DEPLOYMENTS PER YEAR
HOW ITHAKA INCREASED ITS VELOCITY TO 70 DEPLOYMENTS PER WEEK IN PRODUCTION
by @archiecowan
ABOUT JSTOR AND ITHAKA
JSTOR is a digital library of more than 2,000 academic journals, books, and primary sources. JSTOR helps people discover, use, and build upon a wide range of content through a powerful research and teaching platform, and preserves this content for future generations. JSTOR is part of ITHAKA, a not-for-profit organization that also includes Ithaka S+R and Portico.
ABOUT ARCHIE
Working in tech since 2002
Held titles like: Developer, Quality Assurance Engineer, Software Engineer, UI Developer, Lead Software Engineer, Operations, Technical Architect, Enterprise Architect
Enjoy working on: Distributed Systems, High Availability, Analytics, Websites, Cloudy Clouds, Getting stuff done
WHEN DOES YOUR USER REALIZE VALUE FROM YOUR PRODUCT?
WHEN THEY CAN USE IT?
WHEN IT'S IN PRODUCTION!
1. Users use production
2. The sooner you get changes to production, the sooner you know you did the right thing for your user.
3. The more agile and safe you are with changes in production, the better you can serve your user.
ITHAKA CARES DEEPLY ABOUT SOFTWARE QUALITY
1. All production releases must be rigorously tested
2. Releases must not require outages
3. User/Publisher impacts must be minimized
This requires a lot of time
OUR OLD RELEASE PROCESS
1 week: product team plans lots of changes for a release (scope is frozen after this)
1-2 months: build lots of changes into a release
2 weeks: code freeze and manual regression testing
2-3 days: code fixes, rollback testing
1-2 days: lots of meetings (not the whole time, but schedules are complicated)
4-8 hours: tell publishers they can't do anything for 4-8 hours on release day
1-4 hours: ops people get up really early and begin restarts
2-4 hours: QA tests
3 MONTHS TO DELIVER CHANGES
Additional requests are deferred to the next release period
That could be as long as a 6-month wait, depending on timing
Even tiny changes are expensive with this process, and they also have to wait
WE WANTED THIS TO BE BETTER
WITHOUT GIVING UP OUR QUALITY ASPIRATIONS
ITHAKA CARES DEEPLY ABOUT SOFTWARE QUALITY
1. All production releases must be rigorously tested
2. Releases must not require outages
3. User/Publisher impacts must be minimized
4. Process changes must increase iteration speed
5. Flexibility is more important than features
INFRASTRUCTURE
Before: Privately managed, artisanal machines in colos
Now: Instances in the AWS public cloud, configuration management under source control
AGGRESSIVE AUTOMATION OF THE APPLICATION DEPLOYMENT PIPELINE
Before: Developers open tickets for Operations
Now: An "Operations" team creates reusable services for development (a private PaaS)
$ # multi-datacenter release
$ # load balancer update
$ # provision new instances
$ # instance configuration
$ # cdn configuration
$ git push prod master
Because this is so easy, people outside engineering are beginning to use the deployment platform to deploy apps
RELEASE PLANNING
Before: A week of planning for a large batch of stories released to production quarterly
Now: Every story is planned with the expectation that it will be individually deployed to production
TEST AUTOMATION
Before: Manual testing
Now: Every story increases the size of our unit and integration test suites
APPLICATION ARCHITECTURE
Before: Monolithic architecture
Now: Microservices architecture
LOAD TESTING
Before: Synthetic load testing
Now: Synthetic load testing, log replay, gradual rollout
DEPLOYMENT PROCESS
Before: Changes impact all users at the same time
Now: Gradual rollout to small user groups, 5-10% at a time
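One common way to roll a change out to 5-10% of users at a time is to hash each user into a stable bucket and compare it with the current rollout percentage. The sketch below is only an illustration of that idea, not ITHAKA's implementation; the function names, the `salt` value, and the user ID format are all hypothetical.

```python
import hashlib


def rollout_bucket(user_id: str, salt: str = "new-search-ui") -> int:
    """Map a user to a stable bucket in the range 0-99.

    The salt (a hypothetical per-change name) keeps different rollouts
    from always hitting the same early users.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_rollout(user_id: str, percent: int, salt: str = "new-search-ui") -> bool:
    """Return True if this user should see the change at the given percentage."""
    return rollout_bucket(user_id, salt) < percent


# Raising the percentage in steps (5, 10, 15, ...) only ever adds users:
# anyone included at 5% stays included at 10%.
users = [f"user-{n}" for n in range(10_000)]
at_5 = {u for u in users if in_rollout(u, 5)}
at_10 = {u for u in users if in_rollout(u, 10)}
```

Because the bucket depends only on the user ID and the salt, a user sees a consistent experience across requests while the percentage is increased.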
INTEGRATION ENVIRONMENTS
Before: Multiple developer, test, and integration environments
Now: A single TEST environment, and app developers use feature flags to mask work in progress
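A feature flag lets unfinished code ship to a shared environment while staying hidden until it is switched on. The following is a minimal sketch of the pattern, assuming an in-memory flag set; the `FeatureFlags` class, the `new_search_ui` flag name, and `render_search_page` are hypothetical names for illustration, not part of ITHAKA's platform.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureFlags:
    """Hypothetical in-memory flag store; real systems usually load this from config."""
    enabled: set = field(default_factory=set)

    def is_enabled(self, name: str) -> bool:
        return name in self.enabled


def render_search_page(flags: FeatureFlags) -> str:
    # Work in progress is deployed but masked unless the flag is on.
    if flags.is_enabled("new_search_ui"):
        return "new search UI"
    return "current search UI"


default_view = render_search_page(FeatureFlags())
preview_view = render_search_page(FeatureFlags(enabled={"new_search_ui"}))
```

With the flag off, every user gets the existing behavior, so the same build can sit in a single TEST environment alongside other teams' work.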
ONCALL DUTY
Before: 2-3 people on level 1 pager duty for other people's software
Now: 50+ people on level 1 pager duty for the software they write
LARGE CHANGES TO PRODUCTION
Before: Lots of coordination between teams, complex change sequences to change production
Now: All SLAs require backward compatibility processes; teams are decoupled, and coordination is not required to change production
LEAD TIME FOR CHANGES
Before: 3-6 months for small or large changes
Now: As little as 1-7 days for changes
DEPLOYMENT RATE
Before: 12 deployments per year
Now: Over 70 deployments per week
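To put the two rates on the same scale, the weekly figure can be annualized. This is plain arithmetic on the numbers quoted in the deck, assuming a 52-week year; the variable names are illustrative.

```python
WEEKS_PER_YEAR = 52  # assumption: deployments continue every week of the year

old_per_year = 12    # from the deck: 12 deployments per year
new_per_week = 70    # from the deck: over 70 deployments per week

# 70 per week sustained for 52 weeks.
new_per_year = new_per_week * WEEKS_PER_YEAR

# How many times more often changes reach production.
speedup = new_per_year / old_per_year

# The title's round figure of 3500 per year, expressed per week.
title_per_week = 3500 / WEEKS_PER_YEAR

print(new_per_year, round(speedup), round(title_per_week, 1))
```

The title's 3500 per year and the "70 per week" figure agree in round numbers: 3500 per year is about 67 per week over 52 weeks, or exactly 70 per week over 50 weeks.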
TAKEAWAYS
Introduce changes to your team as you would to production: 1 change at a time
Our experience is not the result of blind application of best practices; it's about retrospecting and iterating