remediation patterns Jez Humble
QCon London 2011
[email protected] @jezhumble #continuousdelivery
Thursday, March 10, 2011
ITIL: “Recovery to a known state after a failed Change or Release.”
Recovery: “Returning a Configuration Item or an IT Service to a working state.”
Jez: “Fixing shit when it breaks”
remediation
http://www.knowledgetransfer.net/dictionary/ITIL/en/Recovery.htm
http://www.knowledgetransfer.net/dictionary/ITIL/en/Change.htm
http://www.knowledgetransfer.net/dictionary/ITIL/en/Release.htm
http://www.knowledgetransfer.net/dictionary/ITIL/en/Configuration_Item.htm
http://www.knowledgetransfer.net/dictionary/ITIL/en/IT_Service.htm
prevention
patterns for low-risk release
patterns for incremental delivery
strategies for remediation
1oz of prevention
deployment pipeline
testing on production environments
creating maintainable acceptance tests
testing cross-functional requirements
the hard bits
automate provisioning and deployment
ensure devs, testers and ops collaborate throughout
reducing release risk
canary releasing
reduce risk of release
multivariant testing
performance testing
canary releasing
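One way to picture canary releasing is a router that sends a small, stable fraction of users to the new version while everyone else stays on the old one. The sketch below is illustrative only: `CanaryRouter`, the version labels and the percentage are hypothetical names, not something shown in the talk.

```java
// Hypothetical sketch: route a fixed percentage of users to the canary version.
final class CanaryRouter {
    private final String stableVersion;
    private final String canaryVersion;
    private final int canaryPercent; // share of users on the canary, 0..100

    CanaryRouter(String stableVersion, String canaryVersion, int canaryPercent) {
        if (canaryPercent < 0 || canaryPercent > 100) {
            throw new IllegalArgumentException("canaryPercent must be between 0 and 100");
        }
        this.stableVersion = stableVersion;
        this.canaryVersion = canaryVersion;
        this.canaryPercent = canaryPercent;
    }

    /** Picks a version for the user; the same user id always lands in the same bucket. */
    String versionFor(String userId) {
        int bucket = Math.floorMod(userId.hashCode(), 100); // bucket in 0..99
        return bucket < canaryPercent ? canaryVersion : stableVersion;
    }
}
```

Raising `canaryPercent` step by step widens the release; setting it back to 0 is the remediation path, since no redeploy is needed to take users off the new version.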
immune system
what if someone replaced your “buy” button with spacer.gif?
T cells http://www.flickr.com/photos/gehealthcare/3326186490/
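The "buy button replaced with spacer.gif" scenario suggests watching a business metric rather than only technical ones: the site stays up, but orders stop. A minimal sketch of such a check follows, assuming a hypothetical `ImmuneSystem` class and an orders-per-minute metric; the threshold and names are illustrative, not taken from the talk.

```java
// Hypothetical sketch: flag a release for rollback when a business metric drops.
final class ImmuneSystem {
    private final double maxDropFraction; // e.g. 0.2 tolerates up to a 20% drop

    ImmuneSystem(double maxDropFraction) {
        this.maxDropFraction = maxDropFraction;
    }

    /** Returns true when the current rate has fallen too far below the baseline. */
    boolean shouldRollBack(double baselineOrdersPerMinute, double currentOrdersPerMinute) {
        if (baselineOrdersPerMinute <= 0) {
            return false; // no meaningful baseline to compare against
        }
        double drop = (baselineOrdersPerMinute - currentOrdersPerMinute) / baselineOrdersPerMinute;
        return drop > maxDropFraction;
    }
}
```

With a 20% tolerance, a fall from 100 orders per minute to 0 triggers a rollback, while a dip to 95 does not.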
Business metrics - revenue, # orders, # users
Ops metrics - changes, incidents, TTD, TTR, TBF
Technical metrics - TPS, response time, hits
monitoring
http://www.flickr.com/photos/wwarby/3296379139/
root cause analysis
collaboration
data
the hard bits
incremental delivery
John Allspaw: “Ops Metametrics” http://slidesha.re/dsSZIr
incremental deployments
develop on mainline
feature toggles and branch by abstraction
dark launching
incremental delivery
feature toggles
[featureToggles]
wobblyFoobars: true
flightyForkHandles: false
Config File
... various UI elements
some.jsp
forkHandle = (featureConfig.isOn("flightyForkHandles")) ? new FlightyForkHandler(aCandle) : new ForkHandler(aCandle);
other.java
Stolen from Martin Fowler
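The slide uses a `featureConfig.isOn(...)` call without showing what sits behind it. One possible shape is sketched below, assuming the toggles are plain `name: value` lines and leaving out the `[featureToggles]` section header; the `FeatureConfig` class and its `parse` method are hypothetical, not the author's implementation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch of the featureConfig object used on the slide.
final class FeatureConfig {
    private final Properties toggles = new Properties();

    private FeatureConfig() {}

    /** Parses "name: true|false" lines; java.util.Properties accepts ':' as a separator. */
    static FeatureConfig parse(String text) {
        FeatureConfig config = new FeatureConfig();
        try {
            config.toggles.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config;
    }

    /** Unknown toggles count as off, so unfinished features stay hidden by default. */
    boolean isOn(String featureName) {
        return Boolean.parseBoolean(toggles.getProperty(featureName, "false").trim());
    }
}
```

Defaulting unknown names to off means a typo in a toggle name hides the feature rather than exposing it, which is the safer failure for a release.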
branch by abstraction
Component A
Component B
Seam
Component A
branch by abstraction
Component A
Component A
Component B’
Abstraction layer
Component B
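The two diagrams can be read as: put an abstraction layer at the seam between Component A and Component B, then build the replacement B' behind it while B keeps running. A minimal sketch under that reading follows; the price-calculation domain and all class names are hypothetical stand-ins for the components on the slides.

```java
// Abstraction layer: Component A talks only to this interface.
interface PriceCalculator {
    long totalCents(long unitPriceCents, int quantity);
}

// Component B: the existing implementation, still used in production.
final class LegacyPriceCalculator implements PriceCalculator {
    @Override
    public long totalCents(long unitPriceCents, int quantity) {
        long total = 0;
        for (int i = 0; i < quantity; i++) {
            total += unitPriceCents;
        }
        return total;
    }
}

// Component B': the replacement, developed incrementally on mainline.
final class NewPriceCalculator implements PriceCalculator {
    @Override
    public long totalCents(long unitPriceCents, int quantity) {
        return unitPriceCents * quantity;
    }
}

// Component A: unaware of which implementation sits behind the abstraction.
final class Checkout {
    private final PriceCalculator calculator;

    Checkout(PriceCalculator calculator) {
        this.calculator = calculator;
    }

    long orderTotalCents(long unitPriceCents, int quantity) {
        return calculator.totalCents(unitPriceCents, quantity);
    }
}
```

Swapping `new Checkout(new LegacyPriceCalculator())` for `new Checkout(new NewPriceCalculator())` is then a one-line change, and it can be reversed just as easily if B' misbehaves.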
incremental deployment
STATIC CONTENT
/static/1.1
/static/1.0
DEPENDENT SERVICE
1.0 1.1
Abstraction layer Abstraction layer
APPLICATION
Database
Router /Load balancer
Interwebs
dark launching
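Dark launching is usually described as exercising a new code path with production traffic before users can see its results. The sketch below shows that idea in its simplest synchronous form; the `DarkLaunch` class is hypothetical, and a real system would more likely run the dark path asynchronously so it cannot slow down the live response.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical sketch: run the new path in the dark and only record what happened.
final class DarkLaunch<T> {
    private int mismatches = 0;
    private int failures = 0;

    /** Returns the live result; the dark path runs too, but its result is never shown to users. */
    T run(Supplier<T> livePath, Supplier<T> darkPath) {
        T liveResult = livePath.get();
        try {
            T darkResult = darkPath.get();
            if (!Objects.equals(liveResult, darkResult)) {
                mismatches++; // new path disagrees with the current behaviour
            }
        } catch (RuntimeException e) {
            failures++; // a broken dark path must not affect users
        }
        return liveResult;
    }

    int mismatches() {
        return mismatches;
    }

    int failures() {
        return failures;
    }
}
```

The two counters are the point of the exercise: they show how the new path behaves under real load before it is switched on for anyone.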
How long would it take you to release a change to a single line of code?
Ops metrics - changes, incidents, TTD, TTR, TBF
If your data center blew up, how long would you take to restore service?
measuring effectiveness
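The ops metrics can be computed from incident records once each incident has timestamps. The sketch below reads TTD as time to detect and TTR as time to resolve, measured from the start of the failure; both readings are assumptions, as the slides do not expand the abbreviations, and the `Incident` and `OpsMetrics` classes are hypothetical. TBF is left out.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical incident record with the three timestamps the metrics need.
final class Incident {
    final Instant started;
    final Instant detected;
    final Instant resolved;

    Incident(Instant started, Instant detected, Instant resolved) {
        this.started = started;
        this.detected = detected;
        this.resolved = resolved;
    }

    Duration timeToDetect() {
        return Duration.between(started, detected);
    }

    // Assumption: measured from the start of the failure, not from detection.
    Duration timeToResolve() {
        return Duration.between(started, resolved);
    }
}

final class OpsMetrics {
    private OpsMetrics() {}

    /** Mean time to detect across incidents; zero when there are none. */
    static Duration meanTimeToDetect(List<Incident> incidents) {
        Duration total = Duration.ZERO;
        for (Incident incident : incidents) {
            total = total.plus(incident.timeToDetect());
        }
        return incidents.isEmpty() ? Duration.ZERO : total.dividedBy(incidents.size());
    }

    /** Mean time to resolve across incidents; zero when there are none. */
    static Duration meanTimeToResolve(List<Incident> incidents) {
        Duration total = Duration.ZERO;
        for (Incident incident : incidents) {
            total = total.plus(incident.timeToResolve());
        }
        return incidents.isEmpty() ? Duration.ZERO : total.dividedBy(incidents.size());
    }
}
```

Tracking these means release over release gives a trend line for how quickly failures are noticed and fixed.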
questions
Jez Humble
[email protected] @jezhumble #continuousdelivery