Availability
In the name of ALLAH
Software Availability, By
Reza Same'e
SOFTWARE DEVELOPER @ BISPHONE
At ZCONF - 6th Sep 2015 | Shahrivar 1394
Availability
Resiliency
Availability ?
Available : Present or Ready for Immediate use
Availability ?
As far as they ( USERS ) know, when the response time exceeds their expectation, the system is down.
Availability ?
Available : Ready for Immediate use and Quick to React
How to Measure ?
https://en.wikipedia.org/wiki/High_availability
Availability = MTTF / ( MTTF + MTTR )
MTTF = Mean Time To Failure ( ~= Uptime )
MTTR = Mean Time To Recovery ( ~= Downtime )

Availability ( percent )   Name            Downtime per year
99.9999999                 "Nine Nines"    Less than 32 ms
99.99999                   "Seven Nines"   About 3 sec
99.999                     "Five Nines"    About 5 min
99.9                       "Three Nines"   About 9 hours
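The downtime figures can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a 365-day year; the function names are illustrative and not from the talk.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 s, assuming a non-leap year


def availability(mttf: float, mttr: float) -> float:
    """Availability = MTTF / (MTTF + MTTR); both values in the same time unit."""
    return mttf / (mttf + mttr)


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Expected downtime per year for a given availability percentage."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


if __name__ == "__main__":
    for percent in (99.9999999, 99.99999, 99.999, 99.9):
        seconds = downtime_seconds_per_year(percent)
        print(f"{percent}% -> {seconds:.3f} s of downtime per year")
```

Three nines works out to 31,536 seconds, which is the "about 9 hours" row; five nines is about 315 seconds, a little over 5 minutes.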
Why Does It Matter ?
- Critical Systems ( Health-Care , … )
- Business
- Our Quality of Life (^_^)
Amazon Prime Day: 34.4 Million Items, 398 / Second
http://www.forbes.com/sites/ryanmac/2015/07/16/amazon-says-prime-day-was-huge-success-and-vows-to-repeat-it-despite-customer-criticism/
http://www.forbes.com/sites/ilyapozin/2013/10/17/industry-to-watch-in-2014-healthcare-tech/
Problem ?
Failures Are Everywhere !
Solution ?
You Can't Prevent Failures
… then You Should Manage Them.
- ?
Reduce MTTR
Availability = MTTF / ( MTTF + MTTR )
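A small worked example of why MTTR is the lever to pull. The numbers are made up for illustration: a system that fails about every 1000 hours and takes 1 hour to recover.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Availability = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical baseline: one failure per ~1000 h, 1 h to recover.
baseline = availability(1000, 1)

# Option A: recover 10x faster (6 minutes instead of 1 hour).
faster_recovery = availability(1000, 0.1)

# Option B: fail 10x less often, with the same recovery time.
rarer_failures = availability(10_000, 1)

if __name__ == "__main__":
    print(f"baseline        : {baseline:.6f}")
    print(f"faster recovery : {faster_recovery:.6f}")
    print(f"rarer failures  : {rarer_failures:.6f}")
```

Only the ratio of MTTF to MTTR matters in the formula, so both options give the same availability. If failures cannot be prevented, shortening recovery is usually the option still within reach.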
Reactive Manifesto ?
- Responsive : React To Users
- Message Driven : Modules / Components Interaction
- Resilient : React To Failures
- Elastic : React To Load

Goal : Responsive
Principles : Elastic, Resilient
Method : Message Driven
http://www.reactivemanifesto.org/
Reactive Manifesto and Availability ?
Available = Responsive + Resilient
Availability Depends On Resiliency
Resiliency
Resiliency means React to Failures
A resilient system keeps processing transactions, even
when there are transient impulses, persistent stresses, or
component failures disrupting normal processing. This is
what most people mean when they just say stability.
Resiliency
Design For Resiliency in Real World
Resiliency
Isolation
Communication
Failures
Isolation Over Functionality & Failures +
Abstraction Over Accessibility
Isolation: Functionality, Resources & State
- Single Responsibility
- Share Nothing
- Stateless
- Eventual Consistency & Idempotency
- …
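Idempotency is what makes retries and redelivery safe: applying the same message twice must leave the same state as applying it once. This is a minimal sketch, assuming every message carries a unique id; the `AccountLedger` class and its fields are hypothetical.

```python
class AccountLedger:
    """Applies each deposit message at most once, keyed by message id."""

    def __init__(self) -> None:
        self.balance = 0
        self._seen_ids = set()  # ids of messages already applied

    def apply_deposit(self, message_id: str, amount: int) -> bool:
        """Return True if the deposit was applied, False if it was a duplicate."""
        if message_id in self._seen_ids:
            return False  # redelivered message: ignore, state is unchanged
        self._seen_ids.add(message_id)
        self.balance += amount
        return True


if __name__ == "__main__":
    ledger = AccountLedger()
    ledger.apply_deposit("msg-1", 50)
    ledger.apply_deposit("msg-1", 50)  # a retry of the same message
    print(ledger.balance)
```

A sender that is unsure whether a message arrived can simply send it again, which keeps the failure handling on the sending side simple.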
Isolation: Functionality, Resources & State
Bulkhead
- Isolation & Redundancy
- Isolation Over Failure
- Prevent Chain of Failure
https://en.wikipedia.org/wiki/Compartment_(ship)
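One common way to build a bulkhead in software is to give each dependency its own small, fixed pool of slots, so that a slow dependency can exhaust only its own compartment. This is a sketch under that assumption; `Bulkhead` and `BulkheadFullError` are illustrative names, not a library API.

```python
import threading


class BulkheadFullError(RuntimeError):
    """Raised when a compartment has no free slot."""


class Bulkhead:
    """Limits concurrent calls into one dependency."""

    def __init__(self, name: str, max_concurrent: int) -> None:
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, fn, *args, **kwargs):
        # Do not wait for a slot: reject immediately so callers are not dragged down.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError(f"bulkhead '{self.name}' is full")
        try:
            return fn(*args, **kwargs)
        finally:
            self._slots.release()


if __name__ == "__main__":
    payments = Bulkhead("payments", max_concurrent=2)
    search = Bulkhead("search", max_concurrent=10)
    print(payments.run(lambda: "charged"))
    print(search.run(lambda: "results"))
```

If every `payments` slot is stuck on a slow call, further payment calls are rejected at once, while `search` keeps working with its own slots.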
Communication
Location Transparency
DNS, Load Balancers, Message Brokers
Face 2 Face vs. Phone/Email
"Where" != "Who"
Communication
Avoid Call Stack
Communication
Async – 1 : Event Driven
- Concurrency isn't Easy !
- Break Isolation ( State, Resources-Context )
- Isolation Over Failure
- Lock-Free
Communication
Async – 2 : Message Driven
The Big Idea is “Messaging” – Alan Kay
- Lock-free & Non-Blocking
- Lead to Elasticity ( Scalability )
- Throttling
- Location Transparency
- Isolation Over Failure
- Share Nothing & Bulkhead
- Concurrency is Easy !
- Very Flexible
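A message-driven component can be sketched with nothing more than a mailbox and one thread that owns the state. This is a simplified illustration, not an actor library: `CounterActor` is a hypothetical name, and the bounded mailbox stands in for throttling.

```python
import queue
import threading


class CounterActor:
    """Owns its state; the only way to change it is to send it a message."""

    _STOP = object()  # sentinel message that ends the actor's loop

    def __init__(self, mailbox_size: int = 100) -> None:
        self._mailbox = queue.Queue(maxsize=mailbox_size)  # bounded mailbox
        self.total = 0  # modified only by the actor's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, amount: int, timeout: float = 1.0) -> None:
        """Enqueue a message; raises queue.Full if the mailbox stays full."""
        self._mailbox.put(amount, timeout=timeout)

    def stop(self) -> None:
        """Ask the actor to finish queued messages, then wait for it to exit."""
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self) -> None:
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self.total += message  # no lock needed: single owner of the state


if __name__ == "__main__":
    actor = CounterActor()
    for _ in range(10):
        actor.send(1)
    actor.stop()
    print(actor.total)
```

Senders never touch `total` directly, so no locks are shared between them. When the mailbox is full, senders feel back-pressure instead of the actor being overloaded.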
Communication
And More …
- Avoid Unlimited Resources
- Strict And Bug-Free API
- Use Timeout
- ...
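"Use Timeout" and "Avoid Unlimited Resources" can be shown together: a fixed-size worker pool plus a deadline on every call. This is a minimal sketch using the standard `concurrent.futures` module; `call_with_timeout` is an illustrative helper and the pool size of 4 is arbitrary.

```python
import concurrent.futures
import time

# A fixed-size pool: a bounded resource, not one thread per request.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def call_with_timeout(fn, timeout_seconds: float, *args):
    """Run fn(*args) in the pool and give up waiting after timeout_seconds."""
    future = _executor.submit(fn, *args)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # Best effort only: a call that is already running cannot be interrupted.
        future.cancel()
        raise TimeoutError(f"no reply within {timeout_seconds} s") from None


if __name__ == "__main__":
    print(call_with_timeout(lambda: "pong", 1.0))
    try:
        call_with_timeout(time.sleep, 0.05, 0.5)  # sleeps 0.5 s, deadline 0.05 s
    except TimeoutError as error:
        print("timed out:", error)
```

The caller gets control back at the deadline, but the slow call still occupies a pool thread until it finishes. That is one reason timeouts are usually combined with the other items on this list.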
Failure
In a Resilient System
Failures Are First-Class
( Fault Tolerance )
Failure
Fail Fast : Immediate & Visible
before ... Corrupting State
& Becoming a ZOMBIE :(
Crash Safely
http://bond.trendolizer.com/2015/01/how-to-jump-out-of-a-moving-car-and-survive.html
Failure
Fail Fast : Immediate & Visible
- Circuit Breaker
- Strict API
- Avoid Default Values
- Timeout
- Shed Load
- …
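"Strict API" and "Avoid Default Values" both come down to refusing bad input at the boundary, before any state is touched. This is a small sketch; `load_port` and `ConfigError` are hypothetical names.

```python
class ConfigError(ValueError):
    """Raised immediately when a setting is missing or invalid."""


def load_port(settings: dict) -> int:
    """Return the configured port, or fail at once with a visible error."""
    if "port" not in settings:
        # No silent fallback such as 8080: a missing value is a failure.
        raise ConfigError("missing required setting: port")
    raw = settings["port"]
    try:
        port = int(raw)
    except (TypeError, ValueError):
        raise ConfigError(f"port is not an integer: {raw!r}") from None
    if not 1 <= port <= 65535:
        raise ConfigError(f"port out of range: {port}")
    return port


if __name__ == "__main__":
    print(load_port({"port": "8080"}))
    try:
        load_port({})
    except ConfigError as error:
        print("refused to start:", error)
```

The process fails at startup, where the error is easy to see, rather than later with a half-working configuration.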
Failure
Circuit Breaker
A little Fail-Fast
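A circuit breaker counts failures and, past a threshold, fails fast for a while instead of calling the broken dependency again. This is a simplified, single-threaded sketch; the thresholds, the injectable `clock` parameter, and the class names are assumptions for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to stay open before a trial call
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit is open; failing fast")
            # Reset window elapsed: half-open, let this one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.max_failures:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success: close the circuit and forget past failures.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers get an immediate `CircuitOpenError` rather than waiting on a timeout, and the failing dependency gets time to recover.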
Failure
Supervisor
Supervise ME :)
- Monitor
- Restart
- Stop
- Escalate
http://www.topdreamer.com/funny-cute-baby-faces-photos/
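The four supervisor duties can be sketched in a few lines. This is a deliberately simplified, synchronous illustration rather than an Erlang or Akka supervisor; `Supervisor`, `EscalatedFailure`, and the `make_worker` factory are hypothetical.

```python
class EscalatedFailure(RuntimeError):
    """Raised when the supervisor gives up and hands the failure upward."""


class Supervisor:
    def __init__(self, max_restarts: int = 3) -> None:
        self.max_restarts = max_restarts
        self.restarts = 0

    def supervise(self, make_worker):
        """Run worker.run() until it returns; restart it when it crashes."""
        while True:
            worker = make_worker()      # a fresh instance starts from clean state
            try:
                return worker.run()     # monitor: a crash surfaces here
            except Exception as error:
                if self.restarts >= self.max_restarts:
                    # Stop restarting and escalate to whoever supervises us.
                    raise EscalatedFailure("restart limit reached") from error
                self.restarts += 1      # restart: loop and build a new worker
```

The worker itself contains no recovery code; it simply crashes, and the decision to restart, stop, or escalate lives in one place.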
Failure
Error Kernel
[ Diagram: a component is FAILED, then RESTARTED ]
- One-For-One
- All-For-One
Failure
Redundancy
http://askatoddler.com/redundancy/
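Redundancy only helps if callers can actually switch to the spare. This is a minimal failover sketch, assuming replicas are plain callables tried in order; `call_with_failover` is an illustrative name.

```python
def call_with_failover(replicas, request):
    """Try each replica in order and return the first successful reply."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except Exception as error:
            errors.append(error)  # remember why this replica failed, try the next
    raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")


if __name__ == "__main__":
    def primary(request):
        raise ConnectionError("primary is down")

    def backup(request):
        return f"backup handled {request}"

    print(call_with_failover([primary, backup], "GET /status"))
```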
And More ...
Test
- Platform, Tools & Framework
- Pull The Plug
http://blog.mmeconsulting.com/a-simple-but-costly-mistake/man-yanking-electrical-cord/
And More ...
Platform
- Experience
- Maturity & Tools
- Platform Dependent: GC , ...
And More ...
UI & UX
- Hide Failures
- Consistency
Summary
- Isolation is the first step toward Resiliency
- Better Isolation by "Async" Communication
- Manage Failures with Fail-Fast and Supervisors
- Hide Failures in UI & UX
GoodLuck (^_^)
Your Quality of life after release 1.0 depends on
choices you make long before that vital milestone.
People's Quality Of Life Depends On Our Choices
– Me :)
GoodLuck (^_^)
- Question ?
- Thanks :)
Reza Same'e
SOFTWARE DEVELOPER @ BISPHONE
< [email protected] | http://samee.blog.ir | @reza_samee >
Interested In Scala, Functional and Reactive
We Are Hiring! If you are interested in Scala / Java or Erlang,
let us know: [email protected]