Availability, the Cloud and Everything
Joe Williams
Saturday, October 2, 2010
Me
• Joe Williams
• Infrastructure Engineer
• Cloudant
• @williamsjoe
• joeandmotorboat.com
• Distributed database built on CouchDB
• Real-time Search and Analytics
• Sign Up! (Free to 256MB)
• cloudant.com
• http://github.com/cloudant/bigcouch
Bias
• Distributed Databases (CouchDB)
• Amazon EC2
• Chef
• Erlang
Availability
Availability
• What is Availability?
Availability
Availability
“System availability refers to the accessibility of system services to users. A system is available if it is operational for an overwhelming fraction of the time. Unlike reliability, availability is instantaneous.”
Availability
“System reliability refers to the property of tolerating constituent component failures, for the longest time. A system is perfectly reliable if it never fails.”
Availability
• Reliability * Availability = Dependability
Availability
• Availability & Reliability
• Mean time to failure
• Mean time to repair
• Durability
• Fault isolation
• Fault tolerance
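Mean time to failure (MTTF) and mean time to repair (MTTR) are commonly combined into a steady-state availability figure, MTTF / (MTTF + MTTR). A minimal sketch, assuming a repairable system with constant averages; the function name and the sample numbers are illustrative, not from the talk:

```python
def steady_state_availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time a repairable system is up: MTTF / (MTTF + MTTR)."""
    if mttf_hours < 0 or mttr_hours < 0 or (mttf_hours + mttr_hours) == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical node: fails about every 1,000 hours and takes 1 hour to repair.
node_availability = steady_state_availability(1000.0, 1.0)
print(f"{node_availability:.5f}")
```

Under this model, halving MTTR helps about as much as doubling MTTF, which is one reason fast repair matters as much as rare failure.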
Availability
• Uptime / Downtime
• Perceived
• Actual
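To make uptime figures concrete, an availability percentage can be turned into a yearly downtime budget. A rough sketch, assuming a 365-day year; the helper name is illustrative:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability: float) -> float:
    """Hours of downtime per year implied by an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * HOURS_PER_YEAR


for target in (0.99, 0.999, 0.9999):
    print(f"{target:.2%} available -> {downtime_hours_per_year(target):.2f} hours down per year")
```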
Availability
• Probabilistic Risk Assessment
• Event Tree Analysis
• Fault Tree Analysis
Apthorpe (http://www.usenix.org/events/lisa01/tech/apthorpe/apthorpe.ps)
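A fault tree combines basic failure events through AND and OR gates to estimate the probability of a top-level failure. The sketch below assumes all basic events are independent; the tree and its probabilities are made up for illustration and are not taken from Apthorpe's paper:

```python
from math import prod


def and_gate(*probabilities: float) -> float:
    """Output fails only if every input fails (independent events)."""
    return prod(probabilities)


def or_gate(*probabilities: float) -> float:
    """Output fails if at least one input fails (independent events)."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical tree: the service is down if the load balancer fails
# OR both database replicas fail at the same time.
p_load_balancer = 0.001
p_replica = 0.01
p_service_down = or_gate(p_load_balancer, and_gate(p_replica, p_replica))
print(f"P(service down) = {p_service_down:.7f}")
```

In this made-up tree the single load balancer dominates the result, which is how the analysis surfaces single points of failure.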
The Cloud
The Cloud
“It never gets easier, you just go faster.” - Greg LeMond
The Cloud
• Abstraction
• Commoditization
• Homogeneous
• Ephemeral
The Cloud
• Costs
• Loss of Control
• Single Points of Failure
• Network Partitions / Data Locality
• Unreliable
• Performance
The Cloud
• Benefits
• API to everything
• Fast and Flexible Resource Mgmt
• “Unlimited” Resources
The Cloud
• Bootstrapping
• Time and Effort
Adam Jacob and Ezra Zygmuntowicz (http://blip.tv/file/2285124/)
The Cloud
• Nodes are stateless and disposable.
The Cloud
"Clouds are systems ... and with systems, you have to think hard and know how to deal with issues in that environment. The scale is so much bigger, and you don't have the physical control. But we think people should be optimistic about what we can do here. If we are clever about deploying cloud computing with a clear-eyed notion of what the risk models are, maybe we can actually save the economy through technology."
- "Security in the Ether" by David Talbot, MIT Technology Review, Jan/Feb 2010
What’s Next
• Distributed Systems
• Automation
• Data Driven Operations
Distributed Systems
Baran (http://www.rand.org/pubs/research_memoranda/RM3420/)
Distributed Systems
• RAID ain’t as redundant as it used to be.
Leventhal (http://queue.acm.org/detail.cfm?id=1670144)
Distributed Systems
• Redundancy
• Duplication
• Distribution
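One way to see why redundancy pays off: if any single replica can serve requests, the service is down only when every replica is down at once. A back-of-the-envelope sketch, assuming replicas fail independently (which shared racks, zones, or networks can violate); the function name and numbers are illustrative:

```python
def parallel_availability(node_availability: float, replicas: int) -> float:
    """Availability of a service that is up while at least one replica is up."""
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - node_availability) ** replicas


# Hypothetical nodes that are each up 95% of the time.
for n in (1, 2, 3):
    print(f"{n} replica(s): {parallel_availability(0.95, n):.6f}")
```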
Distributed Systems
• Alphabet Soup
• ACID, CAP, BASE, 2PC, MVCC
• Vector Clocks, Eventual Consistency
• Dynamo, Paxos, Chandra, Byzantine
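Of the terms above, vector clocks are easy to show in a few lines: each node keeps a counter per node, and comparing two clocks tells you whether one update precedes the other or whether they conflict. A minimal sketch with illustrative names, not the implementation used by any particular database:

```python
from typing import Dict

VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, used when a node reconciles two versions."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def less_or_equal(a: VectorClock, b: VectorClock) -> bool:
    """True if every counter in a is <= the matching counter in b."""
    return all(a.get(n, 0) <= b.get(n, 0) for n in a.keys() | b.keys())


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    return less_or_equal(a, b) and not less_or_equal(b, a)


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """Neither clock descends from the other: a conflict to resolve."""
    return not less_or_equal(a, b) and not less_or_equal(b, a)


base = increment({}, "n1")     # first write, on node n1
left = increment(base, "n1")   # n1 updates again
right = increment(base, "n2")  # n2 updates the same base independently
print(happened_before(base, left), concurrent(left, right), merge(left, right))
```

Here `left` and `right` are concurrent, so an eventually consistent store would have to keep both or apply a resolution rule.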
Distributed Systems
• CAP == Availability
Distributed Systems
• Erlang
• Distributed
• Concurrent
• Fault Tolerant
Distributed Systems
• Erlang
• Supervision Trees
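The idea behind supervision is that a worker is allowed to crash and a supervisor restarts it, giving up only when restarts happen too often. The sketch below is a loose Python analogy with invented names; it is not Erlang/OTP's supervisor API and it ignores processes, restart time windows, and restart strategies:

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class RestartLimitExceeded(RuntimeError):
    """Raised when a worker keeps crashing past the allowed restart count."""


def supervise(worker: Callable[[], T], max_restarts: int = 3) -> T:
    """Run worker; on any exception restart it, up to max_restarts times."""
    restarts = 0
    while True:
        try:
            return worker()
        except Exception as exc:
            if restarts >= max_restarts:
                raise RestartLimitExceeded(
                    f"worker crashed {restarts + 1} times"
                ) from exc
            restarts += 1


# Hypothetical worker that crashes twice before succeeding.
attempts = {"count": 0}


def flaky_worker() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("simulated crash")
    return "ok"


result = supervise(flaky_worker, max_restarts=5)
print(result, attempts["count"])
```

Because `supervise` raises when it gives up, one supervised call can itself be the worker of an outer `supervise`, which is a rough stand-in for the tree shape.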
Distributed Systems
• Erlang
• Hot Code Upgrades
• Distributed Upgrades are HARD
Distributed Systems
• Future Work
• Erlang Supervision Trees
• PRA / FTA / ETA
Apthorpe (http://www.usenix.org/events/lisa01/tech/apthorpe/apthorpe.ps)
Automation
Automation
• Optimal use of the cloud.
Automation
• Frequent deployment.
Automation
• Tools
• Chef
• Puppet
• Cfengine
• Bcfg2
Automation
• Erlang + Chef (as of v0.8)
• erl_call Provider
Data Driven Operations
Data Driven Operations
“What gets measured, gets managed.” - Peter Drucker
Data Driven Operations
• Instrumentation
Data Driven Operations
• Logging
Data Driven Operations
• Visualization
Data Driven Operations
• Demo!
Data Driven Operations
• Modeling
• Analysis
• Universal Law of Computational Scalability
• Amdahl’s Law
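Both laws named above fit in a few lines. Amdahl's Law bounds speedup by the serial fraction of the work; the Universal Scalability Law adds a coherency term, so throughput can peak and then fall as nodes are added. A sketch with illustrative parameter names and made-up coefficients:

```python
def amdahl_speedup(n: int, parallel_fraction: float) -> float:
    """Amdahl's Law: speedup on n processors when parallel_fraction of the work parallelizes."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n)


def usl_capacity(n: int, alpha: float, beta: float) -> float:
    """Universal Scalability Law: relative capacity at n nodes.

    alpha models contention (serialization); beta models coherency (crosstalk) cost.
    """
    return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


# Hypothetical coefficients; in practice they are fitted to measured throughput.
for nodes in (1, 8, 32, 128):
    print(
        nodes,
        round(amdahl_speedup(nodes, 0.95), 2),
        round(usl_capacity(nodes, 0.05, 0.001), 2),
    )
```

With `beta` set to 0 the USL reduces to Amdahl's Law with a serial fraction of `alpha`.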
Data Driven Operations
• Modeling isn’t just for capacity planning.
Montagne (http://queue.acm.org/detail.cfm?id=1862187)
The End
Questions?
Joe Williams - @williamsjoe