How DreamHost builds a Public Cloud with OpenStack

Carl Perry <carl.perry@dreamhost.com>
twitter/github/slideshare: edolnx
irc: carlp@freenode

Well Hello!

• I’m the Cloud Architect at DreamHost

• We’ve been around since 1997

• We’re old enough to drive next year!

• Spun off Inktank as a support company for Ceph last year

• Launched DreamObjects, a Ceph based S3 alternative, in September

• This week we Launched DreamCompute, our Public Cloud

Why?

Image: http://www.flickr.com/photos/toywhirl/8050771631/

“To empower entrepreneurs and developers”

Design Tenets

• Design for Reliability

• Maintenance is the norm not the exception

• Isolate tenants from each other by default

• Modular equipment design

• Easy to expand

• Easy to upgrade

• Automate Everything

Considerations

• Scalability

• Speed

• Monitoring

• Uptime

• Security

• Cost

Obstacles

Image: http://www.flickr.com/photos/brewbooks/4206976341/

Storage

• Must be shared, local storage prevents maintenance

• Has to be cost effective

• Has to be massively scalable

• Must run on commodity hardware

• Single solution for boot and additional volumes

• Fully automatable

Networking

• Must support IPv6

• Tenants must be isolated from each other

• Cannot be limited to a physical location within a data center

• 10Gb, lots of 10Gb

• No single point of failure (core switches are so 1980)

Hypervisor

• Simpler is better

• Should run on Linux

• Support for architectures other than x86(_64) is a huge bonus

• Must not require guest operating system modifications

1998 called, they are disappointed

• We expect to operate this for more than 6 months, so IPv6 is a requirement.

• There are new and exciting problems to solve, but it’s past time

• It’s a great way to piss off vendors

• Best Part: Everything is Internet Addressable!
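One way to picture "everything is Internet addressable" together with per-tenant isolation is to give each tenant its own /64 out of a larger IPv6 block. The sketch below is only an illustration: the slides do not describe DreamHost's addressing plan, the prefix is the reserved IPv6 documentation range rather than a real allocation, and `tenant_networks` and `overlaps` are hypothetical helper names.

```python
import ipaddress

# Assumption: 2001:db8::/48 sits inside the IPv6 documentation range and
# stands in for a provider allocation. It is not DreamHost's real prefix.
PROVIDER_BLOCK = ipaddress.ip_network("2001:db8::/48")


def tenant_networks(block, count):
    """Return the first `count` /64 networks carved out of `block`."""
    subnets = block.subnets(new_prefix=64)  # lazy generator of /64 networks
    return [next(subnets) for _ in range(count)]


def overlaps(nets):
    """Return True if any two networks in `nets` share address space."""
    return any(a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:])


nets = tenant_networks(PROVIDER_BLOCK, 3)
for tenant_id, net in enumerate(nets):
    print(f"tenant-{tenant_id}: {net}")

# A /48 holds 2**(64 - 48) distinct /64 networks.
print("total /64s available:", 2 ** (64 - PROVIDER_BLOCK.prefixlen))
```

Each tenant network is globally routable address space, so isolation has to come from the network layer rather than from private addressing.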

Decision Time

Image: http://www.flickr.com/photos/inafrenzy/5787848646/

Hypervisor

• Scalability: No changes needed for 2-2000 VMs

• Speed: Fast. Especially when using virtio drivers

• Monitoring: Lots of support for existing systems, hooks for custom ones

• Uptime: Kernel module and userspace app. Easy to patch. Supports live migration

• Security: Built into kernel, lots of eyes.

• Cost: Free
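Live migration is what makes "maintenance is the norm" workable: before a hypervisor host is patched, its VMs are moved elsewhere. The following is a minimal planning sketch under stated assumptions, not DreamHost's tooling. `plan_drain`, the host names, and the slot-count capacity model are all hypothetical, and the sketch only decides where each VM would go; it does not call any hypervisor API.

```python
def plan_drain(host, placements, free_slots):
    """Greedy plan for moving every VM off `host`.

    placements: dict mapping VM name -> host it currently runs on
    free_slots: dict mapping host -> number of additional VMs it can accept
    Returns a dict mapping VM name -> target host.
    Raises RuntimeError if the remaining hosts run out of capacity.
    """
    # Never plan a move back onto the host being drained.
    capacity = {h: n for h, n in free_slots.items() if h != host}
    plan = {}
    for vm in sorted(v for v, h in placements.items() if h == host):
        # Pick the host with the most free slots (first one wins on ties).
        target = max(capacity, key=capacity.get, default=None)
        if target is None or capacity[target] == 0:
            raise RuntimeError(f"no capacity left to move {vm}")
        plan[vm] = target
        capacity[target] -= 1
    return plan


placements = {"vm-a": "hv1", "vm-b": "hv1", "vm-c": "hv2"}
free_slots = {"hv1": 0, "hv2": 1, "hv3": 2}
print(plan_drain("hv1", placements, free_slots))
```

A plan like this only works if the VM disks are reachable from every host, which is why the storage requirements above insist on shared storage.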

Storage

• Scalability: Works from gigabytes to exabytes

• Speed: Easy to deploy, IOPS limited by hardware

• Monitoring: Userspace apps, easy to monitor health of hardware. Software monitoring getting better all the time

• Uptime: Userspace apps. Designed for high availability

• Security: Provides isolation layers, not directly accessible to tenants

• Cost: Free*

Physical Networking Hardware

• Scalability: Pizza boxes, just buy more

• Speed: Based on Broadcom Trident platform

• Monitoring: (software)

• Uptime: These guys make the switches for top tier OEMs

• Security: (software)

• Cost: Extremely Affordable (about the cost of a server)

Physical Networking Software

• Scalability: Designed for spine & leaf and fat-tree architectures. Runs Linux natively.

• Speed: Limited only by hardware

• Monitoring: It’s Linux!

• Uptime: Designed to meet our model

• Security: It’s Linux!

• Cost: Extremely Affordable (fraction of hardware)

Logical Networking Software

• Scalability: Scales out with the rest of the cluster

• Speed: Low overhead

• Monitoring: SNMP and SFLOW

• Uptime: No control plane has no single point of failure. We designed around the HV node being the failure point.

• Security: Everyone is on their own network. Shared NOTHING.

• Cost: Worth Every Penny

Who needs spanning tree?

[Diagram: QSFP+ spine switches, SFP+ leaf switches, and 10/100/1000 edge switches, with pods labeled East, West, and North]
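The spine and leaf labels above suggest why spanning tree is not needed: in a spine-and-leaf fabric every leaf has an uplink to every spine, so there are several equal two-hop paths between any pair of leaves, and a spine can be taken out for maintenance without partitioning the network. The sketch below models that idea only. The switch names and the one-leaf-per-pod mapping are hypothetical, and the slides do not say which routing protocol is used.

```python
from itertools import product

# Hypothetical switch names; the slides label the roles, not the devices.
SPINES = ["spine-1", "spine-2"]
LEAVES = ["leaf-east", "leaf-west", "leaf-north"]


def build_fabric(spines, leaves):
    """Full mesh between tiers: every leaf has one uplink to every spine."""
    return set(product(leaves, spines))


def paths(links, src_leaf, dst_leaf, failed_spines=()):
    """Spines usable for a leaf -> spine -> leaf path, skipping failed ones."""
    up = {spine for (leaf, spine) in links if leaf == src_leaf}
    down = {spine for (leaf, spine) in links if leaf == dst_leaf}
    return sorted((up & down) - set(failed_spines))


links = build_fabric(SPINES, LEAVES)
print(paths(links, "leaf-east", "leaf-west"))
print(paths(links, "leaf-east", "leaf-west", failed_spines=["spine-1"]))
```

With both spines up there are two usable paths; with one spine down, traffic still has a path through the other, at reduced capacity.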

Automation

• Scalability: No Problem

• Speed: High speed, low drag

• Monitoring: Easy

• Uptime: If the server goes down for maintenance, we keep running, just without applying changes

• Security: No open ports!

• Cost: Depends

Internet & SAN Access

• Scalability: Scales out with the rest of the cluster

• Speed: Blazing

• Monitoring: SNMP and SFLOW

• Uptime: Using multiple switches, each in its own failure domain, to allow for maintenance/upgrades

• Security: Proven in the harshest environments

• Cost: Best in class

Wait... Did you just say SAN?

“If only you had an Open Source Cloud Infrastructure Orchestration Platform”
- Ron Pedde

HA Solution

• Scalability: Somewhat Limited, but that’s OK

• Speed: Impressive

• Monitoring: Complicated

• Uptime: Trusting the vendors on this one

• Security: The enterprise better not be wrong

• Cost: OUCH

Attention CTOs: Avert your eyes now

HARDCORE HARDWARE

What Customers See

What Power Users See

Questions?

I will be at the booth to answer questions when not in sessions (or leave a card - no SPAM I promise)

http://slideshare.net/edolnx/presentations
