DreamHost: Deploying DreamCompute at Scale

Cumulus and Akanda at DreamHost

Driving Scale, Efficiency, and Cost Reduction

Presenters: Jonathan LaCour (DreamHost), Nolan Leake (Cumulus Networks) & Mark McClain (Akanda)

Introduction

▪ Founded in 1997
▪ Managed, mass-market web hosting
▪ ~400,000 customers
▪ Why Cloud?
  ▪ The rise of AWS
  ▪ The world needs a viable, open alternative
    • Ceph and OpenStack lead the way!

• Public cloud compute service
• Built on OpenStack and Ceph
• Core networking requirements
  • L2 tenant isolation
  • IPv6
  • 10G+ everywhere

Network: Gen 1

▪ Physical: white-box switches running Cumulus Linux
▪ L2 isolation: virtualized with Nicira NVP
▪ L3+
  ▪ Nicira lacks L3
  ▪ Software routing vendors don’t understand cloud
  ▪ Astara is born!

▪ Nicira / VMware adds L3
  ▪ Time for a bake-off!
▪ Astara wins the battle, but gets some enhancements
  ▪ Move from OpenBSD and PF to Linux and iptables
  ▪ Significant optimizations to orchestration platform
▪ Gen 2 allows us to scale to 1,000+ customers, thousands of VMs

Network: Gen 2

DreamCompute Network: Generation 3

▪ VMware NSX problems
  ▪ Scale: maxes out around 1,250 tenants
  ▪ Performance: OVS is slow and unstable
  ▪ Magic: difficult to debug and operate

▪ Gen 3 is built on open
  ▪ Physical: Cumulus Linux
  ▪ L2 isolation: hardware-accelerated VXLAN in switch and hypervisor
  ▪ L3+: Astara

Network: Gen 3

DreamCompute Network: Generation 3

▪ Simple, open architecture
▪ Operational ease
  ▪ Proven technology: VXLAN, iptables, Linux networking stack
  ▪ Astara simplifies Neutron deployment
▪ Performance and scale
  ▪ Hardware-accelerated VXLAN pervasive on switches / NICs
  ▪ VXLAN tunnels scale up massively
  ▪ Astara model of virtual network appliances scales easily

Gen 3 Benefits

● Created to fill in gaps in Neutron
● L3-L7 Service Orchestration for OpenStack
  ○ Dynamic Routing
  ○ IPv6
● Simplified Operations
  ○ Using standard APIs
● Astara Project
  ○ Open Source
  ○ OpenStack Foundation top-level project

Reference Neutron

Diagram: Neutron Server, Message Queue, L2 Agent, L3 Agent, DHCP Agent, Adv Services, Database

Astara + OpenStack Neutron

Diagram: Neutron Server, Message Queue, L2 Agent, Astara, Database

Astara + OpenStack Neutron

Diagram: OpenStack APIs (Nova, Neutron); Astara Network Services: Routing/LB/FW/VPN; Astara OTT Platform (L2 agnostic); Physical Network (L2): open (OVS/LinuxBridge) or proprietary

Physical Network

Traditional L2-centric Design Falls Short

▪ Bottleneck!
  ▪ Core/Agg limits scale
  ▪ Dead Agg switch is a Big Deal
▪ Complex, Proprietary
  ▪ MLAG/vPC/Stacking
  ▪ HSRP/GLBP/NSRP
  ▪ Alphabet soup
▪ Complex Failure Modes
  ▪ Loops
  ▪ MAC flapping
  ▪ Large blast radius
▪ Scalability
  ▪ Limited total network size
  ▪ Limited number of VLANs
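The "limited number of VLANs" point can be made concrete with a little arithmetic. The sketch below compares the 12-bit VLAN ID of IEEE 802.1Q with the 24-bit VXLAN Network Identifier (VNI) of RFC 7348. The function names are illustrative only, and real deployments may hit lower practical limits set by switch hardware or software.

```python
# Illustrative arithmetic only; function names are hypothetical.
VLAN_ID_BITS = 12    # width of the IEEE 802.1Q VLAN ID field
VXLAN_VNI_BITS = 24  # width of the RFC 7348 VXLAN Network Identifier


def usable_vlans() -> int:
    """VLAN IDs 0 and 4095 are reserved, leaving 4094 usable IDs."""
    return 2 ** VLAN_ID_BITS - 2


def vxlan_segments() -> int:
    """Size of the VNI space: 2^24 possible segment identifiers."""
    return 2 ** VXLAN_VNI_BITS


if __name__ == "__main__":
    print("usable VLAN IDs:", usable_vlans())
    print("VXLAN VNI space:", vxlan_segments())
```

With one isolated L2 segment per tenant network, a 4094-ID ceiling is reached quickly in a public cloud, which is one reason an overlay with a larger identifier space is attractive.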

A Better Way

L2 L3

IP Fabric: Clos/Fat-tree

▪ No Bottleneck!
  ▪ Full bandwidth across racks
  ▪ Crucial for network virtualization
▪ Simple, Open
  ▪ IP
  ▪ BGP
▪ Fine-grained failures
  ▪ BGP runs the Internet
▪ Scales up to any size
  ▪ Just add more layers!
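To illustrate "just add more layers", here is a rough capacity sketch for Clos fabrics built from identical switches. It assumes a non-blocking design (half of each leaf's ports face hosts, half face the next tier) and uses the textbook formulas for a two-tier leaf-spine and a three-tier fat-tree. It is not a description of DreamHost's actual topology, and the function names are made up for this example.

```python
# Textbook capacity formulas for non-blocking Clos fabrics built from
# identical k-port switches. Names are hypothetical, for illustration.

def _check_ports(ports: int) -> None:
    if ports < 2 or ports % 2 != 0:
        raise ValueError("ports must be a positive even number")


def leaf_spine_hosts(ports: int) -> int:
    """Two tiers: each spine port feeds one leaf, so there are at most
    `ports` leaves, each with ports/2 host-facing ports."""
    _check_ports(ports)
    leaves = ports
    hosts_per_leaf = ports // 2
    return leaves * hosts_per_leaf  # k^2 / 2


def fat_tree_hosts(ports: int) -> int:
    """Three tiers (k-ary fat-tree): k pods, each with (k/2)^2 hosts."""
    _check_ports(ports)
    pods = ports
    hosts_per_pod = (ports // 2) ** 2
    return pods * hosts_per_pod  # k^3 / 4


if __name__ == "__main__":
    for k in (32, 48):
        print(k, "ports:", leaf_spine_hosts(k), "hosts (2 tiers),",
              fat_tree_hosts(k), "hosts (3 tiers)")
```

Under these assumptions, adding one tier multiplies capacity by k/2, which is why the same commodity switch can serve both small and very large fabrics.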

Open Networking: Bare-Metal Ecosystem

ONIE (Open Network Install Environment)

Automation and Monitoring

▪ Only way to effectively manage large numbers of switches!

▪ Choice of Automation Tools
  ▪ DreamHost was already using Chef
  ▪ But you can use any tool that works on Linux!
▪ Choice of Monitoring Tools
  ▪ DreamHost was already using collectd+Graphite
  ▪ SNMP still there for legacy monitoring systems
  ▪ Other Options
    ▪ Elasticsearch/Logstash/Kibana
    ▪ Sensu
    ▪ Even good old MRTG!
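Because the switches run Linux, ordinary host tooling can push metrics from them. As a minimal sketch, the snippet below formats a sample in Graphite's plaintext protocol (`<path> <value> <timestamp>`, conventionally sent to TCP port 2003). In practice collectd would normally do this for you; the metric path, host name, and helper names here are hypothetical.

```python
import socket
import time
from typing import Iterable, Optional

GRAPHITE_PLAINTEXT_PORT = 2003  # conventional Carbon plaintext listener port


def graphite_line(path: str, value: float,
                  timestamp: Optional[int] = None) -> str:
    """Format one sample as '<path> <value> <timestamp>\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"


def send_metrics(lines: Iterable[str], host: str,
                 port: int = GRAPHITE_PLAINTEXT_PORT) -> None:
    """Send pre-formatted lines to a Graphite/Carbon listener over TCP."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


if __name__ == "__main__":
    # Hypothetical metric path for a port counter on a leaf switch.
    print(graphite_line("switches.leaf01.swp1.rx_bytes", 1024, 1446000000),
          end="")
```

Calling `send_metrics([...], "graphite.example.com")` would deliver the lines, assuming a Carbon listener is reachable at that (hypothetical) address.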

DreamCompute Gen 3 Details

VXLAN: L2 Virtualization over L3 IP Fabrics

▪ UDP tunnels between vswitches
  ▪ Guest L2 traffic is safely encapsulated in L3 packets on the physical network
  ▪ No L2 required in the physical network
▪ What about BUM packets: Broadcast, Unknown Unicast, Multicast?
  ▪ “Official” RFC 7348 answer: Multicast
    ▪ Multicast is complex and scales poorly: disabled on most networks
  ▪ Replicator
    ▪ Cumulus-authored, open source daemon: https://github.com/CumulusNetworks/vxfld
    ▪ Replicates BUM packets to multiple unicast receivers
    ▪ Can run on Linux switches, or Linux servers/hypervisors
    ▪ Hardware-accelerated when run on Cumulus Linux
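The replication idea above can be sketched in a few lines: keep a per-VNI list of member VTEPs and, for each BUM packet, produce one unicast copy per remote member. This is a conceptual model only, not the vxfld implementation or its API; the class and method names are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class Replicator:
    """Conceptual model of unicast BUM replication (hypothetical API)."""

    def __init__(self) -> None:
        # VNI -> set of VTEP IP addresses participating in that segment
        self._members: Dict[int, Set[str]] = defaultdict(set)

    def register(self, vni: int, vtep_ip: str) -> None:
        """Record that a VTEP has at least one endpoint in this VNI."""
        self._members[vni].add(vtep_ip)

    def replicate(self, vni: int, source_vtep: str,
                  packet: bytes) -> List[Tuple[str, bytes]]:
        """Return one (destination VTEP, packet) pair per remote member.

        The sender is skipped so it does not receive its own packet.
        """
        members = self._members.get(vni, set())
        return [(ip, packet) for ip in sorted(members) if ip != source_vtep]


if __name__ == "__main__":
    r = Replicator()
    for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
        r.register(100, ip)
    for dest, _ in r.replicate(100, "10.0.0.1", b"arp-request"):
        print("unicast copy to", dest)
```

The physical network only ever sees unicast UDP, so no multicast routing needs to be enabled on the fabric.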

VXLAN: HW VTEP

▪ VTEP: VXLAN Tunnel End Point (the encapsulation/decapsulation point)
  ▪ Thing that encapsulates virtual network L2 traffic in L3 UDP packets for physical transport
▪ Neutron-managed software VTEPs on hypervisors
  ▪ Encapsulates/decapsulates packets for VMs
▪ Cumulus-managed hardware VTEP to connect to non-virtual networks
  ▪ Encapsulates/decapsulates packets from VMs to routers, appliances, etc.
  ▪ 100% in hardware, line rate
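What a VTEP does to each frame can be shown with the 8-byte VXLAN header defined in RFC 7348: a flags byte with the "VNI valid" bit set, 24 reserved bits, the 24-bit VNI, and 8 more reserved bits. The sketch below builds and parses only that header; the outer Ethernet, IP, and UDP layers (destination port 4789) are left out, and the helper names are illustrative.

```python
import struct
from typing import Tuple

VXLAN_UDP_PORT = 4789        # IANA-assigned destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid
VXLAN_HEADER_LEN = 8


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits)
    # Word 2: VNI (24 bits) + reserved (8 bits)
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prefix a guest Ethernet frame with a VXLAN header (UDP payload)."""
    return vxlan_header(vni) + inner_frame


def decapsulate(udp_payload: bytes) -> Tuple[int, bytes]:
    """Split a VXLAN UDP payload into (VNI, inner Ethernet frame)."""
    if len(udp_payload) < VXLAN_HEADER_LEN:
        raise ValueError("payload shorter than VXLAN header")
    flags_word, vni_word = struct.unpack("!II", udp_payload[:VXLAN_HEADER_LEN])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return vni_word >> 8, udp_payload[VXLAN_HEADER_LEN:]


if __name__ == "__main__":
    payload = encapsulate(5000, b"guest-ethernet-frame")
    print(decapsulate(payload))
```

A hardware VTEP performs the same header insertion and removal in the switch ASIC, which is how it can run at line rate.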

Questions?

Extras

Neutron Reference

Diagram: a row of hypervisors (HV) plus two dedicated Network Nodes

Astara with VMs

Diagram: hypervisors (HV) only, with no dedicated Network Nodes