30
CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution* * All unlicensed or borrowed works retain their original licenses OpenStack Scale-out Networking Architecture Abhishek Chanda, Software Engineer OpenStack Juno Design Summit May 13th, 2014 Scaling to 1,000+ servers without Neutron @rony358

OpenStack Scale-out Networking Architecture

Embed Size (px)

DESCRIPTION

A detailed description of how Cloudscaling's Open Cloud System (OCS) has solved the network scalability problems in OpenStack. We'll cover how and why we designed a Layer-3 (L3) scale-out network, how we plugin and extend OpenStack, and talk about why we did it this way.

Citation preview

Page 1: OpenStack Scale-out Networking Architecture

CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution*!* All unlicensed or borrowed works retain their original licenses

OpenStack Scale-out Networking Architecture

Abhishek Chanda, Software Engineer!OpenStack Juno Design Summit!

May 13th, 2014

Scaling to 1,000+ servers without Neutron!

@rony358

Page 2: OpenStack Scale-out Networking Architecture

Abhishek Chanda

• Engineer at Cloudscaling!

• Started with a focus on software defined networking and distributed systems!• Ended up as jack of all trades!!

• Still bad at configuring network devices!

2

Page 3: OpenStack Scale-out Networking Architecture

Today’s Goals

3

• Our Layer-3 (L3) network architecture!• Motivation!

• Design goals!

• Network topology!

• Software components layout!

• Under the hood!

• Challenges Faced!

• Future Directions (SDN, Neutron etc.)

Page 4: OpenStack Scale-out Networking Architecture

Networking Modes in Vanilla OpenStack

using nova-network

Page 5: OpenStack Scale-out Networking Architecture

OpenStack Networking Options

5

Single-Host"OpenStack Networking

Multi-Host"OpenStack Networking

Flat OpenStack Networking

OCS "Classic

Networking

OCS "VPC

Networking

Scalability"& Control 0 1 2 4 4

Reference Network "Architecture

Layer-2 VLANs w/ STP

Layer-2 VLANs w/ STP

Layer-2 VLANs w/ STP

L3 Spine & Leaf + scale-out NAT

L3 Spine & Leaf Underlay +

network virtualization

Design Pattern

Centralized network stack

Decentralized network stack

Centralized network stack Fully Distributed Fully Distributed

Core Design Flaw

All traffic through a single x86

server

NAT at Hypervisor;

802.1q tagging

All traffic through a single

x86 server

Requires deeper

networking savvy

More bleeding edge (scale not

proven)

IssuesSPOF;

Performance bottleneck;

No SDN

Security & QoS issues;

No SDN

No control plane scalability; No

SDNNo virtual networks

No VLAN support

Page 6: OpenStack Scale-out Networking Architecture

Neutron Routernova-network

Option 1 - OpenStack Single-Host

6

VLAN

VLAN

VLAN

VLAN

VLAN

VLAN

All traffic through centralized bottleneck

and single point of failure

Performance bottleneck; non-standard networking arch.

Page 7: OpenStack Scale-out Networking Architecture

Option 2 - OpenStack Multi-Host

7

Core Network

Hypervisor

nova-network

Public IPs

Private IPs

NAT

Separate NICs cut available bandwidth

by 50%

Direct database access increases risk

NAT at hypervisor means routing

protocols to hypervisor or VLANs + STP across racks for gratuitous ARP

Must route Public IPs across core

Security problem; non-standard networking arch.

Page 8: OpenStack Scale-out Networking Architecture

Option 3 - OpenStack Flat

8

Networking scalability, but no control plane scalability

Uses ethernet adapters configured as bridges to allow network traffic between nodes!!Commonly used in POC and development environments

nova-networknova-scheduler

nova-api

nova-compute

Controller Compute

your local network IP address space

(eth1) (eth1)

10.0.0.0/8

br100(eth0)

br100(eth0)

Page 9: OpenStack Scale-out Networking Architecture

What Path Have Others Chosen?

• HP, CERN, ANL and others!

• CERN uses a custom driver which talks to a DB that maps IP to MAC addresses (amongst other attributes)!• Essentially flat with manually created VLANs

assigned to specific compute nodes!

• ANL added InfiniBand and VxLAN support to nova-network

9

Page 10: OpenStack Scale-out Networking Architecture

The L3 Network Architecture in OCS

Page 11: OpenStack Scale-out Networking Architecture

Design Goals• Mimic Amazon EC2 “classic” networking!• Pure L3 is blazing fast and well understood!

• Network/systems folks can troubleshoot easily!

• No VLANs, no Spanning Tree Protocol (STP), no L2 mess!• No SPOFs, smaller failure domains!

• E.g. single_host & flat mode!

• Distributed “scale-out” DHCP!• No virtual routers!!

• Path is vm->host->ToR->core

11

This enables a horizontally scalable stateless NAT layer that provides floating IPs

Page 12: OpenStack Scale-out Networking Architecture

Layer 3 (L3) Networks Scale

12

Largest cloud operators are L3 with NO VLANs

Cloud-ready apps don’t need or want VLANs (L2)

An L3 networking model is ideal underlay for SDN

overlay

The Internet Scales!

Page 13: OpenStack Scale-out Networking Architecture

Why NOT L2?

13

L3 L2Hierarchical topology Flat topology

Route aggregation No route aggregation / everything everywhere

Fast convergence times Fast convergence times only when tuned properly

Locality matters Locality disappears

Use all available bandwidth (via ECMP) using multiple paths

Uses half of available bandwidth and most traffic takes a long route

Proven scale STP/VLANs work at small to medium scale only

The Internet (& ISPs) are L3 oriented Typical enterprise datacenter is L2 oriented

Best practice for SDN “underlay” SDN “overlay” designed to provide L2 virt. nets

Page 14: OpenStack Scale-out Networking Architecture

Smaller Failure Domains

14

Would you rather have the whole cloud down !or just a small bit of it for a short time?

vs

Page 15: OpenStack Scale-out Networking Architecture

Spine & Leaf

15

Page 16: OpenStack Scale-out Networking Architecture

Simple Network View

16

Page 17: OpenStack Scale-out Networking Architecture

Software Components Schematic Layout

17

Software!Components

Hardware!Components• Nova Network is distributed

& synchronized!• Means we can have

many running at once) !• This drives horizontal

network scalability by adding more network managers

Zone Node

nova-network

Compute Node

ToR

Core Network

nova-!db

ToR

Page 18: OpenStack Scale-out Networking Architecture

Software Components Schematic Layout

18

Software!Components

Hardware!Components• VIF driver on each compute node!

• Bridge creation on each vm (/30)!• Enhanced iptables rules!• Per vm udhcpd process!• Configures routing

Zone Node

nova-!network

Compute Node

ToR

Core Network

nova-!db

VM

DHCP Server

VIF!Driver

ToR

Page 19: OpenStack Scale-out Networking Architecture

Software Components Schematic Layout

19

Software!Components

Hardware!Components

• NAT service on the edge!• Provides on demand

elastic IP service

Zone Node

nova-!network

Compute Node

ToR

Natter

Core Network

Edge Network

nova-!db

VM

DHCP Server

VIF!Driver

ToRProvides utilities to create a number of /30 networks per

host and pin them to host (l3addnet)

Page 20: OpenStack Scale-out Networking Architecture

Under the Hood: Natter

20

Zone Node

nova-db

Natter

Core Network

Edge Network

ConnectionControl Plane Data

1 Polls nova-db for new <floating_ip, private_ip> tuples

2 Use tc to install 1:1 NAT!rules in eth0

(eth1)

(eth0)

Page 21: OpenStack Scale-out Networking Architecture

Under the Hood: Natter

21

Page 22: OpenStack Scale-out Networking Architecture

22

Under the Hood: VIF Driver and Nova-Network

Zone!Node

ToR

nova-network

Compute Node

ToR

VIF!Driver

Core Network

Natter

VM Provisioning"1) VIF: Build linux bridge!2) NM: Get host!3) NM: Get all available networks for host!4) NM: Find first unused network for the host!5) VIF: Create a VIF!6) VIF: Sets up and starts udhcpd on host per VM MAC is calculated based on the IP!7) VIF: Creates a /30 network for the VM, assigns one address to the bridge, one to the VM!8) VIF: Adds routes to enable VM to gateway traffic!9) VIF: Adds iptables rules to enable blocked ! networks and whitelisted hosts!!VM Decommissioning"1) VIF: Stop udhcpd for the bridge the VM is

attached to and remove config!2) NM: Delete all IPs in all VIFs!3) NM: Cleanup linux bridge!4) NM: Cleanup all /30 networks!

Page 23: OpenStack Scale-out Networking Architecture

Under the Hood: l3addnet

23

Used by cloud admins to pin networks to hosts!Wrapper around nova-manage network create !

• Breaks down the input CIDR into /30 blocks!

• Loops through each block and calls the nova-manage API to create a network on that compute host

[email protected]:~# l3addnet Usage: l3addnet cidr host01 dns1 dns2 [email protected]:~# l3addnet 10.50.0.0/24 10.18.1.12 8.8.8.8 8.8.4.4

Page 24: OpenStack Scale-out Networking Architecture

Network Driver Challenges• OpenStack releases are moving targets!• Plugin interfaces change!

• Library dependencies change!

• Database API not rich/efficient enough!• Straight to SQL to get what we needed!

• nova-network supposed to be deprecated?!• First in Folsom or Grizzly? Then Havana??!

• Have to figure out our Neutron strategy

24

Page 25: OpenStack Scale-out Networking Architecture

Why Not Neutron Now?

• Created in Diablo timeframe !

• Neutron still not stable!• API changes and interfaces are actively hostile!

• No multi-host support!

• Complicated, non-intuitive maintenance procedures!

• Not all plugins are equally stable!

• many are outright buggy

25

Page 26: OpenStack Scale-out Networking Architecture

SDN in OCS• OpenContrail only one to meet our rqmts!• OCS ref network arch ideal “underlay”!

• SDN underlays usually are spine and leaf!• L3 routing does not interfere with supporting

encapsulation or tunneling protocols!

• Customers can choose network model!• VPC or L3

26

A large customer who wants to seamlessly support autoscaling for its tenants is a perfect use-case for VPC

Page 27: OpenStack Scale-out Networking Architecture

Example Packet Path - L3*

27

Edge Routers!(“egress”)

Core Routers!(“spine”)

ToR Switches!(“leaf”)

VMs

Router (L3)

Switch (L2)

Ethernet Domain

!IP Traffic

!Encapsulated Traffic (L3oL3)

Internet

Linux Bridge on compute node

* natters not shown for simplicity purposes!

Page 28: OpenStack Scale-out Networking Architecture

Example Packet Path

28

Edge Routers!(“egress”)

Core Routers!(“spine”)

ToR Switches!(“leaf”)

VMs

Router (L3)

Switch (L2)

Ethernet Domain

!IP Traffic

!Encapsulated Traffic (L3oL3)

Internet

Virtual Network Virtual NetworkLayer 3

Physical Network

Network Virtualization

Layer 3 Virtual Networksfor Customers

& Apps

AbstractionvRouter/vSW

Page 29: OpenStack Scale-out Networking Architecture

Future Directions• OCS L3 networking migrates to Neutron!

• As networking plugin (beyond nova-network replacement)!

• OCS VPC w/ more advanced SDN capabilities!• NFV combined with Service Chaining for Carriers!

• Support existing physical network assets with Service Chaining

29

Page 30: OpenStack Scale-out Networking Architecture

Questions?

Abhishek Chanda@rony358