Transcript
Page 1: Symbiotic Routing in Future Data Centers Hussam Abu-Libdeh, Paolo Costa, Antony Rowstron, Greg OShea, Austin Donnelly Cornell University Microsoft Research

Symbiotic Routing in Future Data Centers

Hussam Abu-Libdeh, Paolo Costa, Antony Rowstron, Greg O’Shea, Austin Donnelly

Cornell University, Microsoft Research Cambridge

Page 2:

Data center networking
• Network principles evolved from Internet systems
  • Multiple administrative domains
  • Heterogeneous environment
• But data centers are different
  • Single administrative domain
  • Total control over all operational aspects
• Re-examine the network in this new setting

Page 3:

Rethinking DC networks
• New proposals for data center network architectures
  • DCell, BCube, Fat-tree, VL2, PortLand, …
• Network interface has not changed!

[Figure: design goals (Performance Isolation, Bandwidth, Fault Tolerance, Graceful Degradation, Scalability, TCO, Commodity Components, Modular Design, ...) and a box labelled Network Interface]

Page 4:

Challenge
• The network is a black box to applications
  • Must infer network properties
    • Locality, congestion, failure, etc.
  • Little or no control over routing
• Applications are a black box to the network
  • Must infer flow properties
    • E.g. traffic engineering/Hedera
• In consequence
  • Today’s data centers and proposals use a single protocol
  • Routing trade-offs made in an application-agnostic way
    • E.g. latency, throughput, etc.

Page 5:

CamCube
• A new data center design
  • Nodes are commodity x86 servers with local storage
  • Container-based model: 1,500-2,500 servers
• Direct-connect 3D torus topology
  • Six Ethernet ports per server
  • Servers have (x,y,z) coordinates
    • Defines coordinate space
• Simple 1-hop API
  • Send/receive packets to/from 1-hop neighbours
  • Not using TCP/IP
• Everything is a service
  • Run on all servers
• Multi-hop routing is a service
  • Simple link state protocol
  • Route packets along shortest paths from source to destination

[Figure: 3D torus with x, y, z axes and an example server at coordinate (0,2,0)]
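To make the coordinate space concrete, here is a minimal Python sketch of a k x k x k torus with wrap-around links. The function names `neighbours` and `hop_distance` are illustrative assumptions and are not part of the CamCube API; the sketch only assumes that each server links to the servers one step away in each of the three dimensions.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def neighbours(c: Coord, k: int) -> List[Coord]:
    """Return the six 1-hop neighbours of c in a k x k x k torus (k >= 3)."""
    x, y, z = c
    return [
        ((x + 1) % k, y, z), ((x - 1) % k, y, z),
        (x, (y + 1) % k, z), (x, (y - 1) % k, z),
        (x, y, (z + 1) % k), (x, y, (z - 1) % k),
    ]


def hop_distance(a: Coord, b: Coord, k: int) -> int:
    """Shortest-path hop count between a and b, allowing wrap-around."""
    return sum(min((p - q) % k, (q - p) % k) for p, q in zip(a, b))
```

Under these assumptions, the server at (0,2,0) in a 3x3x3 cube reaches (0,0,0) in one hop through the wrap-around link, and no two servers in a 20x20x20 cube are more than 30 hops apart.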

Page 6:

Development experience
• Built many data center services on CamCube, e.g.:
• High-throughput transport service
  • Desired property: high throughput
• Large-file multicast service
  • Desired property: low link load
• Aggregation service
  • Desired property: distribute computation load over servers
• Distributed object cache service
  • Desired property: per-key caches, low path stretch

Page 7:

Per-service routing protocols
• Higher flexibility
  • Services optimize for different objectives
• High-throughput transport → disjoint paths
  • Increases throughput
• File multicast → non-disjoint paths
  • Decreases network load

Page 8:

What is the benefit?
• Prototype testbed
  • 27 servers, 3x3x3 CamCube
  • Quad core, 4 GB RAM, six 1 Gbps Ethernet ports
• Large-scale packet-level discrete event simulator
  • 8,000 servers, 20x20x20 CamCube
  • 1 Gbps links
• Service code runs unmodified on cluster and simulator

Page 9:

Service-level benefits
• High-throughput transport service
  • 1 sender, 2,000 receivers
    • Sequential iteration
  • 10,000 packets/flow
  • 1,500 bytes/packet
• Metric: throughput
  • Shown: custom/base ratio

[Figure: CDF of flows (0 to 1) against custom/base throughput ratio (0 to 5)]
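As a rough sense of scale for this workload, the following Python sketch works out the size of one flow and the shortest time it could take on 1 Gbps links. It assumes the 1,500 bytes are all payload and ignores every protocol overhead, so the figures are bounds, not measurements; the function names are illustrative.

```python
PACKETS_PER_FLOW = 10_000
BYTES_PER_PACKET = 1_500
LINK_RATE_BPS = 1_000_000_000  # one 1 Gbps link


def flow_bytes() -> int:
    """Total bytes carried by one flow in the experiment."""
    return PACKETS_PER_FLOW * BYTES_PER_PACKET


def min_transfer_seconds(links: int = 1) -> float:
    """Lower bound on transfer time if `links` links are fully used in parallel."""
    return flow_bytes() * 8 / (LINK_RATE_BPS * links)
```

Each flow is 15 MB, which needs at least 0.12 s over a single link. Spreading it over several disjoint paths lowers that bound proportionally, which is why the transport service favours disjoint paths.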

Page 10:

Service-level benefits
• Large-file multicast service
  • 8,000-server network
  • 1 multicast group
  • Group size: 0% to 100% of servers
• Metric: number of links in multicast tree
  • Shown: custom/base ratio

[Figure: links reduction (0 to 0.4) against number of servers in the group (0% to 100%)]

Page 11:

Service-level benefits
• Distributed object cache service
  • 8,000-server network
  • 8,000,000 key-object pairs
    • Evenly distributed among servers
  • 800,000 lookups
    • 100 lookups per server
    • Keys picked by Zipf distribution
  • 1 primary + 8 replicas per key
    • Replicas unpopulated initially
• Metric: path length to nearest hit

[Figure: CDF of lookups (0 to 1) against path length (0 to 30), for Custom Routing and Base Routing]
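The workload on this slide can be approximated with a short Python sketch. The slide does not give the Zipf exponent, so `s = 1.0` is an assumption, and `zipf_weights` and `sample_lookups` are hypothetical helper names. The sketch also shows the per-server arithmetic behind the slide's totals.

```python
import random
from typing import List

SERVERS = 8_000
KEY_OBJECT_PAIRS = 8_000_000
LOOKUPS_PER_SERVER = 100

KEYS_PER_SERVER = KEY_OBJECT_PAIRS // SERVERS   # evenly distributed keys
TOTAL_LOOKUPS = SERVERS * LOOKUPS_PER_SERVER    # matches the 800,000 on the slide


def zipf_weights(n: int, s: float = 1.0) -> List[float]:
    """Normalised Zipf weights: the key of rank r gets weight proportional to 1 / r**s."""
    raw = [1.0 / (rank ** s) for rank in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def sample_lookups(n_keys: int, n_lookups: int, s: float = 1.0, seed: int = 0) -> List[int]:
    """Draw key indices (0 = most popular) according to the Zipf weights."""
    rng = random.Random(seed)
    return rng.choices(range(n_keys), weights=zipf_weights(n_keys, s), k=n_lookups)
```

Because a few keys receive most lookups, their replicas fill quickly, which is the effect the path-length metric is meant to capture. For a quick experiment, use a much smaller key count than 8,000,000, since the weight list is built in memory.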

Page 12:

Network impact
• Ran all services simultaneously
  • No correlation in link usage
  • Reduction in link utilization
• Take-away: custom routing reduced network load and increased service-level performance

[Figure: fraction of links (0 to 0.6) against services per link (0 to 4 services)]
[Figure: change in link utilization as custom/base packet ratio (0 to 1), for Key-value Cache, Multicast, Fixed Path, Aggregation, and High-Throughput Transport]

Page 13:

Symbiotic routing relations
• Multiple routing protocols running concurrently
  • Routing state shared with base routing protocol
• Services
  • Use one or more routing protocols
  • Use base protocol to simplify their custom protocols
• Network failures
  • Handled by base protocol
  • Services route for common case

[Figure: layers labelled Network, Base Routing Protocol, Routing Protocol 1 to 3, and Service A to C]

Page 14:

Building a routing framework
• Simplify building custom routing protocols
• Routing:
  • Build routes from set of intermediate points
    • Coordinates in the coordinate space
  • Services provide forwarding function ‘F’
  • Framework routes between intermediate points
    • Use base routing service
  • Consistently remap coordinate space on node failure
• Queuing:
  • Services manage packet queues per link
  • Fair queuing between services per link
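The slide does not say which fair-queuing discipline is used. As one possible reading, the Python sketch below keeps one queue per service on a link and serves the non-empty queues in round-robin order, one packet at a time. `LinkScheduler` and its methods are hypothetical names, and fairness here is per packet, not per byte.

```python
from collections import OrderedDict, deque
from typing import Deque, Optional, Tuple


class LinkScheduler:
    """Round-robin scheduler over per-service packet queues for a single link."""

    def __init__(self) -> None:
        self._queues: "OrderedDict[str, Deque[bytes]]" = OrderedDict()

    def enqueue(self, service: str, packet: bytes) -> None:
        """Append a packet to the queue owned by `service`."""
        self._queues.setdefault(service, deque()).append(packet)

    def dequeue(self) -> Optional[Tuple[str, bytes]]:
        """Return the next (service, packet) pair, or None if all queues are empty."""
        for _ in range(len(self._queues)):
            service, queue = next(iter(self._queues.items()))
            self._queues.move_to_end(service)  # this service goes to the back of the rotation
            if queue:
                return service, queue.popleft()
        return None
```

With three packets queued by a cache service and one by a multicast service, the link alternates between the two until the multicast queue drains, so a busy service cannot starve a quiet one.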

[Figure: forwarding function F takes a packet and the local coordinate and returns the next coordinate]
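The division of labour between a service and the framework can be sketched in Python as follows. This is a simplified model under stated assumptions: `F` returns a single next coordinate (or `None` once the packet is delivered), and a greedy dimension-by-dimension step stands in for the base link-state routing service. The names `route`, `step_towards`, and `via_waypoint` are illustrative only.

```python
from typing import Any, Callable, List, Optional, Tuple

Coord = Tuple[int, int, int]
Forwarder = Callable[[Coord, Any], Optional[Coord]]


def step_towards(cur: Coord, dst: Coord, k: int) -> Coord:
    """Move one hop along the first differing dimension, taking the shorter way round."""
    nxt = list(cur)
    for i in range(3):
        if cur[i] != dst[i]:
            forward = (dst[i] - cur[i]) % k
            nxt[i] = (cur[i] + (1 if forward <= k - forward else -1)) % k
            break
    return (nxt[0], nxt[1], nxt[2])


def route(src: Coord, packet: Any, F: Forwarder, k: int) -> List[Coord]:
    """Call F at each intermediate point; the framework fills in the hops between them."""
    path, cur = [src], src
    target = F(cur, packet)
    while target is not None and target != cur:
        while cur != target:
            cur = step_towards(cur, target, k)
            path.append(cur)
        target = F(cur, packet)
    return path


def via_waypoint(waypoint: Coord, final: Coord) -> Forwarder:
    """Example service logic: visit `waypoint` first, then go to `final`."""
    def F(local: Coord, packet: Any) -> Optional[Coord]:
        if local == final:
            return None
        return final if local == waypoint else waypoint
    return F
```

The service code only names the intermediate points; every hop between them comes from the framework, which is where failure handling can be added without touching `F`.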

Page 15:

Example: cache service
• Distributed key-object caching
  • Key-space mapped onto CamCube coordinate space
• Per-key caches
  • Evenly distributed across coordinate space
  • Cache coordinates easily computable based on key
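The slides do not show how keys are mapped to coordinates. One way to get coordinates that are evenly spread and computable from the key alone is to hash the key, as in this Python sketch. The use of SHA-256, the salt scheme, and the names `key_to_coord` and `cache_coords` are all assumptions made for illustration.

```python
import hashlib
from typing import List, Tuple

Coord = Tuple[int, int, int]


def key_to_coord(key: str, k: int, salt: int = 0) -> Coord:
    """Hash (salt, key) to a coordinate in a k x k x k space."""
    digest = hashlib.sha256(f"{salt}:{key}".encode("utf-8")).digest()
    x = int.from_bytes(digest[0:4], "big") % k
    y = int.from_bytes(digest[4:8], "big") % k
    z = int.from_bytes(digest[8:12], "big") % k
    return (x, y, z)


def cache_coords(key: str, k: int, replicas: int = 8) -> Tuple[Coord, List[Coord]]:
    """Return (primary, caches); salt 0 picks the primary, salts 1..replicas the caches."""
    primary = key_to_coord(key, k, 0)
    caches = [key_to_coord(key, k, i) for i in range(1, replicas + 1)]
    return primary, caches
```

Any server can recompute these coordinates locally, so no lookup traffic is needed to find where a key lives. Independent hashes can collide, so a real design would need a rule for placing two caches that land on the same coordinate.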

Page 16:

Cache service routing
• Routing
  • Source → nearest cache or primary
  • On cache miss: cache → primary
    • Populate cache: primary → cache
• F function computed at
  • Source
  • Cache
  • Primary
• Different packets can use different links
  • Accommodate network conditions
    • E.g. congestion

[Figure: source/querier, nearest cache, and primary server, with F evaluated at each]

Page 17:

Handling failures
• On link failure
  • Base protocol routes around failure
• On replica server failure
  • Key space consistently remapped by framework
• F function does not change
  • Developer only targets common case
  • Framework handles corner cases

[Figure: source/querier, nearest cache, and primary server]
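The slides do not describe the remapping rule. The Python sketch below shows one rule with the required property that every server computes the same answer: a failed coordinate is reassigned to the closest live coordinate, with ties broken by coordinate order. The function names and the rule itself are assumptions.

```python
from typing import Set, Tuple

Coord = Tuple[int, int, int]


def hop_distance(a: Coord, b: Coord, k: int) -> int:
    """Shortest-path hop count between a and b in a k x k x k torus."""
    return sum(min((p - q) % k, (q - p) % k) for p, q in zip(a, b))


def remap(coord: Coord, failed: Set[Coord], k: int) -> Coord:
    """Return coord if it is live, else the nearest live coordinate (ties by tuple order)."""
    if coord not in failed:
        return coord
    live = [
        (x, y, z)
        for x in range(k) for y in range(k) for z in range(k)
        if (x, y, z) not in failed
    ]
    if not live:
        raise ValueError("no live servers left")
    return min(live, key=lambda c: (hop_distance(coord, c, k), c))
```

Because the result depends only on the coordinate and the failure set, a service's F function can keep returning the original cache coordinate and let the framework substitute the remapped one.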

Page 18:

Cache service F function

protected override List<ulong> F(int neighborIndex, ulong currentDestinationKey, Packet packet)
{
    // extract packet details
    List<ulong> nextKeys = new List<ulong>();
    ulong itemKey = LookupPacket.GetItemKey(packet);
    ulong sourceKey = LookupPacket.GetSourceKey(packet);

    // if at source, route to nearest cache or primary
    if (currentDestinationKey == sourceKey) // am I the source?
    {
        // get the list of caches (using KeyValueStore static method)
        ulong[] cachesKey = ServiceKeyValueStore.GetCaches(itemKey);

        // iterate over all cache nodes and keep the closest ones
        int minDistance = int.MaxValue;
        foreach (ulong cacheKey in cachesKey)
        {
            int distance = node.nodeid.DistanceTo(LongKeyToKeyCoord(cacheKey));
            if (distance < minDistance)
            {
                nextKeys.Clear();
                nextKeys.Add(cacheKey);
                minDistance = distance;
            }
            else if (distance == minDistance)
                nextKeys.Add(cacheKey);
        }
    }
    // if cache miss, route to primary
    else if (currentDestinationKey != itemKey) // am I the cache?
        nextKeys.Add(itemKey);

    return nextKeys;
}

Page 19:

Framework overhead
• Benchmark performance
  • Single server in testbed
  • Communicate with all six 1-hop neighbors (Tx + Rx)
  • Sustained 11.8 Gbps throughput
    • Out of upper bound of 12 Gbps
• User-space routing overhead

[Figure: CPU utilization (%, 0 to 100) for Baseline and Framework]
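The 12 Gbps upper bound presumably follows from six 1 Gbps links each carrying traffic in both directions. A short Python check of that arithmetic, with illustrative variable names:

```python
LINKS_PER_SERVER = 6
LINK_RATE_GBPS = 1.0
DIRECTIONS = 2  # transmit and receive counted separately

upper_bound_gbps = LINKS_PER_SERVER * LINK_RATE_GBPS * DIRECTIONS
sustained_gbps = 11.8  # figure reported on the slide
efficiency = sustained_gbps / upper_bound_gbps

print(f"upper bound: {upper_bound_gbps} Gbps, efficiency: {efficiency:.1%}")
```

On that reading, the framework sustains about 98.3% of the aggregate link capacity of one server.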

Page 20:

What have we done
• Services only specify a routing “skeleton”
  • Framework fills in the details
• Control messages and failures handled by framework
  • Reduce routing complexity for services
• Opt-in basis
  • Services define custom protocols only if they need to

Page 21:

Network requirements
• Per-service routing not limited to CamCube
• Network need only provide:
  • Path diversity
    • Providing routing options
  • Topology awareness
    • Expose server locality and connectivity
  • Programmable components
    • Allow per-service routing logic

Page 22:

Conclusions
• Data center networking from the developer’s perspective
  • Custom routing protocols to optimize for application-level performance requirements
• Presented a framework for custom routing protocols
  • Applications specify a forwarding function (F) and queuing hints
  • Framework manages network state, control messages, and remapping on failure
• Multiple routing protocols running concurrently
  • Increase application-level performance
  • Decrease network load

Page 23:

Thank you
[email protected]

Page 24:

Cache service: insert throughput

[Figure: insert throughput (Gbps, 0 to 4) against concurrent insert requests (0 to 140), for F=3 with disk, F=27 with disk, F=3 without disk, and F=27 without disk; annotations: Disk I/O bounded, Ingress bandwidth bounded (3 front-ends)]

Page 25:

Cache service: lookup requests/second

[Figure: lookup rate (reqs/s, 0 to 140,000) against concurrent lookup requests (0 to 140), for F=3 and F=27; annotation: Ingress bandwidth bounded]

Page 26:

Cache service: CPU utilization on FEs

[Figure: CPU utilization (%, 0 to 100) against concurrent requests (0 to 140), for lookup (F=3), insert (F=3, no disk), insert (F=27, no disk), and lookup (F=27); annotations: 3 front-ends, 27 front-ends]

Page 27:

CamCube link latency

[Figure: round trip time (microsec, 0 to 1000) for 1,500-byte and 9,000-byte packets, comparing UDP (x-cable), CamCube (1 hop), UDP (switch), TCP (x-cable), and TCP (switch)]

