Network Performance Isolation in Data Centres using ConEx Congestion Policing
draft-briscoe-conex-policing-01
draft-briscoe-conex-data-centre-02

Bob Briscoe
Chief Researcher, BT
IRTF DCLC, Jul 2014

Bob Briscoe’s work is part-funded by the European Community under its Seventh Framework Programme through the Trilogy 2 project (ICT-317756)

© British Telecommunications plc



purpose of talk

•  work proposal for the data centre latency control r-g
   –  data centre queuing delay control
   –  designed for global scope (inter-data-centre, ... Inter-net)
   –  this talk: adds first step: intra-data centre
•  without any new protocols
•  started in the IETF congestion exposure (ConEx) w-g
•  generalised for initial deployment without ConEx
   –  and even without ECN end-to-end
   –  now even without ECN on switches (in slides, not draft)


Network Performance Isolation in Data Centres

•  An important problem
   –  isolating between tenants, or departments
   –  virtualisation isolates CPU / memory / storage
   –  but network and I/O system is highly multiplexed & distributed
•  SDN-based (edge) capacity partitioning*
   –  configuration churn: nightmare at scale
   –  poor use of capacity
•  edge-based weighted round robin (or WFQ)
   –  more common
   –  but biases towards heavy hitters (no concept of time)

[figure: two plots of bit-rate against time]

* every problem in computer science can be solved by not thinking, then hiding the resulting mess under a layer of abstraction


Outline Design – First Step
edge bottlenecks by capacity design

•  Edge policing like Diffserv
   –  but congestion policing (per guest)
•  isolation within FIFO queue
•  no config on switches

[figure labels: hosts, switches; VM sender, VM receiver, congestion policer; guest OS, hypervisor, switching; w]


bottleneck congestion policer

    foreach pkt {
        i = classify_user(pkt)
        di += wi * (tnow - ti)    // fill
        ti = tnow
        di -= s * p               // drain
        if (di < 0) { drop(pkt) }
    }

    s: packet size
    p: drop prob of AQM

•  in a well-provisioned link, policer rarely intervenes
•  but whenever needed, it limits queue growth
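The pseudocode above can be tried out with a minimal Python sketch. It is an illustration under stated assumptions, not the draft's implementation: the class and method names are hypothetical, and the cap on the bucket level (`depth`, read from the figure label ci) is an assumption that the slide's pseudocode does not show.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    weight: float        # wi: fill rate, in bytes of congestion-volume per second
    depth: float         # assumed cap on the bucket level (ci in the figure)
    tokens: float        # di: current bucket level
    last_update: float   # ti: arrival time of the previous packet from this user


class CongestionPolicer:
    """Hypothetical per-user congestion token bucket, following the slide's loop."""

    def __init__(self, weights, depth):
        # One bucket per user, starting full.
        self.buckets = {
            user: Bucket(weight=w, depth=depth, tokens=depth, last_update=0.0)
            for user, w in weights.items()
        }

    def admit(self, user, size, p, now):
        """Return True to forward the packet, False to drop it."""
        b = self.buckets[user]
        # fill: tokens accumulate at rate wi, capped at the assumed depth
        b.tokens = min(b.depth, b.tokens + b.weight * (now - b.last_update))
        b.last_update = now
        # drain: packet size weighted by the AQM's current drop probability
        b.tokens -= size * p
        return b.tokens >= 0


policer = CongestionPolicer({"tenant-a": 100.0}, depth=30.0)
print(policer.admit("tenant-a", size=1500, p=0.0, now=0.0))  # no congestion, no drain
```

While the AQM's drop probability p is zero the bucket never drains, which matches the first bullet: on a well-provisioned link the policer rarely intervenes.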

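[figure: network policer. The incoming packet stream passes a meter and per-user congestion token buckets (weights w1, w2, … wi; level di(t); ci), then a FIFO buffer with an AQM that supplies p(t), and leaves as the outgoing packet stream]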


actually each bucket needs to be two buckets to limit bursts of congestion

•  similar code
   –  except 2 token buckets

    ...
    if (di1 < 0 || di2 < 0) { drop(pkt) }

[figure: policer made up of a meter, a main congestion token bucket and a congestion burst limiter, driven by p(t) from the AQM; labels wi, di(t), ci, kwi, C; caption: police if either bucket empties]
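A hedged Python sketch of the two-bucket variant follows. The slide gives only the final test, so the rest is assumption: the figure labels kwi and C are read here as the fill rate and depth of the congestion burst limiter, ci as the depth of the main bucket, and all class, method and parameter names are hypothetical.

```python
class DualBucketPolicer:
    """Hypothetical per-user policer: main bucket plus congestion burst limiter."""

    def __init__(self, w, c, k, burst_depth):
        self.w = w                      # wi: fill rate of the main bucket
        self.c = c                      # ci: assumed depth of the main bucket
        self.k = k                      # assumed: burst limiter fills at k * wi
        self.burst_depth = burst_depth  # C: assumed depth of the burst limiter
        self.d1 = c                     # di1: main bucket level, starts full
        self.d2 = burst_depth           # di2: burst limiter level, starts full
        self.t = 0.0                    # arrival time of the previous packet

    def admit(self, size, p, now):
        """Return True to forward the packet, False to drop it."""
        elapsed = now - self.t
        self.t = now
        # fill both buckets, each capped at its own depth
        self.d1 = min(self.c, self.d1 + self.w * elapsed)
        self.d2 = min(self.burst_depth, self.d2 + self.k * self.w * elapsed)
        # drain both by the packet's contribution to congestion
        self.d1 -= size * p
        self.d2 -= size * p
        # police if either bucket empties
        return not (self.d1 < 0 or self.d2 < 0)


policer = DualBucketPolicer(w=10.0, c=1000.0, k=5.0, burst_depth=30.0)
results = [policer.admit(size=30.0, p=0.5, now=0.0) for _ in range(3)]
print(results)  # the shallow burst limiter empties long before the main bucket
```

Under these assumptions a deep main bucket lets a user save up allowance over long periods, while the shallow second bucket stops that saved allowance from being spent as one sustained burst of congestion.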


performance isolation outcome

•  WRR or WFQ
•  congestion policer
   –  with unequal traffic loads
•  congestion policer
   –  treats equal traffic loads equivalently to WRR

[figure: three plots of rate against time]


Outline Design
edge and core queue control

•  Edge policing like Diffserv
   –  but congestion policing (per-guest)
•  Hose model
•  intra-class isolation in all FIFO queues
•  FIFO ECN marking on L3 switches
•  no other config on switches

[figure labels: hosts, switches; VM sender, VM receiver, TEP / congestion policer, TEP / audit; guest OS, hypervisor, switching; w]


trusted path congestion feedback

•  Initial deployment
   –  all under control of infrastructure admin
•  ECN on guest hosts: optional
   –  ECN enabled across tunnel
•  ConEx on guest hosts: optional
   –  any ConEx-enabled packet doesn’t require tunnel feedback
•  details – see spare slide or draft

[figure, Non-ConEx packets; labels: transport sender, policer, TEP, infrastructure, TEP, transport receiver, feedback between tunnel endpoints]

[figure, ConEx packets; labels: transport sender, policer, infrastructure, audit, transport receiver]


Features of Solution

•  Network performance isolation between tenants
•  No loss of LAN-like multiplexing benefits
   •  work-conserving
•  Zero (tenant-related) switch configuration
•  No change to existing switch implementations
•  Weighted performance differentiation
•  Simplest possible contract
   •  per-tenant network-wide allowance
   •  tenant can freely move VMs around without changing allowance
   •  sender constraint, but with transferable allowance
•  Transport-Agnostic
•  Extensible to wide-area and inter-data-centre interconnect


call for interest

•  implementation in hypervisors
•  evaluation

Network Performance Isolation in Data Centres using congestion policing draft-briscoe-conex-policing-01 draft-briscoe-conex-data-centre-02

Q&A & spare slides


measuring contribution to congestion

= bytes weighted by congestion level
= bytes dropped (or ECN-marked)
= ‘congestion-volume’

as simple to measure as volume

[figure: plots of bit-rate and of congestion against time, annotated with traffic volumes (10GB, 300MB, 100MB), congestion levels (0.01%, 1%) and congestion-volumes (1MB, 1MB, 3MB)]
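The definition can be checked with a few lines of Python. This is a sketch: the function name is hypothetical, decimal units (1MB = 10^6 bytes) are assumed, and the pairing of each volume with a congestion level is an assumption, chosen because it reproduces the 1MB, 1MB and 3MB figures on the slide.

```python
MB = 10**6   # assumed decimal units
GB = 10**9


def congestion_volume(volume_bytes, congestion_fraction):
    """Bytes weighted by the congestion level they were sent into."""
    return volume_bytes * congestion_fraction


# (volume, congestion level): the pairing is assumed, see the note above
examples = [
    (10 * GB, 0.0001),   # 10GB at 0.01% congestion
    (100 * MB, 0.01),    # 100MB at 1% congestion
    (300 * MB, 0.01),    # 300MB at 1% congestion
]

for volume, fraction in examples:
    cv = congestion_volume(volume, fraction)
    print(f"{volume / MB:>6.0f}MB at {fraction:.2%} congestion: "
          f"{cv / MB:.0f}MB congestion-volume")
```

On these numbers a 10GB transfer through a lightly congested network contributes no more to congestion than a 100MB transfer through a heavily congested one.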

unilateral deployment technique for data centre operator

•  for e2e transports that don’t support ECN, the operator can:
   1.  at encap: alter 00 to 10 in outer
   2.  at interior buffers: turn on ECN
       •  defers any drops until egress
       •  audit just before egress can see packets to be dropped
•  exploits:
   •  widespread edge-edge tunnels in multi-tenant DCs to isolate forwarding
   •  a side-effect of standard tunnelling (IP-in-IP or any ECN link encap)
•  for e2e transports that don’t support ConEx, the operator can create its own trusted feedback:
   3.  at decap: only for Not-ConEx packets, feedback aggregate congestion marking counters:
       •  CE outer, Not-ECT inner = loss
       •  CE outer, ECT inner = ECN

[figure: tunnel from ingress to egress across a congested network element, showing the DS and ECN fields of the inner and outer headers at each stage; step 1: 00 → 10; step 2: 10 → 11; then 11 → drop; policer and audit (A); step 3: meter, with IPFIX exporter and collector]
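The three steps can be sketched in Python as operations on the 2-bit ECN field. This is an illustration only: the function names and the counter keys are hypothetical, every packet is assumed to be Not-ConEx, and the inner header is assumed to arrive at the ingress as either Not-ECT (00) or ECT(0) (10).

```python
from collections import Counter

NOT_ECT, ECT0, CE = 0b00, 0b10, 0b11   # ECN field codepoints used on the slide


def encap(inner_ecn):
    """Step 1: at the tunnel ingress, alter 00 to 10 in the outer header."""
    return ECT0 if inner_ecn == NOT_ECT else inner_ecn


def congested_element(outer_ecn, mark):
    """Step 2: an ECN-enabled interior buffer marks (10 -> 11) instead of dropping.

    Assumes the outer header is ECN-capable, which step 1 guarantees.
    """
    return CE if mark else outer_ecn


def decap(inner_ecn, outer_ecn, counters):
    """Step 3: at the egress, count congestion and apply the deferred drop.

    Returns the ECN field to forward, or None if the packet is dropped.
    """
    if outer_ecn == CE:
        if inner_ecn == NOT_ECT:
            counters["loss"] += 1   # CE outer, Not-ECT inner = loss
            return None             # 11 -> drop
        counters["ecn"] += 1        # CE outer, ECT inner = ECN
        return CE
    return inner_ecn


counters = Counter()
for inner in (NOT_ECT, ECT0):
    outer = congested_element(encap(inner), mark=True)
    forwarded = decap(inner, outer, counters)
    print(f"inner={inner:02b} outer={outer:02b} forwarded={forwarded}")
print(dict(counters))
```

In this sketch the counters stand in for the aggregate congestion marking counters that the slide says are fed back from the decapsulator; how they are exported (the figure shows IPFIX) is left out.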