
OpenFlow Switch Limitations

Background: Current Applications

• Traffic engineering (performance)
  – Fine-grained rules, short time scales
  – Coarse-grained rules, long time scales

• Middlebox provisioning (performance + security)
  – Fine-grained rules, long time scales

• Network services
  – Load balancer: fine-grained rules, short time scales
  – Firewall: fine-grained rules, long time scales

• Cloud services
  – Fine-grained rules, long time scales

Background: Switch Design

[Figure: switch architecture — TCAM and hash table in the data plane, switch CPU+memory, and the network controller. Labeled rates: 13 Mb/s (control channel), 35 Mb/s (switch CPU path), 250 Gb/s (data plane).]

OpenFlow Background: Flow Table Entries

• OpenFlow rules match on 14 header fields
  – Usually stored in TCAM, which is much smaller than a conventional forwarding table
  – Generally 1K-10K entries

• Conventional switches
  – 100K-1000K entries
  – Match on only 1-2 fields


OpenFlow Background: Network Events

• Packet_In (flow-table miss: the packet matches no rule)
  – Asynchronous, switch to controller

• Flow_Mod (insert flow-table entries)
  – Asynchronous, controller to switch

• Flow_Timeout (a flow was removed due to timeout)
  – Asynchronous, switch to controller

• Get Flow Statistics (information about current flows)
  – Synchronous between switch and controller
  – The controller sends a request, the switch replies
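Together these events form the reactive control loop used in the walkthrough below. A minimal sketch of a controller built on them, assuming a hypothetical API (`install_rule`, `send_packet`, `compute_route`, and `flow_db` are illustrative names, not a specific controller framework):

```python
# Minimal sketch of the reactive OpenFlow control loop. The controller
# API here (install_rule, send_packet, compute_route, flow_db) is
# hypothetical; real controllers such as POX or Ryu differ in detail.

def handle_packet_in(switch, pkt):
    """Packet_In: the switch had no matching rule and punted to us."""
    out_port = compute_route(pkt.src, pkt.dst)   # controller routing logic
    switch.install_rule(                         # becomes a Flow_Mod message
        match={"src": pkt.src, "dst": pkt.dst},
        out_port=out_port,
        idle_timeout=10,                         # rule expires after 10 s idle
    )
    switch.send_packet(pkt, out_port)            # release the waiting packet

def handle_flow_timeout(switch, rule):
    """Flow_Timeout: the switch evicted an idle rule; drop our copy."""
    flow_db.remove(rule.match)
```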

Background: Switch Design

[Figure: a packet "From: Theo, To: Bruce" arrives at a switch whose flow table is empty.]

1. Check the flow table; there is no match, so inform the CPU
2. The CPU creates a Packet_In event and sends it to the controller
3. The controller runs code to process the event

Background: Switch Design

[Figure: the controller pushes the rule down, and the switch forwards the waiting packet. Flow-table entry: "From: theo, to: bruce, send on port 1; timeout: 10 secs, count: 0 → 1".]

4. The controller creates a flow event and sends a Flow_Mod to the switch
5. The switch CPU processes the Flow_Mod and inserts the rule into the TCAM

Background: Switch Design

[Figure: a second packet "From: Theo, To: Bruce" arrives. Flow-table entry: "From: theo, to: bruce, send on port 1; timeout: 10 secs, count: 1".]

1. Check the flow table
2. Found a matching rule
3. Forward the packet
4. Update the count
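The hit path never leaves the switch. A sketch of this lookup-forward-count fast path, assuming a plain exact-match dictionary in place of the hardware tables (`send_packet_in` and `forward` are illustrative stand-ins for the slow and fast paths):

```python
# Sketch of the switch fast path: match, forward, count. flow_table is a
# dict standing in for the TCAM/hash table; send_packet_in() and forward()
# are illustrative stand-ins for the slow and fast paths.

class Entry:
    def __init__(self, out_port):
        self.out_port = out_port
        self.count = 0                # per-rule packet counter

def process_packet(flow_table, pkt):
    entry = flow_table.get((pkt.src, pkt.dst))
    if entry is None:
        send_packet_in(pkt)           # miss: punt to the controller
        return
    entry.count += 1                  # step 4: update the count
    forward(pkt, entry.out_port)      # steps 2-3: hit stays in the data plane
```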

Background: Switch Design

[Figure: a packet "From: Theo, To: John" arrives; the only installed rule covers theo → bruce.]

1. Check the flow table
2. No matching rule… now we must talk to the controller again

Background: Switch Design

[Figure: the controller has instead installed a wildcard rule: "From: theo, to: ***, send on port 1; timeout: 10 secs, count: 1". The packet from Theo to John arrives again.]

1. Check the flow table
2. Found a matching rule (the wildcard)
3. Forward the packet
4. Update the count

Background: Switch Design

[Figure: a packet "From: Theo, To: Cathy" also matches the wildcard rule "From: theo, to: ***, send on port 1".]

1. Check the flow table
2. Found a matching rule
3. Forward the packet
4. Update the count

Background: Switch Design

[Figure: a single wildcard rule "From: theo, to: ***, send on port 1; timeout: 10 secs, count: 1" covers all of Theo's flows.]

• Problems with wildcards
  – Too general
  – Can't see the details of individual flows
  – Hard to do anything fine-grained

Background: Switch Design

[Figure: two exact-match rules — "From: theo, to: bruce, send on port 1; timeout: 1 sec, count: 1K" (an elephant flow) and "From: theo, to: john, send on port 1; timeout: 10 secs, count: 1".]

• Doing fine-grained things: think Hedera
  – Find all the elephant flows
  – Put the elephant flows on a different path

• How to do this? (see the sketch below)
  – The controller sends a get-stats request
  – The switch responds with all of its stats
  – The controller goes through each entry
  – The controller installs special paths for the elephants

[Figure: after rerouting, the elephant flow's rule reads "From: theo, to: bruce, send on port 3".]
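A minimal sketch of that polling loop, assuming a hypothetical controller API (`get_flow_stats`, `modify_rule`, `pick_uncongested_path`, and the 100 MB threshold are illustrative; Hedera's real scheduler also estimates flow demand before placing flows):

```python
# Sketch of Hedera-style elephant rerouting by polling flow stats.
# get_flow_stats(), modify_rule(), pick_uncongested_path() and the
# threshold are illustrative assumptions, not Hedera's actual API.
import time

ELEPHANT_BYTES = 100 * 1024 * 1024       # assumed elephant threshold

def rebalance(switch):
    for stat in switch.get_flow_stats():        # synchronous request/reply
        if stat.byte_count > ELEPHANT_BYTES:    # found an elephant flow
            port = pick_uncongested_path(stat.match)
            switch.modify_rule(stat.match, out_port=port)  # "special path"

def poll_loop(switches):
    while True:                  # this periodic full poll is exactly the
        for sw in switches:      # overhead problem on the next slide
            rebalance(sw)
        time.sleep(5)
```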

Problems with Switches

• The TCAM is very small, so it can only hold a small number of rules
  – Only ~1K rules per switch, while end hosts generate many more flows

• Having the controller install an entry for each flow increases latency
  – It takes about 10 ms to install a new rule, so the flow must wait!
  – Rules install at about 13 Mb/s, but traffic arrives at 250 Gb/s

• Having the controller collect stats for all flows takes a lot of resources
  – For about 1K rules, each reply is on the order of a megabyte
  – If you request stats every 5 seconds, the totals add up (see the estimate below)
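The slide leaves the totals unstated. A back-of-the-envelope estimate, where the per-entry size and the switch count are assumptions chosen to match the slide's "about a MB" for 1K rules:

```python
# Back-of-the-envelope cost of polling flow stats. All inputs are
# assumptions for illustration; the slide leaves the totals unstated.

BYTES_PER_ENTRY = 1_000     # ~1 KB/entry, matching the slide's "about a MB"
                            # for 1K rules (a bare stats record is smaller)
ENTRIES_PER_SWITCH = 1_000  # "about 1K" rules per switch (from the slide)
POLL_PERIOD_S = 5           # "request every 5 seconds" (from the slide)
NUM_SWITCHES = 100          # assumed network size

reply_bytes = BYTES_PER_ENTRY * ENTRIES_PER_SWITCH
rate_bps = reply_bytes * 8 / POLL_PERIOD_S * NUM_SWITCHES

print(f"{reply_bytes / 1e6:.1f} MB per poll per switch, "
      f"{rate_bps / 1e6:.0f} Mb/s aggregate at the controller")
# -> 1.0 MB per poll per switch, 160 Mb/s aggregate at the controller
```

Under these assumptions the stats traffic alone dwarfs the ~13 Mb/s control channel from the architecture figure, before any Packet_In or Flow_Mod traffic.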



Getting Around the TCAM Limitation

• Cloud-centric solutions
  – Use placement tricks

• Data-center-centric solutions
  – Use an overlay, plus placement tricks

• General technique: DiFane
  – Use detour routing

DiFane

• Creates a hierarchy of switches
  – Authority switches
    • Lots of memory
    • Collectively store all the rules
  – Local switches
    • Small amount of memory
    • Store a few rules
    • For unknown rules, route traffic to an authority switch

Packet Redirection and Rule Caching

[Figure: the first packet of a flow (e.g. theo → cathy) misses at the ingress switch and is redirected to the authority switch, which forwards it on toward the egress switch and sends feedback to the ingress switch to cache the matching rule; following packets (e.g. theo → bruce, and everything else) hit the cached rules at the ingress switch and are forwarded directly.]

Three Sets of Rules in TCAM

Type             Priority  Field 1  Field 2  Action                          Timeout
Cache rules      210       00**     111*     Forward to Switch B             10 sec
                 209       1110     11**     Drop                            10 sec
                 …         …        …        …                               …
Authority rules  110       00**     001*     Forward, trigger cache manager  Infinity
                 109       0001     0***     Drop, trigger cache manager     Infinity
                 …         …        …        …                               …
Partition rules  15        0***     000*     Redirect to authority switch    …
                 14        …        …        …                               …

• Cache rules: in ingress switches, reactively installed by authority switches
• Authority rules: in authority switches, proactively installed by the controller
• Partition rules: in every switch, proactively installed by the controller
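The three sets coexist in one TCAM and are disambiguated purely by priority: cache rules win when present, authority rules next, partition rules as the fallback. A sketch of that highest-priority match over a simplified one-field rule (the real lookup happens in TCAM hardware across both fields):

```python
# Sketch of DiFane's single-TCAM lookup: cache > authority > partition,
# enforced purely by rule priority. Rules are simplified to one field.
from dataclasses import dataclass

@dataclass
class Rule:
    priority: int
    pattern: str      # e.g. "00**" -- '*' is a wildcard bit
    action: str

def matches(pattern: str, bits: str) -> bool:
    return all(p in ("*", b) for p, b in zip(pattern, bits))

def lookup(rules, bits):
    """Highest-priority matching rule wins, as in a TCAM."""
    hits = [r for r in rules if matches(r.pattern, bits)]
    return max(hits, key=lambda r: r.priority, default=None)

rules = [
    Rule(210, "00**", "forward_to_B"),                    # cache rule
    Rule(110, "00**", "forward+trigger_cache_manager"),   # authority rule
    Rule(15,  "0***", "redirect_to_authority"),           # partition fallback
]
print(lookup(rules, "0011").action)   # -> forward_to_B (cache rule wins)
print(lookup(rules, "0100").action)   # -> redirect_to_authority
```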

Stage 1

The controller proactively generates the rules and distributes them to the authority switches.

Partition and Distribute the Flow Rules

[Figure: the controller accepts or rejects rules for the flow space, partitions it among Authority Switches A, B, and C, and distributes the partition information to every switch, including the ingress and egress switches.]

Stage 2

The authority switches keep packets always in the data plane and reactively cache rules.

Packet Redirection and Rule Caching

[Figure: the same redirection flow as before — the first packet is redirected through the authority switch, feedback caches the rule, and following packets hit the cached rule at the ingress switch.]

A slightly longer path in the data plane is faster than going through the control plane.

Bin-Packing/Overlay

Virtual Switch

• A virtual switch has more memory than a hardware switch
  – So you can install a lot more rules in virtual switches

• Create an overlay between the virtual switches
  – Install the fine-grained rules in the virtual switches
  – Install normal OSPF rules in the hardware
  – You can implement everything in the virtual switch
  – But this has the usual overlay drawbacks

Bin-Packing in Data Centers

• Insight: traffic flows between certain sets of servers
  – If those servers are placed together, their rules need to be inserted in only one switch

Getting Around CPU Limitations

• Keep the controller out of the flow-creation loop
  – Create clone rules

• Keep the controller out of decision loops
  – Create forwarding groups

Clone Rules

• Insert a special wildcard rule
• When a packet arrives, the switch makes a micro-flow rule itself
  – The micro-flow rule inherits all the properties of the wildcard rule

[Figure: a packet "From: Theo, To: Bruce" arrives at a switch holding the clone-flagged wildcard rule "From: theo, to: ***, send on port 1; timeout: 10 secs, count: 1".]

[Figure: without contacting the controller, the switch clones the wildcard into the micro-flow rule "From: theo, to: Bruce, send on port 1; timeout: 10 secs, count: 1" — see the sketch below.]
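A sketch of what the switch does on a hit against a clone-flagged wildcard. The rule structure and `clone` flag are illustrative; this mirrors the mechanism rather than any specific switch's implementation:

```python
# Sketch of switch-local rule cloning: the switch itself expands a
# clone-flagged wildcard rule into an exact micro-flow rule, with no
# Packet_In and no controller round trip. Structures are illustrative.
import copy

def on_packet(flow_table, pkt):
    rule = flow_table.lookup(pkt)                    # TCAM-style wildcard match
    if rule.clone and not rule.is_microflow:
        micro = copy.copy(rule)                      # inherit action, timeout, ...
        micro.match = {"src": pkt.src, "dst": pkt.dst}   # exact: theo -> Bruce
        micro.is_microflow = True
        micro.priority = rule.priority + 1           # micro-rule shadows its parent
        flow_table.insert(micro)                     # per-flow counters from now on
    forward(pkt, rule.out_port)
```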

Forwarding Groups

• What happens when there's a failure, e.g. port 1 goes down?
  – The switch must inform the controller and wait for new rules

• Instead, have backup ports
  – Each rule also states its backup port (see the sketch below)

[Figure: the rules before backups — "From: theo, to: ***, send on port 1" and the micro-flow rule "From: theo, to: Bruce, send on port 1".]

[Figure: the same rules with backups — "From: theo, to: ***, send on port 1, backup: 2" and "From: theo, to: Bruce, send on port 1, backup: 2".]
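A sketch of the local failover decision (the `link_up` table and rule fields are illustrative):

```python
# Sketch of local failover via backup ports: the output decision consults
# the switch's own port-status table, so failover costs a port-state
# check instead of a controller round trip. Structures are illustrative.

def output_port(rule, link_up):
    """Use the rule's primary port if its link is up, else its backup."""
    if link_up[rule.port]:
        return rule.port
    return rule.backup        # e.g. port 1 down -> fall back to port 2
```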

Forwarding Groups

• How do I do load balancing?
  – Something like ECMP? Or server load balancing?

• Currently:
  – The controller installs the rules for each flow, and load-balances as it installs them
  – Or the controller gets stats and rebalances later

[Figure: the per-flow rules as before, all pinned to port 1.]

Forwarding Groups

• Instead, have port groups
  – Each rule specifies a group of ports to send on

• When a micro-rule is created, the switch assigns it a port
  – In a round-robin manner
  – Or based on probability (see the sketch below)

[Figure: the wildcard rule now reads "From: theo, to: ***, send on ports 1, 2, 4"; its micro-flow rule was assigned port 1.]

[Figure: with probabilities, the wildcard rule reads "From: theo, to: ***, send on port 1 (10%) or 2 (90%)"; the new micro-flow rule was assigned port 2.]
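A sketch of the two port-selection policies at micro-rule creation time, using the port numbers from the slides:

```python
# Sketch of port-group selection when a micro-rule is created:
# round robin, or weighted random. Pure illustration of the local choice.
import itertools
import random

ports = [1, 2, 4]                    # the rule's port group
rr = itertools.cycle(ports)          # round-robin assignment state

def pick_round_robin():
    return next(rr)

def pick_weighted():
    # e.g. "port 1 (10%), port 2 (90%)" from the slide
    return random.choices([1, 2], weights=[0.10, 0.90])[0]
```

Each micro-flow rule gets its port once, at creation time; every later packet of that flow uses the same port, so load is spread across flows without reordering packets within a flow.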

Getting Around CPU Limitations

• Prevent the controller from polling switches
  – Introduce triggers (see the sketch below)
    • Each rule carries a trigger and sends its stats to the controller when the threshold is reached
    • E.g. report if more than 20 packets match the flow
  – Benefits of triggers:
    • Reduces the number of entries being returned
    • Limits the amount of network traffic
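A sketch of the per-rule trigger check on the switch's hit path (the rule fields and `send_stats_report` are illustrative; `trigger=20` matches the 20-packet example above):

```python
# Sketch of a per-rule trigger: the switch pushes stats to the controller
# when a counter crosses the rule's threshold, replacing periodic polling.
# Rule fields and send_stats_report() are illustrative.

def on_rule_hit(rule, controller):
    rule.count += 1
    if rule.trigger is not None and rule.count == rule.trigger:
        # e.g. trigger = 20: one report after 20 matching packets, instead
        # of the controller polling every rule on every switch every 5 s
        send_stats_report(controller, rule.match, rule.count)
```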

Summary

• Switches have several limitations
  – TCAM space
  – Switch CPU

• Interesting ways to reduce these limitations
  – Place more responsibility in the switch
    • Introduce triggers
    • Have the switch create micro-flow rules from general rules