Upload
others
View
12
Download
0
Embed Size (px)
Citation preview
Dynamic Scheduling of Network Updates
Xin Jin Hongqiang Harry Liu, Rohan Gandhi, Srikanth Kandula, Ratul Mahajan,
Ming Zhang, Jennifer Rexford, Roger WaHenhofer
SDN: Paradigm ShiK in Networking
• Many benefits – Traffic engineering [B4, SWAN]
– Flow scheduling [Hedera, DevoFlow] – Access control [Ethane, vCRIB] – Device power management [ElasZcTree]
1
Controller • Direct, centralized updates of forwarding rules in switches
Network Update is Challenging
• Requirement 1: fast – The agility of control loop
• Requirement 2: consistent – No congesZon, no blackhole, no loop, etc.
2 Network
Controller
What is Consistent Network Update
3
Current State
A
D
C B
E F1: 5 F4: 5
F2: 5 F3: 10
Target State
A
D
C B
E
F1: 5
F4: 5
F2: 5 F3: 10
What is Consistent Network Update
• Asynchronous updates can cause congesZon • Need to carefully order update operaZons
4
Current State
A
D
C B
E F1: 5 F4: 5
F2: 5 F3: 10
Target State
A
D
C B
E
F1: 5
F4: 5
F2: 5 F3: 10
A
D
C B
E F4: 5 F1: 5
F3: 10 F2: 5
Transient State
ExisZng SoluZons are Slow
• ExisZng soluZons are staZc [ConsistentUpdate’12, SWAN’13, zUpdate’13]
– Pre-‐compute an order for update operaZons
5
F3
F4
F2
F1
StaZc Plan A
Target State Current State
A
D
C B
E F4: 5
F2: 5
F1: 5
F3: 10 A
D
C B
E
F1: 5
F4: 5
F2: 5 F3: 10
ExisZng SoluZons are Slow
• ExisZng soluZons are staZc [ConsistentUpdate’12, SWAN’13, zUpdate’13]
– Pre-‐compute an order for update operaZons
• Downside: Do not adapt to runZme condiZons – Slow in face of highly variable operaZon compleZon Zme
6
F3
F4
F2
F1
StaZc Plan A
OperaZon CompleZon Times are Highly Variable
7
>10x
• Measurement on commodity switches
OperaZon CompleZon Times are Highly Variable
8
• Measurement on commodity switches • ContribuZng factors
– Control-‐plane load – Number of rules – Priority of rules – Type of operaZons (insert vs. modify)
StaZc Schedules can be Slow
9
F3
F4
F2 F1 StaZc Plan B
F3
F4
F2
F1
StaZc Plan A
F1 F2 F3 F4
Time 1 2 3 4 5
F1 F2 F3 F4
Time 1 2 3 4 5
F1 F2 F3 F4
Time 1 2 3 4 5
F1 F2 F3 F4
Time 1 2 3 4 5
Target State Current State
A
D
C B
E F4: 5
F2: 5
F1: 5
F3: 10 A
D
C B
E
F1: 5
F4: 5
F2: 5 F3: 10
No staZc schedule is a clear winner under all condiZons!
Dynamic Schedules are AdapZve and Fast
10
F3
F4
F2 F1 StaZc Plan B
F3
F4
F2
F1
StaZc Plan A
Target State Current State
A
D
C B
E F4: 5
F2: 5
F1: 5
F3: 10 A
D
C B
E
F1: 5
F4: 5
F2: 5 F3: 10
No staZc schedule is a clear winner under all condiZons!
F3
F4
F2 F1
Dynamic Plan
Adapts to actual condiZons!
Challenges of Dynamic Update Scheduling
• ExponenZal number of orderings • Cannot completely avoid planning
11
Current State
A
D
C B
E F3: 5
F2: 5
F1: 5 F4: 5
F5: 10
Target State
A
D
C B
E
F1: 5
F3: 5 F4: 5
F5: 10
F2: 5
Challenges of Dynamic Update Scheduling
• ExponenZal number of orderings • Cannot completely avoid planning
12
Current State
A
D
C B
E F3: 5 F1: 5 F4: 5
F5: 10
Target State
A
D
C B
E
F1: 5
F3: 5 F4: 5
F5: 10
F2: 5 F2: 5
F2
Deadlock
Challenges of Dynamic Update Scheduling
• ExponenZal number of orderings • Cannot completely avoid planning
13
Current State
A
D
C B
E F3: 5
F2: 5
F4: 5
F5: 10
Target State
A
D
C B
E
F1: 5
F3: 5 F4: 5
F5: 10
F2: 5
F1
F1: 5
Challenges of Dynamic Update Scheduling
• ExponenZal number of orderings • Cannot completely avoid planning
14
Current State
A
D
C B
E
F2: 5
F4: 5
F5: 10
Target State
A
D
C B
E
F1: 5
F3: 5 F4: 5
F5: 10
F2: 5
F1 F3
F1: 5
F3: 5
Challenges of Dynamic Update Scheduling
• ExponenZal number of orderings • Cannot completely avoid planning
15
Current State
A
D
C B
E F4: 5
F5: 10
Target State
A
D
C B
E
F1: 5
F3: 5 F4: 5
F5: 10
F2: 5
F1 F3 F2
Consistent update plan
F2: 5
F1: 5
F3: 5
Dionysus Pipeline
16
Dependency Graph Generator
Update Scheduler
Network
Current State
Target State
Consistency Property
Encode valid orderings
Determine a fast order
Dependency Graph GeneraZon
17
Dependency Graph Elements
OperaZon node
Resource node (link capacity, switch table size) Node
Edge Dependencies between nodes
Path node
Current State
A
D
C B
E F3: 5
F2: 5
F1: 5 F4: 5
F5: 10
Target State
A
D
C B
E
F1: 5
F3: 5 F4: 5
F5: 10
F2: 5
Dependency Graph GeneraZon
18
Current State
A
D
C B
E F3: 5
F2: 5
F1: 5 F4: 5
F5: 10
Target State
A
D
C B
E
F1: 5
F3: 5 F4: 5
F5: 10
F2: 5
Move F3
Move F1
Move F2
Dependency Graph GeneraZon
19
Current State
A
D
C B
E F3: 5
F2: 5
F1: 5 F4: 5
F5: 10
Target State
A
D
C B
E
F1: 5
F3: 5 F4: 5
F5: 10
F2: 5
A-‐E: 5
Move F3
Move F1
Move F2
Dependency Graph GeneraZon
20
Current State
A
D
C B
E F1: 5 F4: 5
F5: 10
A-‐E: 5
5
Move F3
Move F1
Move F2
Target State
A
D
C B
E
F1: 5
F3: 5 F4: 5
F5: 10
F2: 5
F3: 5
F2: 5
Dependency Graph GeneraZon
21
A-‐E: 5 5
5
Move F3
Move F1
Move F2
Target State
A
D
C B
E
F1: 5
F3: 5 F4: 5
F5: 10
F2: 5
Current State
A
D
C B
E F3: 5
F2: 5
F1: 5 F4: 5
F5: 10
Dependency Graph GeneraZon
22
A-‐E: 5 5
5 5
Move F3
Move F1
Move F2
Target State
A
D
C B
E
F1: 5
F3: 5 F4: 5
F5: 10
F2: 5
Current State
A
D
C B
E F3: 5
F2: 5
F1: 5 F4: 5
F5: 10
Dependency Graph GeneraZon
23
A-‐E: 5 5
D-‐E: 0
5
5 5
5
Move F3
Move F1
Move F2
Target State
A
D
C B
E
F1: 5
F3: 5 F4: 5
F5: 10
F2: 5
Current State
A
D
C B
E F3: 5
F2: 5
F1: 5 F4: 5
F5: 10
Dependency Graph GeneraZon • Supported scenarios
– Tunnel-‐based forwarding: WANs – WCMP forwarding: data center networks
• Supported consistency properZes – Loop freedom – Blackhole freedom – Packet coherence – CongesZon freedom
• Check paper for details
24
Dionysus Pipeline
25
Dependency Graph Generator
Update Scheduler
Network
Current State
Target State
Consistency Property
Encode valid orderings
Determine a fast order
Dionysus Scheduling • Scheduling as a resource allocaZon problem
26
A-‐E: 5 5
D-‐E: 0
5
5 5
5
Move F3
Move F1
Move F2
Dionysus Scheduling • Scheduling as a resource allocaZon problem
27
A-‐E: 0
D-‐E: 0
5
5 5
5
Move F3
Move F1
Move F2
Deadlock!
Dionysus Scheduling • Scheduling as a resource allocaZon problem
28
A-‐E: 0 5
D-‐E: 0 5 5
5
Move F3
Move F1
Move F2
Dionysus Scheduling • Scheduling as a resource allocaZon problem
29
A-‐E: 0 5
D-‐E: 5 5
5
Move F3
Move F2
Dionysus Scheduling • Scheduling as a resource allocaZon problem
30
A-‐E: 0 5
5
Move F3
Move F2
Dionysus Scheduling • Scheduling as a resource allocaZon problem
31
A-‐E: 5 5 Move
F2
Dionysus Scheduling • Scheduling as a resource allocaZon problem
32
Move F2
Done!
Dionysus Scheduling • Scheduling as a resource allocaZon problem
• NP-‐complete problems under link capacity and switch table size constraints
• Approach – DAG: always feasible, criZcal-‐path scheduling – General case: covert to a virtual DAG – Rate limit flows to resolve deadlocks
33
CriZcal-‐Path Scheduling
• Calculate criZcal-‐path length (CPL) for each node
– Extension: assign larger weight to operaZon nodes if we know in advance the switch is slow
• Resource allocated to operaZon nodes with larger CPLs
34
A-‐B:5
CPL=3
C-‐D:0
5
CPL=1
5
5 5
5
CPL=1
CPL=2
CPL=2
CPL=1
Move F4
Move F3
Move F2
Move F1
CriZcal-‐Path Scheduling
• Calculate criZcal-‐path length (CPL) for each node
– Extension: assign larger weight to operaZon nodes if we know in advance the switch is slow
• Resource allocated to operaZon nodes with larger CPLs
35
A-‐B:0
CPL=3
C-‐D:0
5
CPL=1
5
5
5
CPL=1
CPL=2
CPL=2
CPL=1
Move F4
Move F3
Move F2
Move F1
Handling Cycles
• Convert to virtual DAG – Consider each strongly connected
component (SCC) as a virtual node
36
CPL=1 CPL=3
weight = 1 weight = 2
• CriZcal-‐path scheduling on virtual DAG – Weight wi of SCC: number of operaZon nodes
A-‐E: 5 5
D-‐E: 0
5
5 5
5
Move F3
Move F1
Move F2
Move F2
Handling Cycles
• Convert to virtual DAG – Consider each strongly connected
component (SCC) as a virtual node
37
CPL=1 CPL=3
weight = 1 weight = 2
• CriZcal-‐path scheduling on virtual DAG – Weight wi of SCC: number of operaZon nodes
A-‐E: 0 5
D-‐E: 0 5 5
5
Move F3
Move F1
Move F2
Move F2
EvaluaZon: Traffic Engineering
38
0
0.5
1
1.5
2
2.5
3
3.5
WAN TE Data Center TE
Upd
ate Time (secon
d)
50th PercenZle Update Time
OneShot
Dionysus
SWAN
Not congesZon-‐free
CongesZon-‐free
CongesZon-‐free
EvaluaZon: Traffic Engineering
39
Improve 50th percenZle update speed by 80% compared to staZc scheduling (SWAN), close to OneShot
0
0.5
1
1.5
2
2.5
3
3.5
WAN TE Data Center TE
Upd
ate Time (secon
d)
50th PercenZle Update Time
OneShot
Dionysus
SWAN
Not congesZon-‐free
CongesZon-‐free
CongesZon-‐free
EvaluaZon: Failure Recovery
40
0
1
2
3
4
5
OneShot Dionysus SWAN
Link Oversub
scrip
Zon (GB)
99th PercenZle Link OversubscripZon
0
0.5
1
1.5
2
2.5
3
OneShot Dionysus SWAN
Upd
ate Time (secon
d)
99th PercenZle Update Time
EvaluaZon: Failure Recovery
41
Reduce 99th percenZle link oversubscripZon by 40% compared to staZc scheduling (SWAN)
Improve 99th percenZle update speed by 80% compared to staZc scheduling (SWAN)
0
1
2
3
4
5
OneShot Dionysus SWAN
Link Oversub
scrip
Zon (GB)
99th PercenZle Link OversubscripZon
0
0.5
1
1.5
2
2.5
3
OneShot Dionysus SWAN
Upd
ate Time (secon
d)
99th PercenZle Update Time
Conclusion
• Dionysus provides fast, consistent network updates through dynamic scheduling – Dependency graph: compactly encode orderings – Scheduling: dynamically schedule operaZons
42
Dionysus enables more agile SDN control loops
Thanks!
43