SWIFT: Predictive Fast Reroute
Thomas HolterbachETH Zürich / CAIDA
Stefano VissicchioAlberto DainottiLaurent Vanbever
UCLondonCAIDA, UC San DiegoETH Zürich
Joint work with
SIGCOMM24th August 2017
swift.ethz.ch
1
R2R1Focus
R3
backupnext-hop
primarynext-hop
Phase #1
Learningabout the failure
Updatingforwarding entries
Phase #2
BGP convergenceis a two-phase process
6
pos. prefix NH
1 1.0.0.0/24 22 1.0.1.0/24 23 1.0.2.0/24
600K 200.0.0.0/24
… … …
300K 2
… … …
100.0.0.0/24
R1’s forwarding table
Phase #1
Learningabout the failure
Updatingforwarding entries
Phase #2
2
2
R2R1Focus
R3
backupnext-hop
primarynext-hop
BGP convergenceis a two-phase process
7
pos. prefix NH
1 1.0.0.0/24 32 1.0.1.0/24 23 1.0.2.0/24
600K 200.0.0.0/24
… … …
300K 2
… … …
100.0.0.0/24
R1’s forwarding table
Phase #1
Learningabout the failure
Updatingforwarding entries
Phase #2
2
2
R2R1Focus
R3
backupnext-hop
primarynext-hop
BGP convergenceis a two-phase process
8
pos. prefix NH
1 1.0.0.0/24 32 1.0.1.0/24 33 1.0.2.0/24
600K 200.0.0.0/24
… … …
300K 2
… … …
100.0.0.0/24
R1’s forwarding table
Phase #1
Learningabout the failure
Updatingforwarding entries
Phase #2
2
2
R2R1Focus
R3
backupnext-hop
primarynext-hop
BGP convergenceis a two-phase process
9
pos. prefix NH
1 1.0.0.0/24 32 1.0.1.0/24 33 1.0.2.0/24
600K 200.0.0.0/24
… … …
300K 2
… … …
100.0.0.0/24
R1’s forwarding table
Phase #1
Learningabout the failure
Updatingforwarding entries
Phase #2
3
2
R2R1Focus
R3
backupnext-hop
primarynext-hop
BGP convergenceis a two-phase process
10
pos. prefix NH
1 1.0.0.0/24 32 1.0.1.0/24 33 1.0.2.0/24
600K 200.0.0.0/24
… … …
300K 3
… … …
100.0.0.0/24
R1’s forwarding table
Phase #1
Learningabout the failure
Updatingforwarding entries
Phase #2
3
2
R2R1Focus
R3
backupnext-hop
primarynext-hop
BGP convergenceis a two-phase process
11
pos. prefix NH
1 1.0.0.0/24 32 1.0.1.0/24 33 1.0.2.0/24
600K 200.0.0.0/24
… … …
300K 3
… … …
100.0.0.0/24
R1’s forwarding table
Phase #1
Learningabout the failure
Updatingforwarding entries
Phase #2
3
3R2R1
Focus
R3
backupnext-hop
primarynext-hop
BGP convergenceis a two-phase process
12
Phase #1
Learningabout the failure
Updatingforwarding entries
Phase #2
BGP timersBFD, …
PICMPLS, …
In practice, multiple techniques are often usedto speed up the convergence time
14
remotefailure
R2R1Focus
R3
backupnext-hop
primarynext-hop
Current fast reroute techniquesdo not work upon remote outages
R4
BGPwithdrawals
16
Remote outages are numerous
We analyzed BGP sessions from RouteViews and RIPE RIS collectors, and looked for bursts of BGP withdrawals
18
10 seconds sliding window
Remote outages are numerous
Time
# withdrawals
We analyzed BGP sessions from RouteViews and RIPE RIS collectors, and looked for bursts of BGP withdrawals
19
10 seconds sliding window
Remote outages are numerous
Time
# withdrawals
We analyzed BGP sessions from RouteViews and RIPE RIS collectors, and looked for bursts of BGP withdrawals
20
10 seconds sliding window
burst
Remote outages are numerous
Time
# withdrawals
We analyzed BGP sessions from RouteViews and RIPE RIS collectors, and looked for bursts of BGP withdrawals
21
We found ~3,000 bursts of withdrawals in November 2016 and on 213 BGP sessions
Remote outages are numerous
We analyzed BGP sessions from RouteViews and RIPE RIS collectors, and looked for bursts of BGP withdrawals
22
20 40 60 80
Burst durDtion (s)
0.0
0.2
0.4
0.6
0.8
1.0C
D)
Bursts lower thDn 10k
Bursts greDter thDn 10k
CDF
Burst duration (s)
Remote outages propagate slowly
23
20 40 60 80
Burst durDtion (s)
0.0
0.2
0.4
0.6
0.8
1.0C
D)
Bursts lower thDn 10k
Bursts greDter thDn 10k
CDF
Burst duration (s)
Remote outages propagate slowly
24
20 40 60 80
Burst durDtion (s)
0.0
0.2
0.4
0.6
0.8
1.0C
D)
Bursts lower thDn 10k
Bursts greDter thDn 10k
CDF
Burst duration (s)
37% last more than 10s
Remote outages propagate slowly
25
Remote outages result in downtime
Time (s)
0 20 40 60 80 100 120 140Time (s)
0
20
40
60
80
100
Pack
et lo
ss (%
)
Failu
rePacket loss (%)
27
Remote outages result in downtime
0 20 40 60 80 100 120 140Time (s)
0
20
40
60
80
100
Pack
et lo
ss (%
)
Failu
rePacket loss (%)
Time (s)
28
Remote outages result in downtime
0 20 40 60 80 100 120 140Time (s)
0
20
40
60
80
100
Pack
et lo
ss (%
)
Failu
rePacket loss (%)
Time (s)
110s
29
1. SWIFT reduces the learning time of a withdrawal from 13s to 2s using a control-plane prediction algorithm
SWIFT: Predictive Fast Reroute
32
1. SWIFT reduces the learning time of a withdrawal from 13s to 2s using a control-plane prediction algorithm
2. SWIFT fast reroutes the affected traffic towards unaffected paths in 40ms using a hierarchical forwarding table
SWIFT: Predictive Fast Reroute
33
1. SWIFT reduces the learning time of a withdrawal from 13s to 2s using a control-plane prediction algorithm
2. SWIFT fast reroutes the affected traffic towards unaffected paths in 40ms using a hierarchical forwarding table
3. SWIFT is deployable on existing routers
SWIFT: Predictive Fast Reroute
34
1. SWIFT reduces the learning time of a withdrawal from 13s to 2s using a control-plane prediction algorithm
2. SWIFT fast reroutes the affected traffic towards unaffected paths in 40ms using a hierarchical forwarding table
3. SWIFT is deployable on existing routers
SWIFT: Predictive Fast Reroute
35
37
Existing RCA techniques *
accuracy
# vantage points
detection speed
Lifeguard, SIGCOMM’14 Poiroot, SIGCOMM’13Feldmann et al., SIGCOMM’04
*
38
Existing RCA techniques *
link/AS
multiple
O(minute)
Lifeguard, SIGCOMM’14 Poiroot, SIGCOMM’13Feldmann et al., SIGCOMM’04
*
accuracy
# vantage points
detection speed
39
region
1
O(second)
SWIFTExisting RCA techniques *
link/AS
multiple
O(minute)
Lifeguard, SIGCOMM’14 Poiroot, SIGCOMM’13Feldmann et al., SIGCOMM’04
*
accuracy
# vantage points
detection speed
detect and locate the outage
(A,B) 0.3
(A,D) 0.9
… …
monitor the stream of BGP messages
Time
# withdrawals
Links failure probability
#1
#2
Link (A,D) is dead
46
detect and locate the outage
(A,B) 0.3
(A,D) 0.9
… … Link (A,D) is dead
predict future withdrawals
monitor the stream of BGP messages
Time
# withdrawals
predicted
Links failure probability
#1
#2
#3
47
21 5 6
7
8
3
SWIFT
UPDATES orWITHDRAWALS
10k prefixes
10k prefixes
SWIFT link failure inference algorithmleverages the affected and unaffected prefixes
1k prefixes
48
21 5 6
7
8
3
SWIFT
10k prefixes
10k prefixes
candidates set
(2,5) (5,6) (6,7) (6,8)AS links
1k prefixes
49
21 5 6
7
8
3
SWIFT
10k prefixes
10k prefixes
candidates set
(2,5) (5,6) (6,7) (6,8)
(2,5) (5,6) (6,7) (6,8)
affected prefixes
candidates set
(2,5) (5,6) (6,7) (6,8)AS links
1k prefixes
from AS7
50
21 5 6
7
8
3
SWIFT
10k prefixes
10k prefixes
from AS7 (2,5) (5,6) (6,7) (6,8)
from AS8 (2,5) (5,6) (6,7) (6,8)
affected prefixes
candidates set
(2,5) (5,6) (6,7) (6,8)
candidates set
(2,5) (5,6) (6,7) (6,8)AS links
1k prefixes
51
(2,5) (5,6) (6,7) (6,8)
21 5 6
7
8
3
SWIFT
10k prefixes
10k prefixes
from AS7 (2,5) (5,6) (6,7) (6,8)
from AS8 (2,5) (5,6) (6,7) (6,8)
affected prefixes
from AS5
unaffected prefixes
candidates set
(2,5) (5,6) (6,7) (6,8)
candidates set
(2,5) (5,6) (6,7) (6,8)AS links
1k prefixes
52
21 5 6
7
8
3
SWIFT
10k prefixes
10k prefixes
(2,5) (5,6) (6,7) (6,8)
(2,5) (5,6) (6,7) (6,8)
affected prefixes
unaffected prefixes(2,5) (5,6) (6,7) (6,8)
is the failed link
candidates set
(2,5) (5,6) (6,7) (6,8)
candidates set
(2,5) (5,6) (6,7) (6,8)AS links
1k prefixes
from AS7
from AS8
from AS5
53
In practice, SWIFT uses two metricsComputed on a per AS-link basis
Withdrawals share
WS (l, t)Path sharing
PS (l, t)
Fraction of withdrawals crossing link l
Proportion of prefixeswithdrawn on link l
54
Withdrawals share
WS (l, t)Path sharing
PS (l, t)
Fraction of withdrawals crossing link l
Proportion of prefixeswithdrawn on link l
FS (l, t) = X
Fit Scorefor link l
In practice, SWIFT uses two metricsComputed on a per AS-link basis
55
To be fast, we can’t wait for all the withdrawalsChallenge #1
Theorem Under perfect conditions, SWIFT alwaysreturns a set of links including the failed link
SWIFT runs the link inference early during the burst in order to predict future withdrawals
57
To be fast, we can’t wait for all the withdrawalsChallenge #1
Theorem Under perfect conditions, SWIFT alwaysreturns a set of links including the failed link
SWIFT runs the link inference early during the burst in order to predict future withdrawals
An outage can affect multiple AS linksChallenge #2SWIFT link inference algorithm returns a set of links, all with a high fit score
58
SWIFT link failure inference is accurate
False Positives Rate (%)
True Positives Rate (%)
to be avoided
61
SWIFT link failure inference is accurate
False Positives Rate (%)
True Positives Rate (%)
unnecessary traffic shifts
62
SWIFT link failure inference is accurate
False Positives Rate (%)
True Positives Rate (%)
BGP speed
63
SWIFT link failure inference is accurate
False Positives Rate (%)
True Positives Rate (%)
fastconvergence
64
1. SWIFT reduces the learning time of a withdrawal from 13s to 2s using a control-plane prediction algorithm
2. SWIFT fast reroutes the affected traffic towards unaffected paths in 40ms using a hierarchical forwarding table
3. SWIFT is deployable on existing routers
SWIFT: Predictive Fast Reroute
67
1. SWIFT reduces the learning time of a withdrawal from 13s to 2s using a control-plane prediction algorithm
2. SWIFT fast reroutes the affected traffic towards unaffected paths in 40ms using a hierarchical forwarding table
3. SWIFT is deployable on existing routers
SWIFT: Predictive Fast Reroute
68
Fast rerouting the traffic upon remote outageshas several challenges
A remote outage can affect any set of prefixes
70
Fast rerouting the traffic upon remote outageshas several challenges
A remote outage can affect any set of prefixes
A remote outage can affect backup paths
71
Fast rerouting the traffic upon remote outageshas several challenges
A remote outage can affect any set of prefixes
and we want to be fast
A remote outage can affect backup paths
72
pos. prefix Pointer
1 1.0.0.0/24 0x1112 1.0.1.0/24 0x111
600K 0x111
… … …
Mapping table
Pointer table
Pointer NH
0x1113 1.0.2.0/24 0x211
Hierarchical Forwarding
port 1
port 2
0x211 3
…
port 3
200.0.0.0/24
To solve these challenges,SWIFT relies on a hierarchical forwarding table
2
73
pos. prefix Pointer
1 1.0.0.0/24 0x1112 1.0.1.0/24 0x111
600K 0x111
… … …
Mapping table
Pointer table
Pointer NH
0x1113 1.0.2.0/24 0x211
2
Hierarchical Forwarding
port 1
port 2
0x211 3
…
port 3
200.0.0.0/24
To solve these challenges,SWIFT relies on a hierarchical forwarding table
74
pos. prefix Pointer
1 1.0.0.0/24 0x1112 1.0.1.0/24 0x111
600K 0x111
… … …
Mapping table
Pointer table
Pointer NH
0x1113 1.0.2.0/24 0x211
2
Hierarchical Forwarding
port 1
port 2
0x211 3
…
port 3
200.0.0.0/24
To solve these challenges,SWIFT relies on a hierarchical forwarding table
75
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K
… … …
Mapping table
Pointer table
Pointer NH
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 3
SWIFT uses custom pointers
0xA25BD0xEF12D
0xAFBC3
0xAFBC3
200.0.0.0/24
76
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K
… … …
Mapping table
Pointer table
Pointer NH
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 3
A indicates aprimary next-hop
0xA25BD0xEF12D
0xAFBC3
0xAFBC3
200.0.0.0/24
SWIFT uses custom pointers
77
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K
… … …
Mapping table
Pointer table
Pointer NH
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 312
A indicates aprimary next-hop
0xA25BD0xEF12D
0xAFBC3
0xAFBC3 0xA****0xE****
200.0.0.0/24
SWIFT uses custom pointers
78
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K
… … …
Mapping table
Pointer table
Pointer NH
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 3
indicates theAS path used
12
0xA25BD0xEF12D
0xAFBC3
0xAFBC3 0xA****0xE****
200.0.0.0/24
SWIFT uses custom pointers
79
0xA25BD0xEF12D
0xAFBC3
0xAFBC3
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K
… … …
Mapping table
Pointer table
Pointer NH
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 30xA*** 10xE*** 2
indicates this prefixis traversing link (5,6)
200.0.0.0/24
SWIFT uses custom pointers
80
0xAFBC3
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K
… … …
Mapping table
Pointer table
Pointer NH
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 30xA*** 10xE*** 2
indicates this prefixis traversing link (6,7)
0xA25BD0xEF12D
0xAFBC3200.0.0.0/24
SWIFT uses custom pointers
81
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K
… … …
Mapping table
Pointer table
Pointer NH
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 3
indicates theAS path used
12
0xA25BD0xEF12D
0xAFBC3
0xAFBC3 0xA****0xE****
200.0.0.0/24
SWIFT uses custom pointers
82
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K
… … …
Mapping table
Pointer table
Pointer NH
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 3
fast reroutes the traffictraversing the link (5,6)
12
30xA25BD0xEF12D
0xAFBC3
0xAFBC3 0xA****0xE****
0x*F***
200.0.0.0/24
SWIFT uses custom pointers
83
SWIFT reroutes the affected trafficto unaffected backup paths
21SWIFT
5 6
7
8
3
4
10k prefixes
10k prefixes
84
21SWIFT
5 6
7
8
3
4
SWIFT reroutes the affected trafficto unaffected backup paths
BGP may use a disrupted backup path
10k prefixes
10k prefixes
85
21SWIFT
5 6
7
8
3
4
SWIFT can bypassthe failure
BGP may use a disrupted backup path
SWIFT reroutes the affected trafficto unaffected backup paths
10k prefixes
10k prefixes
86
0xA25BD0xEF12D
0xAFBC3
0xAFBC3
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K 200.0.0.0/24
… … …
Mapping table
Pointer table
Pointer NH
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 3
indicates thebackup next-hops
0xA**** 10xE**** 2
SWIFT encodes the next-hops in the pointers too
87
0xA25BD0xEF12D
0xAFBC3
0xAFBC3
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K 200.0.0.0/24
… … …
Mapping table
Pointer table
Pointer NH
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 30xA**** 10xE**** 2
indicates the backup next-hopto use if (5,6) fails
SWIFT encodes the next-hops in the pointers too
88
0xA25BD0xEF12D
0xAFBC3
0xAFBC3
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K 200.0.0.0/24
… … …
Mapping table
Pointer table
Pointer NH
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 30xA**** 10xE**** 2
indicates the backup next-hopto use if (6,7) fails
SWIFT encodes the next-hops in the pointers too
89
0xA25BD0xEF12D
0xAFBC3
0xAFBC3
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K 200.0.0.0/24
… … …
Mapping table
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 3
Pointer table
Pointer NH
0xA**** 10xE**** 2
0x*F*3* 3
fast reroutes the traffictowards a valid backup path
SWIFT encodes the next-hops in the pointers too
90
pointer
as-path encoding primary and backup next-hops
48 bits
Pointers are data-plane tags with limited size
92
pointer
as-path encoding primary and backup next-hops
SWIFT only encodes themost important AS links
48 bits
Pointers are data-plane tags with limited size
93
SWIFT as path encoding algorithm enables to reroutemost of the affected prefixes, with few bits only
94
SWIFT as path encoding algorithm enables to reroutemost of the affected prefixes, with few bits only
95
18
median is 99%
SWIFT as path encoding algorithm enables to reroutemost of the affected prefixes, with few bits only
96
pointer
48 bits
as-path encoding primary and backup next-hops
18 bits 30 bits
Pointers are data-plane tags with limited size
97
1. SWIFT reduces the learning time of a withdrawal from 13s to 2s using a control-plane prediction algorithm
2. SWIFT fast reroutes the affected traffic towards unaffected paths in 40ms using a hierarchical forwarding table
3. SWIFT is deployable on existing routers
100
SWIFT: Predictive Fast Reroute
1. SWIFT reduces the learning time of a withdrawal from 13s to 2s using a control-plane prediction algorithm
2. SWIFT fast reroutes the affected traffic towards unaffected paths in 40ms using a hierarchical forwarding table
3. SWIFT is deployable on existing routers
101
SWIFT: Predictive Fast Reroute
SWIFT is deployable on existing routersequipped with a hierarchical forwarding table
103
what if we don’t have one though?
0xA25BD0xEF12D
0xAFBC3
0xAFBC3
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K 200.0.0.0/24
… … …
Mapping table
3 1.0.2.0/24
Hierarchical Forwarding
port 1
port 2
port 3
Pointer table
Pointer NH
0xA*** 10xE*** 2
0x*F*3 3
104
port 1
port 2
port 3
IP Router
SDN switch
0xA25BD0xEF12D
0xAFBC3
0xAFBC3
pos. prefix Pointer
1 1.0.0.0/242 1.0.1.0/24
600K 200.0.0.0/24
… … …
3 1.0.2.0/24
Mapping table
NH
0xA*** 10xE*** 2
0x*F*3 3
Pointer table
Pointer
105
Pointer tableport 1
port 2
port 3
IP Router
SDN switch
0xA25BD0xEF12D
0xAFBC3
0xAFBC3
pos. prefix
1 1.0.0.0/242 1.0.1.0/24
600K 200.0.0.0/24
… … …
3 1.0.2.0/24
Mapping table
NH
0xA*** 10xE*** 2
0x*F*3 3
Dest. MAC
Dest. MAC
106
0 20 40 60 80 100 120 140Time (s)
0
20
40
60
80
100
Pack
et lo
ss (%
)
Failu
re
Time (s)
Packet loss (%)
108
SWIFT reduces the convergence time from ~110s to 2s
0 20 40 60 80 100 120 140Time (s)
0
20
40
60
80
100
Pack
et lo
ss (%
)
Failu
re BGPSWIFT
Time (s)
Packet loss (%)
109
0 20 40 60 80 100 120 140Time (s)
0
20
40
60
80
100
Pack
et lo
ss (%
)
Failu
re BGPSWIFT
Time (s)
Packet loss (%)
98% speed upimprovement
SWIFT reduces the convergence time from ~110s to 2s
110
Speeds up the learning phase by predicting the control-plane85% accuracy early in the burst
Quickly fast reroutes the affected traffic to unaffected pathsfor 99% of the prefixes
Is deployable on existing routers98% speed up improvement
swift.ethz.ch (source code, VM + demo!)
SWIFT: Predictive Fast Reroute
111