1
Scheduling Reserved Traffic
in Input-Queued Switches:
New Delay Bounds via Probabilistic Techniques
Milan Vojnović, EPFL
Joint work with Matthew Andrews, Bell Laboratories, Murray Hill, NJ
LCA Seminars Talk, EPFL, March 27, 2003
2
Introduction: Input-Queued Switch
[Figure: an I x I crossbar connecting input ports 1, 2, 3, ..., I to output ports 1, 2, ..., I]
At any point in time, connectivity is restricted to a permutation matrix: each input port is connected to at most one output port, and vice versa
3
Some Existing Approaches for Crossbar Scheduling
• maximum-weight matching (McKeown ‘96, many others)
• decomposition-based scheduling (Chang et al, 2000)
• fluid-tracking (Tabatabaee et al, ToN ’01)
4
Decomposition-Based Scheduling
Given: M = [mij], an I x I rate demand matrix
mij: intensity of the service to be offered to the ij-th input/output port pair
Assume: M is doubly sub-stochastic
Constraint: crossbar
Find: a decomposition of M into permutation matrices, and a schedule of these matrices such that the intensity of the service offered to the ij-th input/output port pair is at least mij
5
Decomposition-Based Sched. (cont’d)
Observation: a solution to the problem ensures a service rate of at least M in the long run
Desired Property: broadly speaking, we also want the schedule to be “smooth” (“non-bursty”), that is, the transmission slots should be offered evenly to every input-output port pair
Observation: smoothness is a short-run property
6
A Decomposition: Birkhoff/von Neumann
Birkhoff/von Neumann (e.g. Chvátal ‘84, p. 330): any doubly stochastic matrix M is a convex combination of permutation matrices, that is
$M = \sum_{k=1}^{K} \phi_k M_k$, with $K \le I^2 - 2I + 2$
$M_k$ is a permutation matrix
$\phi_k$ is the intensity of the k-th permutation matrix
Other decompositions can be used for doubly sub-stochastic M;
Birkhoff/von Neumann maximizes throughput
Birkhoff/von Neumann was applied to the switch problem by Chang et al (2000)
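A minimal sketch of one way to compute such a decomposition, assuming exact rational entries. It is a greedy peeling procedure (find a permutation inside the positive support, subtract as much of it as possible, repeat), not necessarily the procedure used by Chang et al. The function names `perfect_matching` and `greedy_bvn` are hypothetical.

```python
from fractions import Fraction

def perfect_matching(support):
    """Kuhn's augmenting-path matching. support[i] lists the columns usable by row i.
    Returns perm with perm[i] = column matched to row i, or None if no perfect matching."""
    n = len(support)
    col_owner = [None] * n  # col_owner[j] = row currently matched to column j

    def try_row(i, seen):
        for j in support[i]:
            if j in seen:
                continue
            seen.add(j)
            # Take column j if it is free, or if its owner can be moved elsewhere.
            if col_owner[j] is None or try_row(col_owner[j], seen):
                col_owner[j] = i
                return True
        return False

    for i in range(n):
        if not try_row(i, set()):
            return None
    perm = [None] * n
    for j, i in enumerate(col_owner):
        perm[i] = j
    return perm

def greedy_bvn(M):
    """Decompose a doubly stochastic matrix into a list of (phi_k, perm_k) terms."""
    n = len(M)
    R = [[Fraction(x) for x in row] for row in M]  # residual matrix
    terms = []
    while any(x > 0 for row in R for x in row):
        support = [[j for j in range(n) if R[i][j] > 0] for i in range(n)]
        perm = perfect_matching(support)
        if perm is None:
            raise ValueError("matrix is not doubly stochastic")
        phi = min(R[i][perm[i]] for i in range(n))  # largest weight we can peel off
        for i in range(n):
            R[i][perm[i]] -= phi
        terms.append((phi, perm))
    return terms
```

Each pass zeroes at least one entry of the residual, so the loop stops after at most I^2 passes. Using `Fraction` avoids the rounding that could otherwise leave tiny positive residues.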
7
The Problem that We Study
Given: M1, M2, …, MK a sequence of permutation matrices
Find: schedules with a guarantee on their smoothness
“smooth” quantified through the concept of latency defined shortly
8
Why is the Problem Important
• Rate provision, but also delay-jitter guarantees: for DiffServ classes such as EF (Expedited Forwarding), for MPLS, and for building a good Connection-Reservation-Table that offers guaranteed service to control traffic inside a switch
9
Related Work
When load is not more than 1/4 (Giles and Hajek ‘97): a schedule exists such that each pair ij is scheduled at least once in every $1/\rho_{ij}$ slots, where $\rho_{ij}$ is the rate of pair ij
When load is 1 (Chang et al ‘00): Birkhoff/von Neumann decomposition plus PGPS scheduling of the resulting permutation matrices; a latency bound then exists (shown shortly)
10
Related Work (cont’d)
Mean-delay results:
• Leonardi et al (Infocom’01): a maximum-weight matching switch uniformly loaded with $\rho < 1$ has mean delay $E[W_{ij}]$ of order $I / (1 - \rho)$
• Shah and Kopikare (Infocom’02): a switch with Bernoulli arrivals of load $\rho < 1$, whose scheduler at each slot picks a permutation matrix uniformly at random over the entire set of I! permutation matrices, has mean delay $E[W_{ij}]$ of order $(I - 1) / (1 - \rho)$
11
Content
• Method to Construct Schedules
• Latency definition used
• Latencies of 4 schedulers: Random-Permutation, Random-Phase, Random-Distortion, Poisson Competition
• Numerical Examples
• Tasting some of the Methods Used to Obtain Results
• Conclusion
12
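Method to Construct a Schedule: Superposition of Marked Point Processes
[Figure: timelines starting at 0 for the point processes $N_1$ (intensity $\phi_1$), $N_2$ (intensity $\phi_2$), ..., $N_K$ (intensity $\phi_K$), and for their superposition $N$ (intensity $\sum_{k=1}^{K} \phi_k$). The points of $N$ are $T_1, T_2, \ldots$; the marks 1, 2, ... attached to these points, read in order, form the schedule.]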
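The construction can be sketched as follows: merge the token epochs of all types in time order, and let the type of the n-th token decide which permutation matrix is used in slot n. This is only an illustration. The helper names are hypothetical, tokens are placed at multiples of $1/\phi_k$ for the example, and ties are broken by type index, which is an assumption.

```python
import heapq
from fractions import Fraction

def tokens_at_multiples(phi, horizon):
    """Token epochs n/phi for n = 1, 2, ... up to and including `horizon`."""
    times, n = [], 1
    while Fraction(n) / phi <= horizon:
        times.append(Fraction(n) / phi)
        n += 1
    return times

def superpose(token_times):
    """token_times[k] is the sorted list of epochs of type-k tokens.
    Returns the schedule: the type of the 1st, 2nd, 3rd, ... token in time order."""
    streams = [[(t, k) for t in times] for k, times in enumerate(token_times)]
    # heapq.merge keeps the global time order; equal epochs are ordered by type index.
    return [k for _, k in heapq.merge(*streams)]

# Three permutation matrices with intensities 1/2, 1/3 and 1/6 over six slots.
phis = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
schedule = superpose([tokens_at_multiples(p, 6) for p in phis])
```

The schedulers in this talk differ only in how the token epochs are generated; the merge step stays the same.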
13
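Latency of a Schedule
Latency 1: for any n, m, there exists $E^1_{ij} \ge 0$ such that $\{ N_{ij}[T_n, T_{n+m}) \ge \rho_{ij} (m - E^1_{ij}) \}$
Latency 2: for any n, there exists $E^2_{ij} \ge 0$ such that $\{ \forall m \ge 0 : N_{ij}[T_n, T_{n+m}) \ge \rho_{ij} (m - E^2_{ij}) \}$
Latency 3: there exists $E^3_{ij} \ge 0$ such that $\{ \forall n \ge 0, m \ge 0 : N_{ij}[T_n, T_{n+m}) \ge \rho_{ij} (m - E^3_{ij}) \}$
Here $N_{ij}[T_n, T_{n+m}) := \sum_{k \in S_{ij}} N_k[T_n, T_{n+m})$ and $\rho_{ij}$ is the rate of pair ij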
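For a periodic schedule, the smallest latency of type 3 can be found by brute force over all windows. The sketch below assumes a schedule given as a list of permutation-matrix types, one per slot, repeated forever, and takes the rate of the pair to be its fraction of slots in one period. The name `latency3` is hypothetical.

```python
from fractions import Fraction

def latency3(schedule, serving_types):
    """Smallest E >= 0 with N[n, n+m) >= rho * (m - E) for all n, m,
    where `schedule` is one period and `serving_types` are the types that serve the pair."""
    L = len(schedule)
    hits = [1 if k in serving_types else 0 for k in schedule]
    rho = Fraction(sum(hits), L)
    if rho == 0:
        raise ValueError("the pair is never served")
    worst = Fraction(0)
    # By periodicity it is enough to check starts n < L and window lengths m <= L.
    for n in range(L):
        count = 0
        for m in range(1, L + 1):
            count += hits[(n + m - 1) % L]
            worst = max(worst, m - count / rho)  # deficit expressed in slots
    return worst
```

For example, `latency3([0, 1, 0, 1], {0})` gives 1 while `latency3([0, 0, 1, 1], {0})` gives 2: both schedules have rate 1/2, but the second is burstier.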
14
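Latency of a Schedule (cont’d)
[Figure: the number of slots offered to the ij-th port pair in [0, m), $N_{ij}[T_0, T_m)$, plotted against m. It stays above the rate-latency line $\rho_{ij} (m - E^3_{ij})$, which is the line $\rho_{ij} m$ shifted to the right by $E^3_{ij}$.]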
15
It is Valuable to have an Input-Output Port Characterized with Rate-Latency
$b_{ij}(m) = \rho_{ij} (m - E^3_{ij})$
• It is a bound on the lateness of the slots offered to the ij-th port pair
• It is a strict (rate-latency) service curve
• Having an input-output port pair characterized with a service curve enables us to use known results from Network Calculus to bound backlog and delay for appropriately characterized arrival traffic
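As an illustration of the kind of known result meant here: for arrivals constrained by a token bucket (burst `sigma`, rate `r`) and a rate-latency service curve with rate `rho` and latency `latency`, the standard Network Calculus bounds are a delay of at most `latency + sigma / rho` and a backlog of at most `sigma + r * latency`, provided `r <= rho`. The token-bucket arrival model and the function names are assumptions made for this sketch.

```python
def delay_bound(sigma, r, rho, latency):
    """Worst-case delay (in slots) for token-bucket arrivals (sigma, r)
    served with a rate-latency curve rho * max(m - latency, 0)."""
    if r > rho:
        raise ValueError("arrival rate exceeds service rate: no finite bound")
    return latency + sigma / rho

def backlog_bound(sigma, r, rho, latency):
    """Worst-case backlog (in cells) under the same assumptions."""
    if r > rho:
        raise ValueError("arrival rate exceeds service rate: no finite bound")
    return sigma + r * latency
```

A smaller latency $E^3_{ij}$ therefore translates directly into smaller delay and backlog guarantees for the pair.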
16
Scheduler by Chang et al
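[Figure: token arrivals feed a PGPS scheduler; served tokens are placed back as new arrivals]
$E^3_{ij} \le \min\left( \frac{K}{\rho_{ij}},\ \frac{|S_{ij}|}{\rho_{ij}} + K - 1 \right)$
Initialization: a token of type k arrives at $1/\phi_k$
$S_{ij}$: subset of perm. matrices with ij element equal to 1
$\rho_{ij} = \sum_{k \in S_{ij}} \phi_k$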
17
Scheduler by Chang et al (cont’d)
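[Figure: token timelines starting at 0. Tokens 1 sit at $1/\phi_1, 2/\phi_1, 3/\phi_1, \ldots$; Tokens 2 at $1/\phi_2, 2/\phi_2, \ldots$; Tokens K at $1/\phi_K, 2/\phi_K, \ldots$. A further timeline shows the resulting schedule.]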
18
Scheduler by Chang et al (cont’d)
The bound of Chang et al is almost tight
One can construct an example that almost attains the bound, see the paper
19
Smooth per-permutation matrix may not mean smooth per input-output port
• An input-output port pair may be scheduled by more than one permutation matrix
• The aggregate of a subset of permutation matrices may not be smoothly scheduled, even though the schedule of the individual permutation matrices is smooth
If each input-output port pair had a 1 in exactly one perm. matrix, the problem would be equivalent to classical polling
Random Permutation Scheduler
[Figure: token timelines for Tokens 1, Tokens 2, ..., Tokens K and the resulting schedule. On [0, 1), type k has $l_k$ tokens (numbered 1, 2, ..., $l_k$), so that $\phi_k = l_k / L$; every later unit interval is a copy from [0, 1).]
Latency of Random Permutation Scheduler
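Result 1: Fix some $0 < \varepsilon < 1$. With probability $1 - \varepsilon$,
$E^3_{ij} \sim A \sqrt{\frac{1 - \rho_{ij}}{\rho_{ij}}} \sqrt{L}$, L large,
where A solves
$\sum_{k=1}^{\infty} (4 k^2 A^2 - 1) e^{-2 k^2 A^2} = \frac{\varepsilon}{2}$
(for $E^2_{ij}$, the same estimate holds with $A = \sqrt{\frac{1}{2} \ln \frac{1}{\varepsilon}}$)
Here $L = \sum_{k=1}^{K} l_k$. Latency ~ $\sqrt{L}$!
Worst case: $(1 - \rho_{ij}) L$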
22
Flavor of a Way to Obtain the Result
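$\{ Y \le \rho_{ij} E^3_{ij} \}$ (definition of the latency 3)
$Y := \max_{1 \le k \le 2L} X_k - \min_{1 \le k \le 2L} X_k$ (the schedule is period-L)
$X_k := N_{ij}[T_0, T_k) - \rho_{ij} k$
$\frac{Y}{\sqrt{L}} \to_d \sqrt{\rho_{ij} (1 - \rho_{ij})}\, W$ as $L \to \infty$
$W = \sup_{0 \le t \le 1} B^0(t) - \inf_{0 \le t \le 1} B^0(t)$, the range of Brownian bridge
$P(W > w) = 2 \sum_{k=1}^{\infty} (4 k^2 w^2 - 1) e^{-2 k^2 w^2}$ (known result)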
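The known tail formula for the range W of a Brownian bridge, $P(W > w) = 2 \sum_{k \ge 1} (4 k^2 w^2 - 1) e^{-2 k^2 w^2}$, can be inverted numerically to get the level exceeded with a given small probability. The sketch below truncates the series and uses bisection. The function names, the truncation at 100 terms and the search bracket [0.5, 5] are choices made for illustration only.

```python
import math

def range_tail(w, terms=100):
    """P(W > w) for the range W of a Brownian bridge (truncated series)."""
    total = 0.0
    for k in range(1, terms + 1):
        a = 2.0 * k * k * w * w
        total += (2.0 * a - 1.0) * math.exp(-a)  # (4 k^2 w^2 - 1) e^{-2 k^2 w^2}
    return 2.0 * total

def range_quantile(eps, lo=0.5, hi=5.0, iters=80):
    """Level w with P(W > w) = eps, found by bisection on the decreasing tail."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if range_tail(mid) > eps:
            lo = mid   # tail still too large: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, `range_quantile(0.05)` is close to 1.75, so the level grows only slowly as the target probability shrinks.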
24
Random-Phase Scheduler
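[Figure: token timelines for Tokens 1, Tokens 2, ..., Tokens K and the resulting schedule, as for the scheduler of Chang et al, with tokens of type k spaced $1/\phi_k$ apart ($1/\phi_k, 2/\phi_k, \ldots$) but with the whole stream of type k shifted by a random phase $U_k / \phi_k$.]
$U_1, U_2, \ldots, U_K$ i.i.d. uniform(0, 1)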
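A minimal sketch of a random-phase schedule, under the assumption that the tokens of type k sit at $(n + U_k)/\phi_k$ for n = 0, 1, 2, ... and that each intensity is $\phi_k = l_k / L$ for integers $l_k$. The function name and the argument layout are hypothetical.

```python
import random

def random_phase_schedule(l, periods=1, rng=None):
    """l[k] = number of slots per period for permutation matrix k (phi_k = l[k] / L).
    Returns the matrix type used in each of the first L * periods slots."""
    rng = rng or random.Random()
    L = sum(l)
    tokens = []
    for k, lk in enumerate(l):
        u = rng.random()                    # one phase U_k per type
        for n in range(lk * periods):
            # epoch (n + U_k) / phi_k, always inside [0, L * periods)
            tokens.append(((n + u) * L / lk, k))
    tokens.sort()                           # superposition in time order
    return [k for _, k in tokens]
```

Only K random numbers are drawn, one per permutation matrix, which is what makes the later derandomization step tractable.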
25
Random-Phase Scheduler (cont’d)
Result 2: Assume the intensity of each permutation matrix is an integer multiple of 1/L. With probability 1,
$E^3_{ij} \le \frac{|S_{ij}|}{\rho_{ij}} + 2 + \sqrt{2 K \ln(2L + 1)}$
26
Random-Distortion Scheduler
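[Figure: token timelines for Tokens 1, Tokens 2, ..., Tokens K and the resulting schedule, with tokens of type k spaced $1/\phi_k$ apart ($1/\phi_k, 2/\phi_k, \ldots$). The i-th token of type k is displaced independently by $U_{k,i} / \phi_k$: the first tokens by $U_{1,1}/\phi_1, U_{2,1}/\phi_2, \ldots, U_{K,1}/\phi_K$, the second tokens by $U_{1,2}/\phi_1, U_{2,2}/\phi_2, \ldots, U_{K,2}/\phi_K$.]
$U_{1,i}, U_{2,i}, \ldots, U_{K,i}$, $i = 1, 2, \ldots$, i.i.d. uniform(0, 1)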
27
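Random-Distortion Scheduler (cont’d)
Result 3: Assume the intensity of each permutation matrix is an integer multiple of 1/L. With probability 1,
$E^3_{ij} \le \frac{1}{\rho_{ij}} \sqrt{2 |S_{ij}| \ln D} + 2 + \sqrt{2 K \ln D}$
where D is a constant determined by $I$ and $\min_k \phi_k$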
28
Poisson-Competition Scheduler
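$N_k \sim$ Poisson($\phi_k$)
Amounts to: at a slot, the permutation matrix is of type k ~ Bernoulli($\phi_k$)
[Equation: bound on the latency $E^2_{ij}$, in terms of $\rho_{ij}$ and a logarithmic factor]
For latency 2: the event that the maximum over $m \ge 1$ of the service deficit of $N_{ij}[T_n, T_{n+m})$ stays below the level set by $E^2_{ij}$
• The maximum is the waiting time of a Geo/D/1 queue (known)
• Brownian approximation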
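Read slot by slot, this scheduler is easy to sketch: each slot independently uses permutation matrix k with probability $\phi_k$. The function name is hypothetical, and the intensities are assumed to sum to 1.

```python
import random

def poisson_competition_schedule(phi, num_slots, rng=None):
    """Independently for each slot, pick matrix type k with probability phi[k]."""
    rng = rng or random.Random()
    return rng.choices(range(len(phi)), weights=phi, k=num_slots)
```

Unlike the periodic schedulers, the number of slots given to a pair over a window fluctuates without a hard bound, which is why the latency here is stated in probability.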
29
Numerical Evaluations
Goal: Evaluate latencies over a large set of service rate matrices (matrix M defined earlier)
Algorithm to generate stochastic matrices
Begin (k=0): set IxI matrix M such that [mij]=1/L, all ij
Step (k), k=1,…,k0:
• draw i1, j1, i2, j2 uniformly at random on 1,2,…,I
• draw d uniformly at random on [0,min(mi1j1,mi2j2)]
• [mi1j1]<-[mi1j1]-d, [mi2j2]<-[mi2j2]-d, [mi1j2]<-[mi1j2]+d, [mi2j1]<-[mi2j1]+d
Evolution of M is a Markov chain
One perhaps may prefer to generate M uniformly at random over the space of doubly stochastic matrices
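The steps above can be sketched as follows. One assumption is made loudly here: the matrix is started with every entry equal to 1/I, so that rows and columns sum to 1 (the slide writes 1/L for the initial entries). The function name is hypothetical.

```python
import random

def random_doubly_stochastic(I, steps, rng=None):
    """Markov-chain generator: start uniform, then repeatedly move mass d
    from cells (i1, j1), (i2, j2) to cells (i1, j2), (i2, j1)."""
    rng = rng or random.Random()
    m = [[1.0 / I] * I for _ in range(I)]   # assumption: uniform start at 1/I
    for _ in range(steps):
        i1, j1, i2, j2 = (rng.randrange(I) for _ in range(4))
        if i1 == i2 or j1 == j2:
            continue                        # the update would cancel out anyway
        d = rng.uniform(0.0, min(m[i1][j1], m[i2][j2]))
        m[i1][j1] -= d
        m[i2][j2] -= d
        m[i1][j2] += d
        m[i2][j1] += d
    return m
```

Every step leaves all row and column sums unchanged and keeps the entries non-negative, so the matrix stays doubly stochastic throughout.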
30
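Numerical Evaluations: varying switch size
[Figure: the maximum over port pairs ij of the latency bound ($E^3_{ij}$), plotted against the switch size I, for the PGPS, random-phase and random-distortion bounds]
Ob.: except for small switch sizes,
• the random-phase bound is tighter than PGPS;
• the random-distortion bound is tightest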
31
Numerical Evaluations: per port-pair latencies for a 64x64 matrix
[Figure: fraction of port pairs ij whose latency bound ($E^3_{ij}$) is at most x, plotted against x; L=4096, K=2423]
Ob.:
• the fraction is larger for the random-phase than PGPS
• for large enough x, the fraction is largest for the random-distortion
34
Preliminaries
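“Good” Event: $G_{n,m} = \{ N_{ij}[T_n, T_{n+m}) \ge \rho_{ij} (m - E) \}$
Assume: $\alpha_1, \alpha_2 \in \mathbb{N}$, $\alpha_3, \alpha_4 \in \mathbb{R}$
$t = n + m - \alpha_1$
$s = n + \alpha_2$
Result 1: $N_{ij}[s, t) \ge \rho_{ij} (t - s) - (\alpha_3 + \alpha_4)$ & $[s, t) \subseteq [T_n, T_{n+m})$ $\Rightarrow$ $N_{ij}[T_n, T_{n+m}) \ge \rho_{ij} (m - E)$
$E = \alpha_1 + \alpha_2 + \frac{\alpha_3 + \alpha_4}{\rho_{ij}}$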
35
Preliminaries Cont’d
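Result 2: $N[0, t) \le t + \alpha_1$ & $N[0, s) \ge s - \alpha_2$ $\Rightarrow$ $[s, t) \subseteq [T_n, T_{n+m})$
Putting the Pieces Together:
$G_{n,m} \supseteq \{ N[0, t) \le t + \alpha_1 \} \cap \{ N[0, s) \ge s - \alpha_2 \} \cap \{ N_{ij}[s, t) \ge \rho_{ij} (t - s) - (\alpha_3 + \alpha_4) \}$
$G_{n,m}$ is implied by events that are easier to handle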
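A short worked check of how the pieces combine into a latency. It assumes the sign conventions $t = n + m - \alpha_1$ and $s = n + \alpha_2$, with $\rho_{ij}$ the rate of pair ij; these are a reading of the slides rather than something they state explicitly.

```latex
\begin{align*}
N_{ij}[T_n, T_{n+m})
  &\ge N_{ij}[s, t)
     && \text{since } [s, t) \subseteq [T_n, T_{n+m}) \\
  &\ge \rho_{ij}\,(t - s) - (\alpha_3 + \alpha_4) \\
  &= \rho_{ij}\,(m - \alpha_1 - \alpha_2) - (\alpha_3 + \alpha_4)
     && \text{since } t - s = m - \alpha_1 - \alpha_2 \\
  &= \rho_{ij}\Bigl(m - \Bigl[\alpha_1 + \alpha_2
       + \frac{\alpha_3 + \alpha_4}{\rho_{ij}}\Bigr]\Bigr)
\end{align*}
```

Under these assumptions the good event holds with $E = \alpha_1 + \alpha_2 + (\alpha_3 + \alpha_4)/\rho_{ij}$: the first two terms come from the slot counts of the whole schedule, the last from the smoothness of the pair's own slots.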
36
Random-phase Scheduler
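Scheduler def: $N_k[0, t) = \phi_k t + X_{t,k}$
$X_{t,k} = 1\{ U_k \le \phi_k t - \lfloor \phi_k t \rfloor \} - (\phi_k t - \lfloor \phi_k t \rfloor)$, $U_k \sim \mathrm{Unif}(0, 1)$
$N_k[s, t) \ge \phi_k (t - s) - 1$, all $s \le t$
$N_{ij}[s, t) \ge \rho_{ij} (t - s) - |S_{ij}|$, all $s \le t$
Assume $\alpha_3 = \alpha_4 = \frac{1}{2} |S_{ij}|$
Then $G_{n,m} \supseteq \{ N[0, t) \le t + \alpha_1 \} \cap \{ N[0, s) \ge s - \alpha_2 \}$
Remains only to handle two events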
37
Random-phase Scheduler (cont’d)
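Note $N[0, t) = \sum_{k=1}^{K} N_k[0, t) = t + \sum_{k=1}^{K} X_{t,k}$, and $E[N[0, t)] = t$
Hoeffding: $P(N[0, t) > t + \alpha_1) \le e^{-2 \alpha_1^2 / K}$
Similarly: $P(N[0, s) < s - \alpha_2) \le e^{-2 \alpha_2^2 / K}$
Finally,
$P\left( \bigcap_{ij} \bigcap_{n,m} G_{n,m} \right) \ge 1 - \sum_{t=1}^{L} P(N[0, t) > t + \alpha_1) - \sum_{s=1}^{L} P(N[0, s) < s - \alpha_2) \ge 1 - \frac{2L}{2L + 1} > 0$
(the sums run to L because of periodicity)
with $\alpha_1 = \alpha_2 := \sqrt{\frac{K}{2} \ln(2L + 1)}$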
38
Random-phase Scheduler: DERANDOMIZATION
Method of conditional probabilities
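Assume $A_1, A_2, \ldots, A_{n_1}$ a sequence of events
$Y_1, Y_2, \ldots, Y_{n_2}$ a sequence of r.v.'s
$X_{i,1}, X_{i,2}, \ldots, X_{i,n_2}$ an array of r.v.'s, $i = 1, 2, \ldots, n_1$
$P(A_i \mid Y_1 = z_1, \ldots, Y_m = z_m) \le E\left[ \prod_{k=1}^{n_2} f_{ik}(X_{ik}) \mid Y_1 = z_1, \ldots, Y_m = z_m \right]$
for any $z_1, z_2, \ldots, z_m$ and some $x \mapsto f_{ik}(x)$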
Random-phase Scheduler: DERANDOMIZATION (cont’d)
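Result: there exist $y_1, y_2, \ldots, y_{n_2}$ such that
$\sum_i P(A_i \mid Y_1 = y_1, \ldots, Y_{n_2} = y_{n_2}) \le \sum_i E\left[ \prod_{k=1}^{n_2} f_{ik}(X_{ik}) \right]$
In addition, if
$\sum_i E\left[ \prod_{k=1}^{n_2} f_{ik}(X_{ik}) \right] < 1$
$A_i$ is completely determined by $Y_1, \ldots, Y_{n_2}$
$Y_1, \ldots, Y_{n_2}$ are mutually independent
$X_{i,k} = \psi_{ik}(Y_k)$, for some $x \mapsto \psi_{ik}(x)$
Then $\sum_i P(A_i \mid Y_1 = y_1, \ldots, Y_{n_2} = y_{n_2}) < 1$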
40
Random-phase Scheduler: DERANDOMIZATION (cont’d)
Application to our problem
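$Y_k = U_k$
$X_{i,k} = \psi_{ik}(U_k)$, $\psi_{ik}(x) = 1\{ x \le \phi_k i - \lfloor \phi_k i \rfloor \} - (\phi_k i - \lfloor \phi_k i \rfloor)$
$x \mapsto f_{ik}(x)$ from Hoeffding
$A_k = \{ N[0, k) > k + \alpha_1 \}$
We showed
$\sum_{t=1}^{L} P(N[0, t) > t + \alpha_1) + \sum_{s=1}^{L} P(N[0, s) < s - \alpha_2) \le \frac{2L}{2L + 1} < 1$
By the method of cond. prob., it follows that the latency holds w.p. 1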
41
Conclusion
• We showed that one can obtain less pessimistic bounds on latency that hold in probability
• One can derandomize and obtain latencies that hold with probability 1
• In many cases the obtained latencies are better than the best-known latency
• The point-process approach may be used to construct other schedulers
• It is worth trying to obtain sharper results
• The question remains: what is the best possible latency for load larger than 1/4?