Buffered Crossbars With Performance Guarantees Shang-Tse (Da) Chuang Department of Electrical Engineering, Stanford University, http://yuba.stanford.edu/~stchuang EE384Y Thursday, April 29, 2004

Page 1: Buffered Crossbars With Performance Guarantees

Buffered Crossbars With Performance Guarantees

Shang-Tse (Da) Chuang
Department of Electrical Engineering, Stanford University
http://yuba.stanford.edu/~stchuang

EE384Y
Thursday, April 29, 2004

Page 2: Buffered Crossbars With Performance Guarantees

Motivation

Network operators want performance guarantees:
  Throughput guarantee
  Delay guarantee

High-performance routers use crossbars.

It is hard to build crossbar-based routers with guarantees.

My talk: how a crossbar with a small amount of internal buffering can give guarantees.

Page 3: Buffered Crossbars With Performance Guarantees

Contents

Throughput Guarantees
  Buffered Crossbar - 100% Throughput
  Buffered Crossbar - Work Conservation

Delay Guarantees
  Traditional Crossbar – Emulating an OQ Switch
  Buffered Crossbar – Emulating an OQ Switch

Page 4: Buffered Crossbars With Performance Guarantees

Generic Crossbar-Based Architecture

[Figure: crossbar with virtual output queues (VOQs) at the inputs, a scheduler, and a speedup of S]

Page 5: Buffered Crossbars With Performance Guarantees

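Admissible Traffic

Traffic matrix: $\Lambda = [\lambda_{ij}]$, where $\lambda_{ij}$ is the arrival rate from input $i$ to output $j$.

Traffic is admissible if

$\sum_i \lambda_{ij} < 1 \ \ \forall j, \qquad \sum_j \lambda_{ij} < 1 \ \ \forall i$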
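The admissibility condition can be checked mechanically. The sketch below assumes the strict form of the condition (every row sum and every column sum of the traffic matrix is below 1); the function name `is_admissible` and the example matrices are made up for illustration and do not come from the talk.

```python
def is_admissible(rates):
    """Return True if every row sum and column sum of the rate matrix is below 1.

    rates[i][j] is the (hypothetical) arrival rate from input i to output j.
    """
    n = len(rates)
    rows_ok = all(sum(rates[i]) < 1 for i in range(n))
    cols_ok = all(sum(rates[i][j] for i in range(n)) < 1 for j in range(n))
    return rows_ok and cols_ok


# Illustrative matrices (made-up numbers, not from the talk).
uniform = [[0.3, 0.3, 0.3] for _ in range(3)]   # every row and column sums to about 0.9
hotspot = [[0.6, 0.1], [0.5, 0.2]]              # column 0 sums to 1.1

print(is_admissible(uniform))  # True
print(is_admissible(hotspot))  # False
```

In the second matrix no input is overloaded, but output 0 is offered more than one cell per time slot, so no scheduler could keep its backlog finite.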

Page 6: Buffered Crossbars With Performance Guarantees

Throughput Guarantee

100% Throughput
  An algorithm delivers 100% throughput if, for any admissible traffic, the average backlog is finite.

[Figure: crossbar with a scheduler and a speedup of S]

Page 7: Buffered Crossbars With Performance Guarantees

Previous Work

[Figure: timeline from 1985 to 2005 of crossbar scheduling algorithms, grouped into "Heuristics" and "Theoretically Proven"]

Wave Front Arbiter [Tamir]
Parallel Iterative Matching [Anderson et al.]
iSLIP [McKeown]
Longest Port First [Mekkittikul et al.]
Maximum Weight Matching [McKeown et al.]
Maximal Matching, S=2 [Dai, Prabhakar]

Page 8: Buffered Crossbars With Performance Guarantees

Maximal Matching Has Become Hard

TTX Switch Fabric
  Uses maximal matching
  Speedup less than 2
  Consumes up to 8 kW
  Limited to ~2.5 Tb/s
  No 100% throughput guarantee

Page 9: Buffered Crossbars With Performance Guarantees

Traditional Crossbar

Crossbar Requirements
  An input can send at most one cell
  An output can receive at most one cell

Scheduling Problem
  Must overcome two constraints simultaneously

New Crossbar
  Relieve contention
  Remove dependency between inputs and outputs


Page 11: Buffered Crossbars With Performance Guarantees

Buffered Crossbar

Arrival Phase
Scheduling Phases – Speedup of 2
Departure Phase

Page 12: Buffered Crossbars With Performance Guarantees

Scheduling Phase

Input Schedule
  Each input selects, in parallel, a cell for an empty crosspoint

Output Schedule
  Each output selects, in parallel, a cell from a full crosspoint

Page 13: Buffered Crossbars With Performance Guarantees

Example of Input/Output Scheduling

Round-robin Policy
  Each input schedules in round-robin order
  Each output schedules in round-robin order
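The round-robin input and output schedules can be sketched as one scheduling phase over a small switch model. This is a minimal illustration, not the talk's implementation: the data layout (`voq` cell counts, one-cell crosspoints in `xp`, per-port pointers) and the choice to run the input schedule before the output schedule within a phase are assumptions.

```python
def scheduling_phase(voq, xp, in_ptr, out_ptr):
    """Run one scheduling phase in place and return the (input, output) cells delivered.

    voq[i][j]  : number of cells queued at input i for output j (assumed layout)
    xp[i][j]   : True if crosspoint (i, j) holds a cell (one-cell crosspoints assumed)
    in_ptr[i]  : round-robin pointer of input i
    out_ptr[j] : round-robin pointer of output j
    """
    n = len(voq)
    # Input schedule: each input independently picks one cell for an empty crosspoint.
    for i in range(n):
        for step in range(n):
            j = (in_ptr[i] + step) % n
            if voq[i][j] > 0 and not xp[i][j]:
                voq[i][j] -= 1
                xp[i][j] = True
                in_ptr[i] = (j + 1) % n
                break
    # Output schedule: each output independently drains one full crosspoint.
    delivered = []
    for j in range(n):
        for step in range(n):
            i = (out_ptr[j] + step) % n
            if xp[i][j]:
                xp[i][j] = False
                out_ptr[j] = (i + 1) % n
                delivered.append((i, j))
                break
    return delivered


# Made-up 2x2 example: both inputs hold a cell for output 0.
voq = [[1, 0], [1, 1]]
xp = [[False, False], [False, False]]
in_ptr, out_ptr = [0, 0], [0, 0]

first = scheduling_phase(voq, xp, in_ptr, out_ptr)
second = scheduling_phase(voq, xp, in_ptr, out_ptr)
print(first)   # [(0, 0)]
print(second)  # [(1, 0), (1, 1)]
```

In the first phase both inputs place a cell for output 0 into their own crosspoints without coordinating, which a traditional crossbar would not allow; the output then drains those crosspoints one per phase.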

Page 14: Buffered Crossbars With Performance Guarantees

Previous Work

Buffered Crossbar Simulations [Rojas-Cessa et al. 2001]
  32x32 switch, Uniform Bernoulli Traffic, Round-Robin, S=1

[Figure: Average Delay (Cell Time), log scale from 0.01 to 1000, versus Offered Load p from 0.025 to 0.925; curves for 1-SLIP, 4-SLIP, Buffered Crossbar, and Ideal Router]

Page 15: Buffered Crossbars With Performance Guarantees

100% Throughput

Theorem 1
  A buffered crossbar with speedup of 2 delivers 100% throughput for any admissible Bernoulli iid traffic using any work-conserving input/output schedules.

Page 16: Buffered Crossbars With Performance Guarantees

Intuition of Proof

[Figure: arrival rates labeled ε, < 1-ε, and < 1-ε]

$(1-\varepsilon) + (1-\varepsilon) + \varepsilon = 2 - \varepsilon$

When a flow is backed up, the service for its backlog exceeds the arrivals.

Page 17: Buffered Crossbars With Performance Guarantees

Intuition of Proof

$Q_{ij}$ = queue length

$B_{ij} = 0$ if buffer empty, $1$ if buffer full

$X_{ij} = \sum_k Q_{ik} + \sum_m (Q_{mj} + B_{mj})$

Page 18: Buffered Crossbars With Performance Guarantees

Intuition of Proof

Recall $X_{ij} = \sum_k Q_{ik} + \sum_m (Q_{mj} + B_{mj})$

If $Q_{ij} > 0$, then for $X_{ij}$:
  Expected increase is less than 2
  Expected decrease is 2
    If $B_{ij} = 1$, then in the output schedule one $B_{*j}$ will decrease
    If $B_{ij} = 0$, then in the input schedule one $Q_{i*}$ will decrease
    Thus the expected decrease is 2


Page 20: Buffered Crossbars With Performance Guarantees

Work Conservation

Work-conserving Property
  If there is a cell for a given output in the system, that output is busy.

[Figure: Output Queued (OQ) Switch]

Page 21: Buffered Crossbars With Performance Guarantees

Emulating an OQ switch

Under identical inputs, the departure time of every cell from both switches is identical.
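To make "departure time" concrete, the reference FIFO OQ switch can be modeled in a few lines: each output sends one cell per time slot in arrival order. This is a sketch under stated assumptions, namely that a cell may depart in the slot in which it arrives and that arrivals are given in arrival order; the function name and the sample arrivals are hypothetical.

```python
def fifo_oq_departures(arrivals, n_outputs):
    """Departure slot of each cell in an ideal FIFO output-queued switch.

    arrivals: list of (arrival_slot, output) pairs in arrival order
              (assumed format; slots are non-decreasing).
    """
    next_free = [0] * n_outputs          # first slot at which each output line is idle
    departures = []
    for slot, out in arrivals:
        depart = max(slot, next_free[out])   # wait behind earlier cells for the same output
        departures.append(depart)
        next_free[out] = depart + 1          # one cell per output per slot
    return departures


# Made-up arrivals: three cells for output 0, one for output 1.
arrivals = [(0, 0), (0, 0), (0, 1), (1, 0)]
print(fifo_oq_departures(arrivals, n_outputs=2))  # [0, 1, 0, 2]
```

A switch emulates the OQ switch exactly if, fed the same arrivals, every cell leaves in the slot this model assigns to it.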

Page 22: Buffered Crossbars With Performance Guarantees

Input Priority List

[Figure: example switch state with each cell labeled by its departure time]

Label each cell with its corresponding departure time
Arrange input cells into an input priority list
Output selects crosspoint with earliest departure time

Page 23: Buffered Crossbars With Performance Guarantees

Input Priority List

[Figure: example switch state with each cell labeled by its departure time; relative to one cell, another is marked "Good guy" and others "Bad guys"]

Label each cell with its corresponding departure time
Arrange input cells into an input priority list
Output selects crosspoint with earliest departure time

Page 24: Buffered Crossbars With Performance Guarantees

Definitions

[Figure: example cell with 2 good guys and 2 bad guys]

Output Margin – cells at its output with earlier departure time
Input Margin – cells ahead in input priority list destined to different outputs
Total Margin – Output Margin minus Input Margin
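The three margins can be computed directly from their definitions. The sketch below assumes cells are represented as `(output, departure_time)` pairs, that the input priority list is ordered head first, and that `output_cells` maps each output to the departure times of cells already waiting there; these names and the sample state are hypothetical.

```python
def margins(priority_list, position, output_cells):
    """Return (output_margin, input_margin, total_margin) for one cell.

    priority_list: cells at this input, head first, as (output, departure_time)
    position     : index of the cell of interest in priority_list
    output_cells : dict mapping output -> departure times of cells held at that output
    """
    output, depart = priority_list[position]
    # Output margin: cells at the cell's output that depart earlier.
    output_margin = sum(1 for d in output_cells[output] if d < depart)
    # Input margin: cells ahead in the priority list that go to a different output.
    input_margin = sum(1 for (o, _) in priority_list[:position] if o != output)
    return output_margin, input_margin, output_margin - input_margin


# Made-up state: the cell (0, 4) has two cells ahead of it for other outputs,
# and two earlier-departing cells waiting at output 0.
priority_list = [(1, 5), (2, 7), (0, 4), (0, 9)]
output_cells = {0: [2, 3], 1: [], 2: []}
print(margins(priority_list, 2, output_cells))  # (2, 2, 0)
```

The example mirrors the slide's case of two good guys and two bad guys, which gives a total margin of zero.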

Page 25: Buffered Crossbars With Performance Guarantees

Emulation of FIFO OQ Switch

[Figure: example switch state with each cell labeled by its departure time]

Scheduling Phase
  Crosspoint is full – Output Margin will increase by one
  Crosspoint is empty – Input Margin will decrease by one

Total Margin increases by two

Page 26: Buffered Crossbars With Performance Guarantees

Emulation of FIFO OQ Switch

[Figure: example switch state with each cell labeled by its departure time]

Arrival Phase
  Input Margin might increase by one

Departure Phase
  Output Margin will decrease by one

Total Margin decreases by at most two

Page 27: Buffered Crossbars With Performance Guarantees

Emulation of FIFO OQ Switch

[Figure: example switch state with each cell labeled by its departure time]

Lemma 1
  For every time slot, total margin does not decrease

Page 28: Buffered Crossbars With Performance Guarantees

FIFO Insertion Policy

[Figure: example switch state with arriving cells inserted into the input priority list]

Arrival Phase
  Cell for non-empty VOQ: insert behind cells for same output
  Cell for empty VOQ: insert at head of input priority list
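The insertion rule above can be sketched as a small list operation. It reads "behind cells for same output" as "immediately after the last queued cell for that output", which is an interpretation; the `(output, departure_time)` cell format and the function name `fifo_insert` are likewise assumptions.

```python
def fifo_insert(priority_list, cell):
    """Insert an arriving cell into the input priority list (head first), in place.

    cell is an (output, departure_time) pair (assumed format).
    """
    output = cell[0]
    last = -1
    for idx, (o, _) in enumerate(priority_list):
        if o == output:
            last = idx               # remember the last queued cell for this output
    # Non-empty VOQ: directly behind that cell. Empty VOQ: last == -1, so index 0 (head).
    priority_list.insert(last + 1, cell)


# Made-up list with cells for outputs 0, 1, and 2.
plist = [(0, 1), (1, 2), (0, 3), (2, 4)]
fifo_insert(plist, (0, 5))   # VOQ for output 0 is non-empty
fifo_insert(plist, (3, 1))   # VOQ for output 3 is empty
print(plist)  # [(3, 1), (0, 1), (1, 2), (0, 3), (0, 5), (2, 4)]
```

A cell inserted at the head has no cells ahead of it, so its input margin is zero at arrival.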

Page 29: Buffered Crossbars With Performance Guarantees

FIFO Insertion Policy

[Figure: example switch state with arriving cells inserted into the input priority list]

Lemma 2
  An arriving cell will have a non-negative total margin

Page 30: Buffered Crossbars With Performance Guarantees

Emulation of FIFO OQ Switch

Theorem 2
  A buffered crossbar with speedup of 2 can exactly emulate a FIFO OQ switch.

Result was shown independently:
  B. Magill, C. Rohrs, R. Stevenson, “Output-Queued Switch Emulation by Fabrics With Limited Memory,” IEEE Journal on Selected Areas in Communications, pp. 606-615, May 2003.

Theorem 3
  A buffered crossbar with speedup of 2 can be work-conserving with a distributed algorithm.


Page 32: Buffered Crossbars With Performance Guarantees

Delay Guarantees

One output, many logical FIFO queues (1 ... m)
  Constrained traffic
  Weighted fair queueing sorts packets

One output, single PIFO queue
  Constrained traffic
  Push In First Out (PIFO): an arriving cell is pushed into the queue

PIFO models
  Weighted Fair Queueing
  Weighted Round Robin
  Strict priority
  etc.
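A PIFO queue can be sketched as a sorted list: an arriving cell is pushed into the position given by its rank, cells already queued keep their relative order, and departures always come from the head. The class below is a minimal illustration; the name `PIFO`, the numeric `rank` key, and the strict-priority example are assumptions, and a real scheduler would derive the rank from its own discipline (for instance a WFQ finish time).

```python
import bisect


class PIFO:
    """Push-In First-Out queue: push anywhere by rank, always pop from the head."""

    def __init__(self):
        self._ranks = []   # sorted ranks, smallest first
        self._cells = []   # cells in the same order as their ranks

    def push(self, rank, cell):
        # bisect_right keeps arrival order among cells of equal rank.
        pos = bisect.bisect_right(self._ranks, rank)
        self._ranks.insert(pos, rank)
        self._cells.insert(pos, cell)

    def pop(self):
        self._ranks.pop(0)
        return self._cells.pop(0)

    def __len__(self):
        return len(self._cells)


# Strict priority as a PIFO: rank 0 is high priority, rank 1 is low priority.
q = PIFO()
q.push(1, "low-a")
q.push(0, "high")
q.push(1, "low-b")
order = [q.pop() for _ in range(len(q))]
print(order)  # ['high', 'low-a', 'low-b']
```

A plain FIFO is the special case in which every arriving cell is given a rank no smaller than all ranks already queued.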

Page 33: Buffered Crossbars With Performance Guarantees

Achieving Delay Guarantees in Crossbars

Theorem 4
  A crossbar switch with a speedup of 2 can exactly emulate an OQ switch which provides delay guarantees.

Theorem 5
  A crossbar switch with a speedup of 2-1/N is necessary and sufficient to exactly emulate an NxN FIFO OQ switch.


Page 35: Buffered Crossbars With Performance Guarantees

Emulation of PIFO OQ Switch

[Figure: example switch state with each cell labeled by its departure time]

Crosspoint Blocking
  A cell in the crosspoint has a larger departure time

Swap Phase
  If an arriving cell has a smaller departure time than the cell in the crosspoint, swap the two cells
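The swap rule is a single comparison. The sketch below assumes both cells belong to the same input/output pair, that cells are `(output, departure_time)` pairs, and that an empty crosspoint is represented by `None`; the function name `swap_phase` is hypothetical.

```python
def swap_phase(crosspoint_cell, arriving_cell):
    """Return (cell_for_crosspoint, cell_for_input_queue) after the swap phase.

    crosspoint_cell is None when the crosspoint is empty (assumed convention);
    in that case nothing is swapped and the arriving cell stays at the input.
    """
    if crosspoint_cell is not None and arriving_cell[1] < crosspoint_cell[1]:
        # The arriving cell departs earlier: it takes the crosspoint,
        # and the displaced cell goes back to the input queue.
        return arriving_cell, crosspoint_cell
    return crosspoint_cell, arriving_cell


print(swap_phase((0, 7), (0, 3)))  # ((0, 3), (0, 7))
print(swap_phase((0, 2), (0, 3)))  # ((0, 2), (0, 3))
```

After the swap, the crosspoint never holds a cell that departs later than a cell for the same output still waiting at that input, which removes the blocking case described above.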

Page 36: Buffered Crossbars With Performance Guarantees

PIFO Insertion Policy

[Figure: example switch state with arriving cells inserted into the input priority list]

Arrival Phase
  Insert cell directly behind cell with departure time just earlier
  If cell has earliest departure time, then insert at head of input priority list

Page 37: Buffered Crossbars With Performance Guarantees

PIFO Emulation

Theorem 6
  A buffered crossbar with speedup of 3 can exactly emulate an OQ switch with delay guarantees.

Page 38: Buffered Crossbars With Performance Guarantees

Header Scheduling Architecture

[Figure: Input Linecard, Buffered Crossbar, Header Scheduler, and Output Linecard; signals labeled Headers and Grants]

Page 39: Buffered Crossbars With Performance Guarantees

Header Scheduling

[Figure: example switch state with numbered headers and grants]

Schedule headers instead of cells
Headers are converted into grants in output schedule
Grants are sent back to the input

Page 40: Buffered Crossbars With Performance Guarantees

Grant Stream

[Figure: Input Linecard with Grant FIFO, Buffered Crossbar, Header Scheduler, and Output Linecard; signals labeled Headers and Grants]

Input can receive N grants in one scheduling phase
Bounded to p+N-1 grants over p consecutive phases
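One consequence of the grant bound can be worked out in a line, under an assumption the slide does not state: that the input removes one grant from its grant FIFO in every scheduling phase in which the FIFO is non-empty. Here $G(s,t)$ (grants received in phases $s$ through $t$) and $B(t)$ (grant FIFO occupancy after phase $t$) are notation introduced only for this sketch.

```latex
% Grant bound from the slide, with p = t - s + 1 consecutive phases:
%   G(s,t) \le p + N - 1
% Occupancy of a FIFO served at one grant per phase, starting empty:
B(t) \;=\; \max\Bigl(0,\ \max_{1 \le s \le t}\bigl[\,G(s,t) - (t - s + 1)\,\bigr]\Bigr)
     \;\le\; (p + N - 1) - p \;=\; N - 1 .
```

Under that assumption the grant FIFO never holds more than $N-1$ grants, so a grant waits a bounded number of phases before it is served.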

Page 41: Buffered Crossbars With Performance Guarantees

Counter Example

[Figure: contents of the Grant FIFO, Grants, Crosspoints, and Cells To Output Queue over phases p=1 through p=6]

Page 42: Buffered Crossbars With Performance Guarantees

Modified Buffered Crossbar

N cells per crosspoint – requires $N^3$ cell buffers
N cells per output – requires $N^2$ cell buffers

Theorem 7
  A modified buffered crossbar with speedup of 2 can emulate an OQ switch with delay guarantees with a fixed delay of N scheduling phases.
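The two buffer counts follow from counting buffer locations in an $N \times N$ switch. The worked numbers below use $N = 32$, the switch size quoted on the simulation slide, purely as an illustration.

```latex
% N cells at each of the N^2 crosspoints:
N^2 \cdot N = N^3, \qquad N = 32 \;\Rightarrow\; 32^3 = 32768 \text{ cell buffers}
% N cells at each of the N outputs:
N \cdot N = N^2, \qquad N = 32 \;\Rightarrow\; 32^2 = 1024 \text{ cell buffers}
```

Buffering per output rather than per crosspoint reduces the memory requirement by a factor of $N$.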

Page 43: Buffered Crossbars With Performance Guarantees

Summary

Buffered crossbars
  Use crosspoints to relieve contention
  Inputs and outputs schedule independently and in parallel

Performance guarantees
  Throughput – any work-conserving input/output schedule
  Work Conservation – simple insertion policy
  Delay – header scheduling

Page 44: Buffered Crossbars With Performance Guarantees

Relevant Papers

Crossbars
  Shang-Tse Chuang, Ashish Goel, Nick McKeown, Balaji Prabhakar, “Matching Output Queuing with a Combined Input Output Queued Switch,” IEEE Journal on Selected Areas in Communications, vol. 17, no. 6, pp. 1030-1039, Dec. 1999.

Buffered Crossbars
  Shang-Tse Chuang, Sundar Iyer, Nick McKeown, “Practical Algorithms for Performance Guarantees in Buffered Crossbars,” Stanford HPNG Technical Report TR03-HPNG-061501.