Upload
mayda
View
23
Download
0
Embed Size (px)
DESCRIPTION
CS 258 Parallel Computer Architecture Lecture 6 Router Design Fault Tolerance. February 8, 2002 Prof John D. Kubiatowicz http://www.cs.berkeley.edu/~kubitron/cs258. Recall: Deadlock Freedom. How can it arise? necessary conditions: shared resource incrementally allocated non-preemptible - PowerPoint PPT Presentation
Citation preview
CS 258 Parallel Computer Architecture
Lecture 6
Router DesignFault Tolerance
February 8, 2002Prof John D. Kubiatowicz
http://www.cs.berkeley.edu/~kubitron/cs258
2/8/02John Kubiatowicz
Slide 6.2
Recall: Deadlock Freedom• How can it arise?
– necessary conditions:» shared resource» incrementally allocated» non-preemptible
– think of a channel as a shared resource that is acquired incrementally
» source buffer then dest. buffer» channels along a route
• How do you avoid it?– constrain how channel resources are allocated– ex: dimension order
• How do you prove that a routing algorithm is deadlock free
2/8/02John Kubiatowicz
Slide 6.3
Recall: Channel Dependence Graph
1 2 3
01200 01 02 03
10 11 12 13
20 21 22 23
30 31 32 33
17
18
1916
17
18
1 2 3
012
1718 1718 1718 1718
1 2 3
012
1817 1817 1817 1817
1 2 3
012
1916 1916 1916 1916
1 2 3
012
2/8/02John Kubiatowicz
Slide 6.4
Turn Restrictions in X,Y
• XY routing forbids 4 of 8 turns and leaves no room for adaptive routing
• Can you allow more turns and still be deadlock free
+Y
-Y
+X-X
2/8/02John Kubiatowicz
Slide 6.5
Minimal turn restrictions in 2D
West-first
north-last negative first
-x +x
+y
-y
2/8/02John Kubiatowicz
Slide 6.6
Example legal west-first routes
• Can route around failures or congestion• Can combine turn restrictions with virtual
channels
2/8/02John Kubiatowicz
Slide 6.7
Adaptive Routing• R: C x N x -> C• Essential for fault tolerance
– at least multipath• Can improve utilization of the network• Simple deterministic algorithms easily run into
bad permutations
• fully/partially adaptive, minimal/non-minimal• can introduce complexity or anomolies• little adaptation goes a long way!
2/8/02John Kubiatowicz
Slide 6.8
Recall: Virtual channels
Packet switchesfrom lo to hi channel
2/8/02John Kubiatowicz
Slide 6.9
Switch Design
• Flow control?• Routing Control?• Fault tolerance?• Retransmission?
Cross-bar
InputBuffer
Control
OutputPorts
Input Receiver Transmiter
Ports
Routing, Scheduling
OutputBuffer
2/8/02John Kubiatowicz
Slide 6.10
Other Adaptive Algorithms• Adaptive in all dimensions
– Linder and Harden paper– Requires 2d-1 virtual channels/physical link
• Planar Adaptive routing (Chien and Kim)– Take 2 dimensions at a time and adapt within them:
» (X,Y), then (Y,Z), then (Z,W), etc.– Many of advantages of the fully adaptive, much less
hardware!
2/8/02John Kubiatowicz
Slide 6.11
How do you build a crossbar
Io
I1
I2
I3
Io I1 I2 I3
O0
Oi
O2
O3
RAMphase
O0
OiO2O3
DoutDin
Io
I1I2I3
addr
2/8/02John Kubiatowicz
Slide 6.12
Input buffered swtich
• Independent routing logic per input– FSM
• Scheduler logic arbitrates each output– priority, FIFO, random
• Head-of-line blocking problem
Cross-bar
OutputPorts
Input Ports
Scheduling
R0
R1
R2
R3
2/8/02John Kubiatowicz
Slide 6.13
Output Buffered Switch
• How would you build a shared pool?
Control
OutputPorts
Input Ports
OutputPorts
OutputPorts
OutputPorts
R0
R1
R2
R3
2/8/02John Kubiatowicz
Slide 6.14
Example: IBM SP vulcan switch
• Many gigabit ethernet switches use similar design without the cut-through
FIFO
CRCcheck
Routecontrol
FlowControl
8 8
Des
eria
lizer
64
Input Port
RAM64x128
InArb
OutArb
8 x 8Crossbar
CentralQueue
FIFO
CRCGen
FlowControl
8 8Seria
lizer
64
Ouput Port
XBarArb
FIFO
CRCcheck
Routecontrol
FlowControl
8 8
Des
eria
lizerInput Port
°°°
64
°°°
FIFO
CRCGen
FlowControl
8 8Seria
lizer
Ouput Port
XBarArb
8
°°°
8
2/8/02John Kubiatowicz
Slide 6.15
Output scheduling
• n independent arbitration problems?– static priority, random, round-robin
• simplifications due to routing algorithm?• general case is max bipartite matching
Cross-bar
OutputPorts
R0
R1
R2
R3
O0
O1
O2
InputBuffers
2/8/02John Kubiatowicz
Slide 6.16
Stacked Dimension Switches• Dimension order
on 3D cube?
• Cube connected cycles?
Host Out
Host In
Xin
Yin
Zin
Xout
Yout
Zout
2x2
2x2
2x2
2/8/02John Kubiatowicz
Slide 6.17
Flow Control• What do you do when push comes to
shove?– ethernet: collision detection and retry after delay– FDDI, token ring: arbitration token– TCP/WAN: buffer, drop, adjust rate– any solution must adjust to output rate
• Link-level flow control
Data
Ready
2/8/02John Kubiatowicz
Slide 6.18
Examples• Short Links
• long links– several flits on the wire
Sour
ce
Des
tinat
ion
Data
Req
Ready/AckF/E F/E
2/8/02John Kubiatowicz
Slide 6.19
Smoothing the flow
• How much slack do you need to maximize bandwidth?
LowMark
HighMark
Empty
Full
Stop
Go
Incoming Phits
Outgoing Phits
Flow-control Symbols
2/8/02John Kubiatowicz
Slide 6.20
Link vs global flow control• Hot Spots• Global communication operations• Natural parallel program dependences
2/8/02John Kubiatowicz
Slide 6.21
Reliable Router• Bidirectional signaling (Conserve pins)
– Current-mode drive– Transmitter subtracts its injection of current to
compute receiver’s value• Clock synchronization: Pleisiochronous
– Clocks “almost” same frequency at all nodes– By sending occasional dummy cells, transmit rate <
receive rate• Deadlock: 5 virtual channels
– 2 minimal-adaptive (always route toward destination)– 2 dimension ordered (i.e. 2 priorities of channel)– 1 non-minimal turn-restricted channel
• Error Handling– “Unique Token protocol”– Every flit in two places at a time– Flow control maintains this invariant
2/8/02John Kubiatowicz
Slide 6.22
Example: T3D
• 3D bidirectional torus, dimension order (NIC selected), virtual cut-through, packet sw.
• 16 bit x 150 MHz, short, wide, synch.• rotating priority per output• logically separate request/response• 3 independent, stacked switches• 8 16-bit flits on each of 4 VC in each directions
Ro ute TagDest PECo mmand
Ro ute TagDest PECo mmand
Route TagDest PECommand
Route TagDest PEComman d
Route TagDest PECommand
R oute T agD est PEC ommand
R oute TagD est PEC ommand
R ead Req - no cache - cache - prefetch - fetch&in c
Ad dr 0Ad dr 1Src PE
R ead Resp Read Resp - cached
Word 0 Word 0Word 1Word 2Word 3
Write Req - Proc - BLT 1 - fetch&inc
Addr 0Addr 1Src PEWord 0
Addr 0Addr 1Src PEWord 0Word 1Word 2Word 3
Write Req - proc 4 - BLT 4
Write Resp
A ddr 0A ddr 1Src PEA ddr 0A ddr 1
B LT R ead Req
Packet Type req/resp coomand3 1 8
2/8/02John Kubiatowicz
Slide 6.23
Example: SP
• 8-port switch, 40 MB/s per link, 8-bit phit, 16-bit flit, single 40 MHz clock
• packet sw, cut-through, no virtual channel, source-based routing
• variable packet <= 255 bytes, 31 byte fifo per input, 7 bytes per output, 16 phit links
• 128 8-byte ‘chunks’ in central queue, LRU per output• run in shadow mode
P0P1P2P3 P15
E0E1E2E3 E15
Intra-Rack Host Ports
Inter-Rack External Switch Ports
16-node Rack
SwitchBoard
Multi-rack Configuration
2/8/02John Kubiatowicz
Slide 6.24
Summary• Routing Algorithms restrict the set of routes
within the topology– simple mechanism selects turn at each hop– arithmetic, selection, lookup
• Deadlock-free if channel dependence graph is acyclic– limit turns to eliminate dependences– add separate channel resources to break dependences– combination of topology, algorithm, and switch design
• Deterministic vs adaptive routing• Switch design issues
– input/output/pooled buffering, routing logic, selection logic
• Flow control• Real networks are a ‘package’ of design
choices