MIT 6.823 Fall 2021, L16
On-Chip Networks II: Router Microarchitecture & Routing
Daniel Sanchez
Computer Science & Artificial Intelligence Lab, M.I.T.
November 3, 2021


Recap: Wormhole Flow Control

• Each router manages its buffers in units of flits

• Each packet is sent through the output link as soon as possible (without waiting for all its flits to arrive)

• Router buffers are not large enough to hold a full packet; on congestion, a packet’s flits are often buffered across several routers

• Problem: On congestion, links assigned to a blocked packet cannot be used by other packets

[Figure: wormhole flow control with packets A and B; labels: blocked, idle]

Recap: Virtual-Channel Flow Control

• When a packet blocks, it holds on to a virtual channel instead of holding on to the physical channel

• Virtual channel (VC) = channel state + flit buffers

• Multiple virtual channels reduce blocking

• Ex: Wormhole (=1 VC/channel) vs 2 VCs/channel

[Figure: VC flow control with 2 VCs/channel, packets A and B; labels: blocked, VCs]

Time-Space View: Virtual-Channel

[Figure: time-space diagram of virtual-channel flow control; label: # flits in VC buffer]

Interconnection Network Architecture

• Topology: How to connect the nodes up? (processors, memories, router line cards, …)

• Routing: Which path should a message take?

• Flow control: How is the message actually forwarded from source to destination?

• Router microarchitecture: How to build the routers?

• Link microarchitecture: How to build the links?


Router Microarchitecture

Ring-based Interconnect

[Figure: ring interconnect with nodes labeled P and C]

Ring Stop

[Figure: ring stop with a latch, an input, and an output]

Allow input if there is no traffic on the ring

If there is traffic on the ring, should the ring traffic or the new input get priority?

Ring Flow Control: Priorities

Rotary Rule – traffic in the ring has priority

Ring Flow Control: Bounces

• What if traffic on the ring cannot get delivered, e.g., if output FIFO is full?

• One alternative: Continue on ring (bounce)

• What are the consequences of such bounces?
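The rotary rule and the bounce alternative can be sketched as a single per-cycle decision at one ring stop. This is only an illustration under assumptions that are not on the slides: the function name `ring_stop`, the `(dest, payload)` message tuples, and the choice to let a local injection reuse a slot freed in the same cycle are all hypothetical.

```python
def ring_stop(ring_slot, inject, my_id, output_fifo_free):
    """Simulate one cycle at one ring stop (hypothetical model).

    ring_slot: None, or a (dest, payload) message arriving on the ring.
    inject:    None, or a (dest, payload) message the local node wants to send.
    Returns (slot_forwarded_downstream, delivered_message, injected_flag).
    """
    delivered, injected = None, False

    # A message addressed to this stop leaves the ring only if the output
    # FIFO has room; otherwise it stays in its slot and bounces around.
    if ring_slot is not None and ring_slot[0] == my_id and output_fifo_free:
        delivered, ring_slot = ring_slot, None

    # Rotary rule: the local input may use the slot only when it is empty.
    if ring_slot is None and inject is not None:
        ring_slot, injected = inject, True

    return ring_slot, delivered, injected
```

For example, `ring_stop((2, "x"), (3, "y"), my_id=1, output_fifo_free=True)` forwards the ring message untouched and leaves the local message waiting, while `ring_stop((1, "x"), None, my_id=1, output_fifo_free=False)` keeps the message on the ring for another trip.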


General Interconnect

Tilera, Knights Landing…

What’s In A Router?

• It’s a system as well
  – Logic: state machines, arbiters, allocators
    • Control data movement through the router
    • Idle, Routing, Waiting for resources, Active
  – Memory: buffers
    • Store flits before forwarding them
    • SRAMs, registers, processor memory
  – Communication: switches
    • Transfer flits from input to output ports
    • Crossbars, multiple crossbars, fully-connected, bus

Virtual-channel Router

[Figure: virtual-channel router block diagram; label: Flit]

Router Pipeline vs. Processor Pipeline

Router pipeline:

• Logical stages: BW, RC, VA, SA, BR, ST, LT (buffer write, route computation, virtual-channel allocation, switch allocation, buffer read, switch traversal, link traversal)

• Different flits go through different stages

• Different routers have different variants (e.g., speculation, lookaheads, bypassing)

• Different implementations of each pipeline stage

Processor pipeline:

• Logical stages: IF, ID, EX, MEM, WB

• Different instructions go through different stages

• Different processors have different variants (e.g., speculation, ISA)

• Different implementations of each pipeline stage

Baseline Router Pipeline

• Route computation performed once per packet

• Virtual channel allocated once per packet

• Body and tail flits inherit this info from head flit
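A minimal sketch of the timing these bullets imply, under assumptions the slide does not state: one stage per cycle, no contention, flits arriving one per cycle, and each body or tail flit doing switch allocation one cycle after the flit ahead of it. The function name `baseline_schedule` is hypothetical.

```python
def baseline_schedule(num_flits):
    """Return, per flit, a dict mapping stage name -> cycle number.

    Flit 0 is the head; it alone performs RC and VA. Later flits reuse the
    head's route and virtual channel, so they only do BW, SA, ST, LT.
    """
    schedule = []
    prev_sa = None
    for i in range(num_flits):
        arrive = i + 1  # assumed: one flit arrives per cycle, starting at cycle 1
        if i == 0:
            stages = ["BW", "RC", "VA", "SA", "ST", "LT"]
            timing = {s: arrive + k for k, s in enumerate(stages)}
        else:
            # SA must follow this flit's own BW and the previous flit's SA.
            sa = max(arrive + 1, prev_sa + 1)
            timing = {"BW": arrive, "SA": sa, "ST": sa + 1, "LT": sa + 2}
        prev_sa = timing["SA"]
        schedule.append(timing)
    return schedule


if __name__ == "__main__":
    for name, t in zip(["Head", "Body 1", "Body 2", "Tail"], baseline_schedule(4)):
        print(name, t)
```

Under these assumptions the head flit finishes link traversal in cycle 6 and each following flit finishes one cycle later.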

[Figure: baseline router pipeline, by flit]

Head:   BW RC VA SA ST LT
Body 1: BW SA ST LT
Body 2: BW SA ST LT
Tail:   BW SA ST LT

Allocators In Routers

• VC Allocator
  – Input VCs request a range of output VCs
  – Example: A packet in VC0 arrives at the East input port. It is destined for the West output port and would like to get any of the VCs of that output port.

• Switch Allocator
  – Input VCs of an input port request different output ports (e.g., one is going North, another is going West)

• “Greedy” algorithms used for efficiency

• What happens if allocation fails on a given cycle?
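One way to picture a greedy switch allocator is a single pass over the requests in priority order, granting each one whose input and output ports are both still free. This is an illustrative sketch only; the function name, the request tuples, and the policy that losers simply retry in a later cycle are assumptions, not the allocator used in any particular router.

```python
def greedy_switch_allocate(requests):
    """Grant requests greedily, in the order given.

    requests: list of (input_port, input_vc, output_port) tuples, assumed to
    be already sorted by priority. Each input port and each output port can
    be used by at most one granted request per cycle.
    Returns (grants, losers); losers would stall and retry later.
    """
    busy_inputs, busy_outputs = set(), set()
    grants, losers = [], []
    for request in requests:
        input_port, _vc, output_port = request
        if input_port not in busy_inputs and output_port not in busy_outputs:
            grants.append(request)
            busy_inputs.add(input_port)
            busy_outputs.add(output_port)
        else:
            losers.append(request)
    return grants, losers
```

With requests `[("E", 0, "W"), ("E", 1, "N"), ("S", 0, "W"), ("N", 0, "S")]` the greedy pass grants two requests, although three could be served at once (East VC1 to North, South to West, North to South). That gap is the price of the simpler, faster algorithm.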


VC & Switch Allocation Stalls


Pipeline Optimizations: Lookahead Routing [Galles, SGI Spider Chip]

• At current router, perform route computation for next router

– Head flit already carries output port for next router

– RC only has to read the precomputed output port, which is fast and can be overlapped with BW

– Precomputing the route allows flits to compete for VCs immediately after BW

– Route computation for the next hop (NRC) can be done in parallel with VA

• Or simplify RC (e.g., X-Y routing is very fast)
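To make the two options concrete, here is a small sketch of X-Y route computation on a 2-D mesh and of the lookahead variant, where the current router computes the port the flit will need at the next router. The coordinate convention (North is +y), the port names, and the function names are assumptions made for illustration.

```python
# Assumed convention: node = (x, y); East is +x, North is +y.
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1), "LOCAL": (0, 0)}


def xy_output_port(cur, dst):
    """X-Y routing: correct the X offset first, then the Y offset."""
    (x, y), (dx, dy) = cur, dst
    if dx > x:
        return "E"
    if dx < x:
        return "W"
    if dy > y:
        return "N"
    if dy < y:
        return "S"
    return "LOCAL"  # arrived: eject to the local node


def lookahead_route(cur, out_port, dst):
    """NRC: given the port the head flit takes at `cur` (carried in the flit),
    return the next router and the output port the flit will need there."""
    sx, sy = STEP[out_port]
    nxt = (cur[0] + sx, cur[1] + sy)
    return nxt, xy_output_port(nxt, dst)


if __name__ == "__main__":
    node, port, dst = (0, 0), xy_output_port((0, 0), (2, 1)), (2, 1)
    while port != "LOCAL":
        nxt, nxt_port = lookahead_route(node, port, dst)
        print(f"at {node}: leave via {port}, next hop {nxt} will use {nxt_port}")
        node, port = nxt, nxt_port
```

Each router only looks up `out_port` from the flit and computes the following hop's port off the critical path.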

[Figure: pipeline with lookahead routing: BW/RC, VA/NRC, SA, ST, LT]

Pipeline Optimizations: Speculative Switch Allocation [Peh & Dally, 2001]

• Assume that the virtual-channel allocation stage will be successful
  – Valid under low to moderate loads

• If both are successful, VA and SA are done in parallel

• If VA is unsuccessful (no virtual channel returned)
  – Must repeat VA/SA in the next cycle

• Prioritize non-speculative SA requests
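The interaction between speculative and non-speculative requests can be sketched as follows. This is a simplified model, not the allocator from the cited paper: the function name, the `(flit_id, input_port, output_port)` tuples, and the `va_ok` set (flits whose VC allocation succeeded in the same cycle) are all hypothetical.

```python
def speculative_sa(nonspec, spec, va_ok):
    """One cycle of switch allocation with speculation (simplified model).

    nonspec: requests from flits that already hold an output VC.
    spec:    requests from head flits doing VA in this same cycle.
    va_ok:   ids of speculative flits whose VA succeeded this cycle.
    Each request is (flit_id, input_port, output_port).
    Returns (proceed, retry): ids that traverse the switch next cycle, and
    ids that must try again.
    """
    busy_in, busy_out = set(), set()
    proceed, retry = [], []

    # Non-speculative requests are served first, so speculation never
    # takes a switch slot away from a flit that is certain to use it.
    for flit, inp, out in nonspec:
        if inp not in busy_in and out not in busy_out:
            busy_in.add(inp)
            busy_out.add(out)
            proceed.append(flit)
        else:
            retry.append(flit)

    for flit, inp, out in spec:
        if inp not in busy_in and out not in busy_out:
            busy_in.add(inp)
            busy_out.add(out)
            if flit in va_ok:
                proceed.append(flit)   # speculation paid off
            else:
                retry.append(flit)     # switch slot granted but wasted
        else:
            retry.append(flit)

    return proceed, retry
```

A speculative grant is only useful when VA also succeeds; otherwise the slot goes unused and the flit repeats VA/SA in the next cycle.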

[Figure: pipeline with speculative switch allocation: BW/RC, VA/SA, ST, LT]

Routing


Properties of Routing Algorithms

• Deterministic/Oblivious
  – Route determined by (source, dest), not intermediate state (i.e., traffic)

• Adaptive
  – Route influenced by traffic along the way

• Minimal
  – Only selects shortest paths

• Deadlock-free
  – No traffic pattern can lead to a situation where no packets move forward

Network Deadlock

• Flow A holds u and v but cannot make progress until it acquires channel w

• Flow B holds channels w and x but cannot make progress until it acquires channel u
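The circular wait described above can be checked mechanically by following "waits for" links between flows. The sketch below uses hypothetical names (`find_deadlock`, `holds`, `wants`) and models only exclusive ownership of channels.

```python
def find_deadlock(holds, wants):
    """Return a list of flows forming a circular wait, or None.

    holds: dict flow -> list of channels the flow currently owns.
    wants: dict flow -> the single channel the flow needs next.
    """
    owner = {ch: flow for flow, channels in holds.items() for ch in channels}
    # Each flow waits for whoever owns the channel it wants (None if free).
    waits_for = {flow: owner.get(ch) for flow, ch in wants.items()}

    for start in waits_for:
        seen = []
        flow = start
        while flow is not None and flow not in seen:
            seen.append(flow)
            flow = waits_for.get(flow)
        if flow == start:  # the chain came back to where it started
            return seen
    return None


if __name__ == "__main__":
    holds = {"A": ["u", "v"], "B": ["w", "x"]}
    wants = {"A": "w", "B": "u"}
    print(find_deadlock(holds, wants))
```

With the two flows from the slide, A waits for B (which owns w) and B waits for A (which owns u), so neither can ever move.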

[Figure: four nodes (0 to 3) connected by channels u, v, w, x, with flows A and B]

Dimension-Order Routing

[Figure: XY-order and YX-order routes for flows A, B, and C (sources SA, SB, SC; destinations DA, DB, DC)]

XY-order: uses 2 out of 4 turns. YX-order: uses 2 out of 4 turns.

XY is deadlock free, YX is deadlock free, what about XY+YX?


DOR – Turns allowed


• One way of looking at whether a routing algorithm is deadlock-free is to look at the turns allowed

• Deadlocks may occur if turns can form a cycle

[Figure: turns allowed in the XY model and in the YX model]

Allowing more turns


• Allowing more turns may allow adaptive routing, but also deadlock

[Figure: six-turn model]

Turn Model [Glass and Ni, 1994]


• A systematic way of generating deadlock-free routes with a small number of prohibited turns

• Deadlock-free if routes conform to at least ONE of the turn models (acyclic channel dependence graph)

[Figure: West-First turn model and North-Last turn model]

2-D Mesh and CDG

• Can create a channel dependency graph (CDG) of the network

• Vertices in the CDG represent network links

• Disallowing 180° turns, e.g., AB → BA

[Figure: 2-D mesh with nodes A to F and its CDG]

Cycles in CDG

The channel dependency graph D derived from the network topology may contain many cycles

[Figure: 2-D mesh with nodes A to F and its CDG]

• Flow routed through links AB, BE, EF

• Flow routed through links EF, FA, AB

• Deadlock!

Key Insight

If routes of flows conform to an acyclic CDG, then there will be no possibility of deadlock!

[Figure: 2-D mesh with nodes A to F and its CDG]

• Disallow/delete certain edges in the CDG

• Edges in the CDG correspond to turns in the network!
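The link between turns and CDG cycles can be checked on a small mesh. The sketch below builds the CDG of a 2-D mesh for a given set of allowed turns and tests it for cycles. The mesh size, the direction names, the turn sets, and the function names are illustrative assumptions; 180° turns are always excluded.

```python
DIRS = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}

# A turn (a, b) means: a packet travelling in direction a continues in b.
XY_TURNS = {(a, b) for a in "EW" for b in "NS"}    # X first, then Y
YX_TURNS = {(a, b) for a in "NS" for b in "EW"}    # Y first, then X
ALL_TURNS = XY_TURNS | YX_TURNS                    # all eight 90-degree turns
WEST_FIRST = ALL_TURNS - {("N", "W"), ("S", "W")}  # no turns into West
NORTH_LAST = ALL_TURNS - {("N", "E"), ("N", "W")}  # no turns out of North


def build_cdg(width, height, allowed_turns):
    """Vertices are channels ((x, y), direction); an edge c1 -> c2 means a
    packet holding c1 may request c2 next (going straight or an allowed turn)."""
    channels = [((x, y), d)
                for x in range(width) for y in range(height)
                for d, (dx, dy) in DIRS.items()
                if 0 <= x + dx < width and 0 <= y + dy < height]
    cdg = {c: [] for c in channels}
    for (x, y), d in channels:
        next_node = (x + DIRS[d][0], y + DIRS[d][1])
        for d2 in DIRS:
            nxt = (next_node, d2)
            if nxt in cdg and (d2 == d or (d, d2) in allowed_turns):
                cdg[((x, y), d)].append(nxt)
    return cdg


def has_cycle(graph):
    """Depth-first search with three colours; a grey successor is a back edge."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {v: WHITE for v in graph}

    def visit(v):
        colour[v] = GREY
        for w in graph[v]:
            if colour[w] == GREY or (colour[w] == WHITE and visit(w)):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in graph)


if __name__ == "__main__":
    for name, turns in [("XY", XY_TURNS), ("YX", YX_TURNS),
                        ("XY+YX", ALL_TURNS), ("west-first", WEST_FIRST),
                        ("north-last", NORTH_LAST)]:
        print(name, "cyclic" if has_cycle(build_cdg(3, 3, turns)) else "acyclic")
```

On a 3x3 mesh, XY alone and YX alone give acyclic CDGs, while letting packets use either order permits all eight turns and the CDG becomes cyclic. West-first and north-last each drop only two turns and stay acyclic.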


Acyclic CDG → Deadlock-free routes

• Turns could be prohibited ad hoc; all the edges shown in red are deleted

[Figure: ad hoc acyclic CDG for the mesh with nodes A to F]

West-first → Deadlock-free routes

• Per the West-First prohibited turns, all the edges shown in red are deleted

[Figure: West-First acyclic CDG for the mesh with nodes A to F]

Resource Conflicts → Deadlock

[Figure: four nodes (0 to 3) connected by channels u, v, w, x, with flows A and B]

Routing deadlocks in wormhole routing result from structural hazards at router resources, e.g., buffers.

How can structural hazards be avoided?


Virtual Channels

• Virtual channels can be used to avoid deadlock by restricting VC allocation

[Figure: 2-D mesh with nodes A to F]

CDG and Virtual Channels

[Figure: CDG for nodes A, B, E, F with two virtual channels per link: AF0/AF1, FE0/FE1, EB0/EB1, BA0/BA1, FA0/FA1, EF0/EF1, BE0/BE1, AB0/AB1]
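One common way to restrict VC allocation is a dateline: packets start on VC 0 and move to VC 1 when they cross a chosen link. The sketch below applies that idea to the clockwise links AB, BE, EF, FA. It is an assumed scheme for illustration and not necessarily the restriction drawn in the figure; the names `ring_cdg` and `is_acyclic` are hypothetical, and packets are assumed to travel fewer hops than the ring has links.

```python
RING = ["AB", "BE", "EF", "FA"]  # clockwise links of a four-node ring


def ring_cdg(num_vcs, dateline=None):
    """CDG whose vertices are (link, vc) pairs.

    dateline=None: a packet may take any VC on the next link (unrestricted).
    dateline=link: packets stay on their VC, except that a VC 0 packet leaving
    the dateline link continues on VC 1. VC 1 packets never continue past the
    dateline link, because they have already crossed it once.
    The dateline scheme needs num_vcs >= 2.
    """
    cdg = {(link, vc): [] for link in RING for vc in range(num_vcs)}
    for i, link in enumerate(RING):
        nxt = RING[(i + 1) % len(RING)]
        for vc in range(num_vcs):
            if dateline is None:
                cdg[(link, vc)] += [(nxt, v) for v in range(num_vcs)]
            elif link != dateline:
                cdg[(link, vc)].append((nxt, vc))
            elif vc == 0:
                cdg[(link, vc)].append((nxt, 1))
    return cdg


def is_acyclic(graph):
    """Kahn's algorithm: a graph is acyclic iff every vertex can be removed
    in topological order."""
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1
    ready = [v for v in graph if indegree[v] == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for w in graph[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)
    return removed == len(graph)
```

With one VC per link, or with two VCs and no restriction, the dependencies form a cycle. With the dateline on FA they form a single chain from (AB, 0) to (FA, 1), so `is_acyclic(ring_cdg(2, dateline="FA"))` holds.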


Randomized Routing: Valiant

• Route each packet through a randomly chosen intermediate node

[Figure: mesh with sources SA, SB, SC, destinations DA, DB, DC, and intermediate node IA]

A packet going from node SA to node DA is first routed from SA to a randomly chosen intermediate node IA, before going from IA to its final destination DA.

It helps load-balance the network and has good worst-case performance, at the expense of locality.


ROMM: Randomized, Oblivious Multi-phase Minimal Routing

[Figure: mesh with sources SA, SB, SC and destinations DA, DB, DC]

To retain locality, choose the intermediate node in the minimal quadrant (the rectangle spanned by source and destination)

Equivalent to randomly selecting among the various minimal paths from source to destination
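Valiant routing and this two-phase form of ROMM differ only in where the intermediate node may lie, which a short sketch can show. The assumptions here are a 2-D mesh with `(x, y)` coordinates, X-Y routing within each phase, and hypothetical function names; the slides do not specify the per-phase routing.

```python
import random


def xy_path(src, dst):
    """Nodes visited by X-Y dimension-order routing, endpoints included."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def pick_intermediate(src, dst, width, height, rng, minimal):
    """Valiant: any node in the mesh. ROMM (minimal=True): only nodes inside
    the rectangle spanned by src and dst (the minimal quadrant)."""
    if minimal:
        xs = range(min(src[0], dst[0]), max(src[0], dst[0]) + 1)
        ys = range(min(src[1], dst[1]), max(src[1], dst[1]) + 1)
    else:
        xs, ys = range(width), range(height)
    return (rng.choice(xs), rng.choice(ys))


def two_phase_route(src, dst, width, height, rng, minimal=False):
    """Route src -> intermediate -> dst, using X-Y routing in both phases."""
    mid = pick_intermediate(src, dst, width, height, rng, minimal)
    return xy_path(src, mid) + xy_path(mid, dst)[1:]


if __name__ == "__main__":
    rng = random.Random(0)
    print("Valiant:", two_phase_route((0, 0), (2, 1), 4, 4, rng))
    print("ROMM:   ", two_phase_route((0, 0), (2, 1), 4, 4, rng, minimal=True))
```

Because the ROMM intermediate node stays inside the minimal quadrant, both phases move toward the destination and the hop count always equals the Manhattan distance. A Valiant route can be longer, which is the locality it gives up in exchange for load balance.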


Thank you!

Next Lecture: VLIW
