Arteris network on chip: The growing cost of wires

November 9-11, 2010, The Santa Clara Convention Center

www.armtechcon.com

P&R congestion is the focus of EDA

"...upstream tools need to be clairvoyant deep into the layout."

"The worst crises are when you're deep into the layout and realize that my floorplan's no good. So how do you avoid that? Well what's needed are clairvoyant tools. That is a chain of steps where each step already knows a little bit about the changes downstream."

"The synthesizer can, this year, avoid congestion; and congestion is really the killer of schedules."

-Aart de Geus, Synopsys Symposium 2010


Interconnect

Interconnects logically

The interconnect transports AXI transactions between masters and slaves. The means of transportation are not defined by the AXI spec.

[Figure: five masters and five slaves connected to the interconnect through AXI interfaces]


Interconnect physically

The interconnect lives in the hallways between IP cores. The width of the links affects the compactness of the die.


1. Growing interface complexity

Data width 32 64 128

AHB signals 113 177 305

AXI signals 204 272 408

[Figure: wires per signal group for AHB and AXI]

AHB
Address: 32 wires
Write data: data width
Read data: data width
Control: a few

AXI
Write address: 32 wires, plus a few control
Write data: data width, plus a few control
Read data: data width, plus a few control
Read address: 32 wires, plus a few control
Write response: a few
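The signal counts in the table above grow as roughly twice the data width plus a fixed overhead. A minimal sketch in Python, assuming fixed overheads fitted to the table (49 for AHB, 136 for AXI) and one write strobe bit per data byte for AXI. The constants are fitted for illustration only and are not taken from the AMBA specifications; the function names are hypothetical.

```python
# Signal counts from the table above, keyed by data width in bits.
AHB_SIGNALS = {32: 113, 64: 177, 128: 305}
AXI_SIGNALS = {32: 204, 64: 272, 128: 408}

# Fixed overheads fitted to the table (assumptions, not spec values).
AHB_FIXED = 49
AXI_FIXED = 136


def ahb_signal_estimate(data_width: int) -> int:
    """Write data + read data, plus a fixed address/control overhead."""
    return 2 * data_width + AHB_FIXED


def axi_signal_estimate(data_width: int) -> int:
    """Write data + read data + one strobe bit per byte, plus fixed overhead."""
    return 2 * data_width + data_width // 8 + AXI_FIXED


if __name__ == "__main__":
    for width in (32, 64, 128):
        print(width, ahb_signal_estimate(width), axi_signal_estimate(width))
```

Under these assumptions the estimates reproduce every entry in the table, which suggests that the data buses dominate the wire count as the data width grows.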


2. More interfaces each year

[Chart: cores and connections; vertical axis 0 to 70, horizontal axis 1 to 10]


3. Relative wire cost growing

Transistor sizes shrink faster than wire widths.

286 CPU (1982): 69 mm²

Atom N450 (2010): 66 mm²

Chips are, on average, the same size as ever.


The growing cost of wires

1. More wires per interface

2. More interfaces to connect

3. Relative wire cost growing


Packetizing AXI to transport transactions

[Figure, master side: the Read Address, Write Address, and Write Data channels (request from master) are packetized into a Request Packet; the Response Packet is depacketized into the Read Data and Write Response channels (response to master)]


Packetizing AXI to transport transactions

[Figure, slave side: the Request Packet is depacketized into the Read Address, Write Address, and Write Data channels (request to slave); the Read Data and Write Response channels are packetized into a Response Packet (response to master)]
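The request path in the two figures can be sketched in Python as follows. This is an illustration under stated assumptions, not the Arteris FlexNoC packet format: the `AxiRequest` and `RequestPacket` types, their fields, and the header layout are all hypothetical, and AXI attributes such as IDs and burst types are omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AxiRequest:
    """One simplified AXI request as seen at an interface (hypothetical)."""
    is_write: bool
    address: int
    data: List[int] = field(default_factory=list)  # write beats; empty for reads
    length: int = 1  # number of beats in the burst


@dataclass
class RequestPacket:
    """Hypothetical packet: one header followed by payload words."""
    header: dict
    payload: List[int]


def packetize(req: AxiRequest) -> RequestPacket:
    """Master side: fold the address channel into a header, write data into payload."""
    beats = len(req.data) if req.is_write else req.length
    header = {"write": req.is_write, "address": req.address, "beats": beats}
    return RequestPacket(header=header, payload=list(req.data))


def depacketize(pkt: RequestPacket) -> AxiRequest:
    """Slave side: rebuild the AXI request from the header and payload."""
    return AxiRequest(
        is_write=pkt.header["write"],
        address=pkt.header["address"],
        data=list(pkt.payload),
        length=pkt.header["beats"],
    )


if __name__ == "__main__":
    write = AxiRequest(is_write=True, address=0x1000, data=[1, 2, 3], length=3)
    print(depacketize(packetize(write)) == write)
```

The response path would mirror this: read data and the write response are packetized at the slave side and depacketized at the master side.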


Serializing

With a packetized protocol, serializing data simply requires a register and a mux.

Serializing packets is much easier than serializing the AXI interface protocol.
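As a rough software analogy of the register and mux, the sketch below slices one packet word into link-width pieces, one per cycle. The function names and the least-significant-slice-first ordering are assumptions made for illustration; real hardware would be described in RTL.

```python
from typing import Iterator, List


def serialize(word: int, word_width: int, link_width: int) -> Iterator[int]:
    """Yield link-width slices of a word, least significant slice first.

    `word` plays the role of the holding register; the loop index plays
    the role of the mux select.
    """
    mask = (1 << link_width) - 1
    cycles = -(-word_width // link_width)  # ceiling division
    for i in range(cycles):
        yield (word >> (i * link_width)) & mask


def deserialize(slices: List[int], link_width: int) -> int:
    """Reassemble the word from slices received in the same order."""
    word = 0
    for i, piece in enumerate(slices):
        word |= piece << (i * link_width)
    return word


if __name__ == "__main__":
    pieces = list(serialize(0xDEADBEEF, word_width=32, link_width=8))
    print([hex(p) for p in pieces], hex(deserialize(pieces, 8)))
```

Because a packet is already a self-describing sequence of words, the receiver only needs to count slices; no channel handshakes have to be split across cycles.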


Throughput and wires

[Figure, panels (a) to (d): header and data words transported over links of different widths]

Link width = data width + header width: header penalty = 0

Link width = header width: header penalty = 1 cycle per transaction

Link width < data width: header penalty > 1 cycle per transaction

Link width = data width: header penalty = 1 cycle per transaction
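The trade-off between link width and header penalty can be approximated with a simplified model. The sketch below assumes the header travels alongside the first data word when the link is wide enough, and otherwise occupies its own cycles; the 64 bit data width and 40 bit header width in the usage lines are hypothetical values, not figures from the slides.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def header_penalty(link_width: int, data_width: int, header_width: int) -> int:
    """Extra cycles per transaction spent sending the header (simplified model)."""
    if link_width >= data_width + header_width:
        return 0  # header rides alongside the first data word
    return ceil_div(header_width, link_width)


def cycles_per_transaction(link_width: int, data_width: int,
                           header_width: int, beats: int) -> int:
    """Header cycles plus the cycles needed to move every data beat."""
    data_cycles = beats * ceil_div(data_width, link_width)
    return header_penalty(link_width, data_width, header_width) + data_cycles


if __name__ == "__main__":
    DATA, HEADER = 64, 40  # hypothetical widths in bits
    for link in (DATA + HEADER, DATA, HEADER, 16):
        print(link, header_penalty(link, DATA, HEADER),
              cycles_per_transaction(link, DATA, HEADER, beats=4))
```

In this model a narrower link costs throughput twice: once in header cycles and once in the extra cycles needed per data beat.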


Selection of link width

[Figure: floorplan showing L2, DDR, and peripherals]

Place cores with high communication throughput and low latency requirements near each other. Use zero header penalty links between such cores.

Use narrow links for long paths to low throughput peripherals. This minimizes the number of long wires for P&R.
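The two guidelines above could be expressed as a simple selection rule. This is only a sketch: the function name, the boolean classification of a path, and the default narrow width of 16 bits are hypothetical choices, not values from the slides or from any tool.

```python
def choose_link_width(data_width: int, header_width: int,
                      high_throughput: bool, narrow_width: int = 16) -> int:
    """Pick a link width in bits for one path (illustrative heuristic).

    High throughput, low latency paths get a zero header penalty link.
    Long paths to low throughput peripherals get a narrow link.
    """
    if high_throughput:
        return data_width + header_width
    return min(narrow_width, data_width)


if __name__ == "__main__":
    print(choose_link_width(64, 40, high_throughput=True))   # e.g. CPU to L2
    print(choose_link_width(64, 40, high_throughput=False))  # e.g. a peripheral
```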


Experimental packetized link width

Data width 32 64 128

AHB signals 113 177 305

AXI signals 204 272 408

Packets with 0 penalty cycles 146 218 362

Packets with 1 penalty cycle 84 156 300

Results obtained with the Arteris FlexNoC packetized interconnect generator.

[Chart: wire count (0 to 450) at 32, 64, and 128 bit data widths for AXI, AHB, Arteris w/o header latency, and Arteris w/ header latency]

Wire savings: 32 bit = 59 %, 64 bit = 43 %, 128 bit = 26 %
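The wire savings percentages appear to compare packets with 1 penalty cycle against AXI. That reading is an assumption, but it can be checked against the table above with a few lines of Python (the function name is hypothetical):

```python
# Wire counts from the table above, keyed by data width in bits.
AXI_SIGNALS = {32: 204, 64: 272, 128: 408}
PACKETS_1_PENALTY = {32: 84, 64: 156, 128: 300}


def wire_savings_percent(baseline: int, packetized: int) -> int:
    """Percentage of wires saved relative to the baseline, rounded."""
    return round(100 * (baseline - packetized) / baseline)


if __name__ == "__main__":
    for width in (32, 64, 128):
        saved = wire_savings_percent(AXI_SIGNALS[width], PACKETS_1_PENALTY[width])
        print(f"{width} bit: {saved} %")
```

The relative savings shrink as the data width grows because the fixed control and address overhead that packetizing removes is a smaller share of a wide interface.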


Experimental place & route results

[Figure: place & route results, panels labeled Standard and NoC]


Summary

Routing congestion is the problem of the decade for chip implementation.

AXI is expensive in wires.

Packetizing and serializing transaction data effectively reduces routing congestion.


Clairvoyant IP

RTL → physical synthesis → P&R

serializing interconnect → fewer wires

physical synthesis → shorter wires

