Three-Dimensional Layout of On-Chip Tree-Based Networks
Hiroki Matsutani (Keio Univ, Japan)
Michihiro Koibuchi (NII, Japan)
D. Frank Hsu (Fordham Univ, USA)
Hideharu Amano (Keio Univ, Japan)
Outline
• Introduction
– Network-on-Chip (NoC)
– 2-D vs. 3-D
• Fat Tree
– 2-D layout
– 3-D layout
• Fat H-Tree [Matsutani, IPDPS’07]
– 2-D layout
– 3-D layout
• Evaluations
– Area, Wire length, Energy
Network-on-Chip (NoC)
• Tile architectures
– MIT RAW [Taylor, Micro’02]
– Texas U. TRIPS [Burger, Computer’04]
– Intel 80-tile NoC [Vangal, ISSCC’07]
• Various topologies
– Mesh, Torus
– Fat Trees
– Fat H-Tree (FHT)
Figure: 16-core tile architecture; each tile holds a core and a router, forming a packet-switched network on a chip
We proposed FHT as an alternative to Fat Trees [Matsutani, IPDPS’07]
2D Topologies: Mesh & Torus
• 2-D Mesh
• 2-D Torus
– 2x bandwidth of mesh
Figure legend: Router, Core
RAW [Taylor, IEEE Micro’02]
2D Topologies: Fat Tree
• Fat Tree (p, q, c)
– p: # of upward links
– q: # of downward links
– c: # of core ports
Figure: Fat Tree (2,4,2) and Fat Tree (2,4,1), with Rank-1 and Rank-2 routers (legend: Router, Core)
In this talk, we focus on the 3-D layout scheme of tree-based topologies
2D NoC vs. 3D NoC
• 2D NoCs
– Long wires (esp. trees)
– Wire delay
– Packets consume power at links according to their wire length
• 3D NoCs
– Several small wafers or dice are stacked
• Vertical link
– Micro bump [Ezaki, ISSCC’04]
– Through-wafer via [Burns, ISSCC’01]
– Very short (10-50 µm)
Long horizontal wires in 2D NoCs can be replaced by very short vertical links in 3D NoCs
The next slides show the 3D layout schemes of Fat Tree and FHT
Fat Tree: 2-D layout
• Fat Tree (p, q, c)
– p: # of upward links
– q: # of downward links
– c: # of core ports
Figure: 2-D layouts of Fat Tree (2,4,2) and Fat Tree (2,4,1) (legend: Router, Core)
We first show the 3D layout scheme of Fat Trees
Fat Tree: 3-D layout (4-split)
• Coordinate transformation: 2-D coordinates $(X_{2D}, Y_{2D})$ → 3-D coordinates $(X_{3D}, Y_{3D}, Z_{3D})$
• The original 2-D layout is divided into 4 layers (Layer-0 to Layer-3)
• Top-rank routers are distributed to each layer
• Top-rank links are replaced with vertical interconnects (10-50 µm)
Figure: original 2-D layout and the resulting 3-D layout (4-stacked)
This 3-D layout is evaluated in terms of area, wire length, and energy
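The slides do not spell out the coordinate transformation itself. As a minimal sketch, assuming the 4-split simply cuts the chip into four quadrants and stacks them, the mapping could look like the following. The function name `split_4` and the row-major layer numbering are hypothetical and not taken from the talk.

```python
def split_4(x2d, y2d, width):
    """Map a 2-D tile coordinate to (x3d, y3d, z3d) for a 4-layer stack.

    Assumption (not from the slides): the chip is width x width tiles,
    each quadrant becomes one layer, and layers are numbered row-major
    (upper-left quadrant = layer 0, lower-right quadrant = layer 3).
    """
    half = width // 2
    z3d = (y2d // half) * 2 + (x2d // half)  # which quadrant the tile is in
    return x2d % half, y2d % half, z3d       # position inside that quadrant


if __name__ == "__main__":
    # 64-core chip (8 x 8 tiles) split into four 16-core layers
    for tile in [(0, 0), (4, 0), (5, 6)]:
        print(tile, "->", split_4(*tile, width=8))
```

Under this assumption the four quadrant sub-trees sit directly above one another, so the links between them and the top-rank routers only need to cross layers.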
Fat H-Tree: Structure [Matsutani, IPDPS’07]
• Fat H-Tree
– Red Tree (H-Tree)
– Black Tree (H-Tree)
• Combining two H-Trees (red & black)
– The black tree is shifted toward the lower right of the red tree
– Because of this shift, the connection pattern of the trees differs from that of the original Fat Trees
– Each core is connected to both the red & black trees
– A ring is formed with cores & rank-1 routers
Figure: Fat H-Tree formed on the red & black trees (legend: Router, Core); rank-2 and upper routers are omitted
Torus-level performance by combining only two H-Trees
Fat H-Tree: 2-D layout on VLSI [Matsutani, IPDPS’07]
• Fat H-Tree
– Torus structure (long feedback links across the chip)
– Folded in the same way as the folded layout of a 2-D Torus
Figure: Fat H-Tree’s 2-D layout, topologically equivalent to the unfolded structure (legend: Router, Core)
The next slides propose the 3D layout scheme of Fat H-Tree
Fat H-Tree: 3-D layout (overview)
• Fat H-Tree
– (Problem) Fat H-Tree has a torus structure
– Folding so as to keep the torus structure consisting of red & black trees
(step 1) Fold it horizontally
(step 2) Fold it vertically
Repeat until the # of folded pieces equals the # of layers in the 3-D IC
E.g., four layers → fold twice
Here we show the 3D layouts of the red & black trees separately
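The two folding steps can be sketched as a coordinate mapping. This is only an illustration under stated assumptions: the function name `fold_4`, the mirror-on-fold rule, and the layer numbering (a physical paper-fold order) are hypothetical and not given in the slides.

```python
def fold_4(x, y, width):
    """Fold a width x width tile grid twice (horizontally, then vertically).

    Assumptions (not from the slides): a folded half is mirrored onto the
    half that stays in place, and layers are numbered in the order a sheet
    of paper would stack: 0 (kept), 1 (first fold), then 2 and 3 for the
    pieces brought on top by the second fold.
    """
    half = width // 2
    fx = x >= half                       # tile lies in the half moved by fold 1
    fy = y >= half                       # tile lies in the half moved by fold 2
    x3 = width - 1 - x if fx else x      # mirror across the vertical fold line
    y3 = width - 1 - y if fy else y      # mirror across the horizontal fold line
    layer = (2 if fx else 3) if fy else (1 if fx else 0)
    return x3, y3, layer


if __name__ == "__main__":
    # The four corner tiles of an 8 x 8 chip end up stacked at (0, 0),
    # so wrap-around neighbours differ only in their layer index.
    for corner in [(0, 0), (7, 0), (7, 7), (0, 7)]:
        print(corner, "->", fold_4(*corner, width=8))
```

With this mapping, tiles on opposite edges of the 2-D chip share the same in-layer position, which is why the torus wrap-around links can become short vertical links.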
Fat H-Tree: 3-D (Red tree; 4-split)
• Coordinate transformation: 2-D coordinates $(X_{2D}, Y_{2D})$ → 3-D coordinates $(X_{3D}, Y_{3D}, Z_{3D})$
• The original 2-D layout is divided into 4 layers (Layer-0 to Layer-3)
• Top-rank links are replaced with vertical interconnects (10-50 µm)
Figure: original 2-D layout and 3-D layout (4-stacked) of the red tree
Fat H-Tree: 3-D (Black tree; 4-split)
• Coordinate transformation: 2-D coordinates $(X_{2D}, Y_{2D})$ → 3-D coordinates $(X_{3D}, Y_{3D}, Z_{3D})$
• The original 2-D layout is divided into 4 layers (Layer-0 to Layer-3)
• The periphery cores are connected to different layers; they can be connected via only a vertical link
• Top-rank links are replaced with vertical interconnects (10-50 µm)
Figure: original 2-D layout and 3-D layout (4-stacked) of the black tree
Fat H-Tree: 3-D layout (4-split)
Figure: Red tree (3-D), Black tree (3-D), and Fat H-Tree (3-D), each shown at Layer-0
The 3-D layout of Fat H-Tree can be formed by superimposing the 3-D layouts of the red & black trees
Evaluations: 2-D vs. 3-D
• 2-D layout
– 64-core
– Chip side: L mm
• 3-D layout
– 16-core x 4-layer
– Vertical interconnects
– Layer side: L/2 mm
Network logic area: # of routers
N = $2^n \times 2^n$ cores

Topology   N=16   N=64   N=256   General N
FT1        6      28     120     $(4^n - 2^n)/2$
FT2        12     56     240     $4^n - 2^n$
FHT        10     42     170     $2(4^n - 1)/3$
3Dmesh     16     64     256     $N$
3Dtorus    16     64     256     $N$

• 3-D mesh/torus: node degree 7
• Fat H-Tree: node degree 5
• Fat Tree (2,4,2): node degree 6
# of routers & their ports in trees are less than in mesh/torus
FT1: Fat Tree(2,4,1) FT2: Fat Tree(2,4,2) FHT: Fat H-Tree
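As a quick cross-check, the router counts can be recomputed from closed forms in n, where N = 2^n x 2^n. The sketch below uses closed forms that reproduce the tabulated values for N = 16, 64, and 256; the function name `router_counts` is hypothetical.

```python
def router_counts(n):
    """Number of routers for N = 2**n x 2**n cores, per topology."""
    N = 4 ** n
    return {
        "FT1": (4 ** n - 2 ** n) // 2,   # Fat Tree (2,4,1)
        "FT2": 4 ** n - 2 ** n,          # Fat Tree (2,4,2)
        "FHT": 2 * (4 ** n - 1) // 3,    # Fat H-Tree: two H-Trees
        "3Dmesh": N,                     # one router per core
        "3Dtorus": N,                    # one router per core
    }


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(f"N={4 ** n}:", router_counts(n))
```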
Network logic area: 2-D vs. 3-D
• Network logic area
– Routers, NIs
– Inter-wafer vias
• Wormhole router [Matsutani, ASPDAC’08]
– 1-flit = 64-bit
– 3-stage pipeline
• Network interface
– FIFO buffer
– Packet forwarding (Fat H-Tree only)
• Inter-wafer via [Davis, DToC’05]
– 1-10 µm square
– 100 µm² per layer per 1-bit signal
Figure: typical wormhole router (arbiter, 5x5 XBAR, FIFOs)
Synthesized with a 90 nm CMOS process
Inter-wafer via area is calculated according to the # of vertical links
Network logic area: Overhead of 3D
Synthesis result of 64-core (16-core x 4)
Figure: synthesized network logic area, with labels for 2D torus, 3D torus, and inter-wafer via area (+7.8%)
FT1: Fat Tree(2,4,1) FT2: Fat Tree(2,4,2) FHT: Fat H-Tree
The area overhead of the 3D layout of trees is modest (at most 7.8%)
Total wire length of all links
• Total unit-length of links
– Core ↔ router
– Router ↔ router
• 1-unit = distance between neighboring cores
How many unit-links are required?
Total wire length of all links
N = $2^n \times 2^n$ cores; lengths in units (1-unit = distance between neighboring cores)

2-D layout
Topology   N=16   N=64   N=256   General N
2D FT1     32     192    1,024   $nN$
2D FT2     64     384    2,048   $2nN$
2D FHT     72     392    1,800   $8 + 8N\,\frac{2^{n-1}-1}{2^{n-1}}$
2Dmesh     24     112    480     $2(N - 2^n)$
2Dtorus    48     224    960     $4(N - 2^n)$

3-D layout (4-stacked)
Topology   N=16   N=64   N=256   General N
3D FT1     16     128    768     $(n-1)N$
3D FT2     32     256    1,536   $2(n-1)N$
3D FHT     40     200    904     $8 + 4N\,\frac{2^{n-1}-1}{2^{n-1}}$
3Dmesh     16     96     448     $2(N - 2^{n+1})$
3Dtorus    32     192    896     $4(N - 2^{n+1})$

FT1: Fat Tree(2,4,1) FT2: Fat Tree(2,4,2) FHT: Fat H-Tree
Wire length of trees is reduced by 25%-50% (close to torus)
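The wire-length figures and the quoted 25%-50% reduction can be recomputed with a short script. This is a sketch: the closed forms below are written so that they reproduce the tabulated values for N = 16, 64, and 256 (the FHT rows use 8 + 8N(2^(n-1) - 1)/2^(n-1) for 2-D and 8 + 4N(2^(n-1) - 1)/2^(n-1) for 3-D), and the function names are hypothetical.

```python
def wire_2d(n):
    """Total wire length (in units) of the 2-D layouts, N = 2**n x 2**n."""
    N, h = 4 ** n, 2 ** (n - 1)
    return {
        "FT1": n * N,
        "FT2": 2 * n * N,
        "FHT": 8 + 8 * N * (h - 1) // h,   # exact: N is a multiple of h
        "mesh": 2 * (N - 2 ** n),
        "torus": 4 * (N - 2 ** n),
    }


def wire_3d(n):
    """Total horizontal wire length (in units) of the 4-stacked 3-D layouts."""
    N, h = 4 ** n, 2 ** (n - 1)
    return {
        "FT1": (n - 1) * N,
        "FT2": 2 * (n - 1) * N,
        "FHT": 8 + 4 * N * (h - 1) // h,
        "mesh": 2 * (N - 2 ** (n + 1)),
        "torus": 4 * (N - 2 ** (n + 1)),
    }


def reduction(n, topo):
    """Fraction of wire length saved by moving from 2-D to 3-D."""
    return 1 - wire_3d(n)[topo] / wire_2d(n)[topo]


if __name__ == "__main__":
    for n in (2, 3, 4):
        saved = {t: f"{reduction(n, t):.0%}" for t in ("FT1", "FT2", "FHT")}
        print(f"N={4 ** n}:", saved)
```

The vertical interconnects (10-50 µm) are far shorter than one unit, which is why only horizontal lengths are counted here.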
Energy: NoC’s energy model
• Ave. flit energy
– Send 1-flit to dest.
– How much energy [J]?
$E_{flit} = w H_{ave} (E_{sw} + E_{link})$
(w: flit width in bits, $H_{ave}$: average hop count)
• Parameters
– 8 mm square chip
– 64-core (16-core x 4)
– 90 nm CMOS
• Switching energy ($E_{sw}$)
– 1-bit switching @ Router
– Gate-level sim
– 0.183 [pJ / hop]
• Link energy ($E_{link}$)
– 1-bit transfer @ Link
– 0.150 [pJ / mm]
• Via energy [Davis, DToC’05]
– 4.34 [fF / via]
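A minimal sketch of the flit-energy model, using the per-bit parameters quoted on this slide (64-bit flit, 0.183 pJ per hop, 0.150 pJ per mm). The function name, the average hop count, and the average link length in the usage lines are hypothetical inputs chosen only for illustration; via energy is not modeled here.

```python
def flit_energy_pj(hops_ave, link_mm_per_hop,
                   flit_bits=64, e_sw=0.183, e_link_per_mm=0.150):
    """Average energy (pJ) to send one flit: w * H_ave * (E_sw + E_link).

    e_sw is the 1-bit switching energy per hop (pJ); E_link is obtained by
    scaling the 1-bit link energy per mm by the link length of a hop.
    Assumption: every hop uses the same average link length.
    """
    e_link = e_link_per_mm * link_mm_per_hop
    return flit_bits * hops_ave * (e_sw + e_link)


if __name__ == "__main__":
    # Hypothetical comparison: same hop count, shorter links after 3-D stacking
    print("longer links :", flit_energy_pj(hops_ave=4.0, link_mm_per_hop=2.0), "pJ")
    print("shorter links:", flit_energy_pj(hops_ave=4.0, link_mm_per_hop=1.0), "pJ")
```

Because the switching term is fixed per hop, the saving from 3-D stacking in this model comes entirely from the shorter distance that each bit travels on links.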
Energy: Reduction by going 3D
Figure: flit energy of each topology, 2-D layout vs. 3-D layout
• 2-D layout: frequent use of the longest links
• Short hop count → less energy
• 3-D layout: moving distance of packets is reduced
FT1: Fat Tree(2,4,1) FT2: Fat Tree(2,4,2) FHT: Fat H-Tree
The 3D layout of trees reduces the energy by 30.8%-42.9%
Summary: 3-D layout of trees
• Drawbacks of on-chip tree-based topologies
– Long links around the root of the tree
– Wire delay problem
– Repeater insertion → additional energy consumption
• 3-D layout schemes of Fat Trees & Fat H-Tree
– Wire length is reduced by 25%-50%
– Area overhead is at most 7.8%
– Flit transmission energy is reduced by 30.8%-42.9%
– In addition, energy-hungry repeater buffers can be removed
Need to consider negative impacts of 3-D (cost, heat, yield, …)
Thank you for your attention
Backup slides
Energy: Reduction by going 3D
Figure: 2-D layout (w/o repeaters) vs. 2-D layout (with repeaters) (*)
• Energy is increased by repeater insertion
(*) Repeater insertion model: N. Weste et al., “CMOS VLSI Design (3rd ed)”, 2005.
FT1: Fat Tree(2,4,1) FT2: Fat Tree(2,4,2) FHT: Fat H-Tree