
Page 1:

Network Computing II (ネットワークコンピューティング論Ⅱ)

AY 2013 (Heisei 25), second semester, Tuesdays, period 2 (10:40-12:10)
Tsutomu Yoshinaga (UEC)

[email protected]

Page 2:

Contents

• We study interconnection networks for distributed and parallel computers, and message-routing techniques used on them.

• Course materials: http://comp.is.uec.ac.jp/yoshinagalab/yoshinaga/dp2.html

• http://ceng.usc.edu/smart/presentations/archives/AppendixE.ppt (253 slides, 13MB)

• http://booksite.mkp.com/9780123838728/references/appendix_f.pdf (118 pages, 2MB)

• TA: 重信 裕政 ([email protected])

Page 3:

References

• T. M. Pinkston and J. Duato: Interconnection Networks, Appendix E in Computer Architecture: A Quantitative Approach, 4th Edition, Morgan Kaufmann Publishers (2006); also in the 5th Edition (2011).
• J. Duato, S. Yalamanchili, and L. Ni: Interconnection Networks: An Engineering Approach, 2nd Edition, Morgan Kaufmann Publishers (2003).
• 富田眞治: 並列コンピュータ (Parallel Computers), 昭晃堂 (1996).
• W. J. Dally and B. Towles: Principles and Practices of Interconnection Networks, Morgan Kaufmann Publishers (2003).

Page 4:

What is an Interconnection Network?

• It is a programmable system that transports data between terminals, such as processors and memory.

• It is programmable in the sense that it makes different connections at different points in time.

• It is a system because it is composed of many components: buffers, channels, switches, and controls that work together to deliver data.

Page 5:

Interconnection Network (1/2)

[Figure: a multicomputer — each node pairs a processor (P) with local memory (M), and the nodes are connected by an interconnection network]

Page 6:

Interconnection Network (2/2)

[Figure: a UMA-type shared-memory multiprocessor — processors (P) on one side of the interconnection network, memory modules (M) on the other; also called a dance-hall architecture]

Page 7:

Trend

• Demand for interconnection-network performance grows with processor performance, at a rate of roughly 50% per year.

• Communication is a limiting factor in the performance of many modern systems.

• Buses have been unable to keep up with the bandwidth demand, and point-to-point interconnection networks are rapidly taking over.

Page 8:

Computer Classifications (%), share of the TOP500, June 2011 - June 2013

            2013/06   2012/06   2011/06
MPP            16.6      18.6      17.4
Cluster        83.4      81.4      82.2
Others          0.0       0.0       0.4

Source: http://www.top500.org/

Page 9:

Examples of clusters

Tianhe-2 (天河2号), China, 2013
  Processors:   Intel Xeon E5-2692 12C, 2.2 GHz ×2 ×16K
  Accelerator:  Xeon Phi 31S1P (57 cores) ×3 ×16K
  Interconnect: TH Express-2 (proprietary), fat tree

Tsubame 2.5, Tokyo Tech., 2013
  Processors:   Xeon X5670, 2.93 GHz ×2 ×1,408
  Accelerator:  NVIDIA Kepler K20X ×3 ×1,408
  Interconnect: InfiniBand QDR (40 Gbps) ×2, fat tree

Page 10:

Examples of MPPs

K computer @ RIKEN, Fujitsu, 2011
  Node:     SPARC64 VIIIfx, 2 GHz (16 GFlops × 8 cores)
  Topology: 6D mesh/torus (Tofu interconnect)
  Scale:    80K nodes × 8 cores = 640K cores, 10.51 PFlops (Rmax), 7,890 kW

Titan @ ORNL, Cray XK7, 2012
  Node:     AMD Opteron 16C, 2.2 GHz + NVIDIA K20X
  Topology: 3D torus (Gemini interconnect)
  Scale:    18,688 nodes (200 cabinets), 27.11 PFlops (Rpeak), 8,209 kW

Page 11:

Other Networks of Supercomputers

• Sequoia / IBM Blue Gene/Q (2011): 5D torus, IBM proprietary interconnect
• Pleiades / NASA (2011): partial 11D hypercube topology with IB QDR/DDR
• Red Sky / Sandia National Lab. (2010): 3D torus (12 bristled nodes) with IB QDR switches
• IBM Roadrunner (2009): fat tree with IB DDR
• Earth Simulator 2 / NEC SX-9E (2009): fat tree (64 GB/s per CPU, 8 CPUs/node, 160 nodes)
• IBM Blue Gene/L (2004): 3D torus, proprietary (64 × 32 × 32 = 64K nodes)

Page 12:

Architecture vs. software

             Memory                     Programming
UMA (SMP)    shared                     OpenMP
NUMA (MPP)   distributed (not shared)   MPI (Message Passing Interface)
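To make the two rows concrete, here is the same parallel sum sketched in both styles (illustrative code, not from the lecture; the loop bound and variable names are arbitrary). The OpenMP version relies on the shared address space of a UMA machine:

#include <stdio.h>
#include <omp.h>

int main(void) {
    long long sum = 0;
    /* Shared memory: every thread reads/writes the same variables;
       the reduction clause handles concurrent updates to sum. */
    #pragma omp parallel for reduction(+:sum)
    for (long long i = 1; i <= 1000000; i++)
        sum += i;
    printf("sum = %lld\n", sum);   /* 500000500000 */
    return 0;
}

The MPI counterpart assumes distributed, non-shared memory: each rank sums its own slice of the range, and the partial results are combined by an explicit reduction message.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Distributed memory: no rank can see another's "local";
       data moves only through messages (here, MPI_Reduce). */
    long long local = 0, total = 0;
    for (long long i = rank + 1; i <= 1000000; i += size)
        local += i;
    MPI_Reduce(&local, &total, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("sum = %lld\n", total);
    MPI_Finalize();
    return 0;
}

Build with gcc -fopenmp for the former, and with mpicc (running under mpirun) for the latter.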

Page 13:

Network Design (1/3)

• Performance: latency and throughput (bandwidth)
• Scalability: network, memory, and I/O bandwidth should scale with the number of processors
• Incremental expandability: expandable from a small configuration up to the maximum size
• Partitionability: the network may be partitioned among several users

Page 14:

Network Design (2/3)

• Simplicity: a simple design allows a higher clock frequency and is easy to use
• Distance span: a smaller physical span is preferred because of noise, cable delay, etc.
• Physical constraints: packaging (pin count), wiring (wire length), and maintenance (power consumption) must meet physical limitations

Page 15:

Network Design (3/3)

• Reliability: fault tolerance, reliable communication, hot swap
• Expected workload: robust performance over a wide range of traffic conditions
• Cost: trade-offs between cost and performance

Page 16:

Classification of Interconnection Networks

• Shared-Medium Networks
  – Local area networks (Ethernet, token ring)
  – Backplane bus (e.g., Sun Gigaplane)
• Direct Networks (router-based)
  – mesh, torus, hypercube, tree, etc.
• Indirect Networks (switch-based)
• Hybrid Networks

Page 17:

Shared-Medium Networks (LAN)

• Arbitration is needed to determine which station has mastership of the shared medium, resolving contending network accesses.

• The most well-known protocol is carrier-sense multiple access with collision detection (CSMA/CD).

• Token bus and token ring instead pass a token among stations; only the current token holder has the right to access the bus/ring, which avoids the nondeterministic waiting time of contention-based access.
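The nondeterministic waiting under CSMA/CD comes from its randomized retry policy. As a small illustration (a toy sketch, not part of the lecture material), the truncated binary exponential backoff of classic Ethernet can be written as:

#include <stdio.h>
#include <stdlib.h>

/* After the n-th collision a station waits a random number of slot
   times drawn uniformly from [0, 2^min(n,10) - 1], giving up after
   16 attempts -- so the delay before a successful transmission is
   unbounded in principle, unlike token-based access. */
static int backoff_slots(int collisions) {
    int k = collisions < 10 ? collisions : 10;  /* cap the window */
    return rand() % (1 << k);
}

int main(void) {
    srand(42);  /* fixed seed so the demo is reproducible */
    for (int n = 1; n <= 16; n++)
        printf("after collision %2d: wait %4d slot times\n",
               n, backoff_slots(n));
    return 0;
}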

Page 18:

Shared-Medium Networks (Backplane bus)

• It is commonly used to interconnect processor(s) and memory modules to provide an SMP (Symmetric MultiProcessor) architecture.

• It is realized by printed lines on a circuit board or by discrete wiring.

• Gigaplane in the Sun Enterprise x000 server (1996): 2.6 GB/s, 256-bit data, 42-bit address, 83.8 MHz clock.
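The quoted figures are mutually consistent (a quick check, assuming one 256-bit data transfer per clock): 256 bits = 32 bytes, and 32 B × 83.8 MHz ≈ 2.68 GB/s, which matches the quoted 2.6 GB/s peak bandwidth.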

Page 19:

Direct (static) Networks

• Consists of a set of nodes.
• Each node is directly connected to a subset of the other nodes in the network.
• Examples:
  – 2D mesh (Intel Paragon), 3D mesh (MIT J-Machine)
  – 2D torus (Fujitsu AP3000), 3D torus (Cray T3D, T3E)
  – Hypercube (CM-1, CM-2, nCUBE)

Page 20:

Mesh topology

[Figure: 2D mesh and 3D mesh; circles denote nodes]

Page 21:

Torus topology

[Figure: 2D torus (4-ary 2-cube) and 3D torus (3-ary 3-cube)]

Page 22:

Hypercube (binary n-cube)

[Figure: 4D hypercube (2-ary 4-cube)]
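The regular structure of the binary n-cube is easy to exploit in software. The sketch below (an illustration, not from the slides) shows dimension-order (e-cube) routing: a node's neighbors are exactly the IDs that differ from it in one bit, so a minimal route simply corrects the differing bits one dimension at a time, taking Hamming-distance many hops (at most log2 of the node count).

#include <stdio.h>

/* Dimension-order (e-cube) routing in a binary n-cube: repeatedly
   flip the lowest bit in which the current node differs from the
   destination. Each flip is one hop to a physical neighbor. */
static void route(unsigned src, unsigned dst) {
    unsigned cur = src;
    printf("%2u", cur);
    while (cur != dst) {
        unsigned diff = cur ^ dst;     /* bits still to correct       */
        cur ^= diff & -diff;           /* hop across lowest dimension */
        printf(" -> %2u", cur);
    }
    printf("\n");
}

int main(void) {
    /* 4D hypercube (16 nodes), as in the figure above. */
    route(0, 15);  /* 0 -> 1 -> 3 -> 7 -> 15: 4 hops, all dimensions */
    route(5, 6);   /* 5 -> 4 -> 6: Hamming distance 2 */
    return 0;
}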

Page 23:

Tree

[Figure: binary tree, fat tree, and X-tree]

Page 24:

Hierarchical topology (1/2)

[Figure: pyramid (hierarchical 2D mesh) and hierarchical ring]

Page 25:

Hierarchical topology (2/2)

[Figure: cube-connected cycles (CCC) and RDT (Recursive Diagonal Torus)]

Page 26:

Hypermesh (spanning-bus hypercube)

[Figure: hypermesh — the nodes along each dimension share a single bus or multiple buses]

Page 27:

Base-m n-cube (hyper-crossbar)

Base-8 3-cube (Toshiba Prodigy)

[Figure: a base-8 3-cube — nodes carry three base-8 digits (000 through 777), and the eight nodes along each dimension share an 8x8 crossbar]

Page 28:

Diameter and degrees (1/2)

            2D mesh    2D torus    3D torus      binary n-cube
#nodes      N          N           N             N = 2^n
Diameter    2√N        √N          (3/2)·∛N      log N
Degree      4          4           6             log N

Page 29:

Diameter and degrees (2/2)

            Base-m n-cube    CCC          Binary tree    Ring
#nodes      N = m^n          N = n·2^n    N              N
Diameter    log_m N          3n/2         2 log N        N/2
Degree      log_m N          3            3              2
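The closed forms in the two tables are easy to check numerically. The following sketch (mine, not from the slides) picks N = 4096, chosen because it is simultaneously 64^2, 16^3, 2^12, and 8^4, so every column applies; it prints the exact diameters next to the asymptotic table entries. Link with -lm.

#include <stdio.h>
#include <math.h>

int main(void) {
    double N  = 4096.0;              /* 64^2 = 16^3 = 2^12 = 8^4 */
    double s2 = sqrt(N);             /* 2D side: 64 */
    double s3 = rint(cbrt(N));       /* 3D side: 16 (rint guards rounding) */

    printf("N = %.0f nodes\n", N);
    printf("2D mesh:       diameter 2(sqrt(N)-1) = %.0f (~2*sqrt(N) = %.0f), degree 4\n",
           2.0 * (s2 - 1.0), 2.0 * s2);
    printf("2D torus:      diameter 2*floor(side/2) = %.0f (~sqrt(N)), degree 4\n",
           2.0 * floor(s2 / 2.0));
    printf("3D torus:      diameter 3*floor(side/2) = %.0f (~(3/2)*cbrt(N)), degree 6\n",
           3.0 * floor(s3 / 2.0));
    printf("binary n-cube: diameter = degree = log2(N) = %.0f\n", log2(N));
    printf("base-8 n-cube: diameter = degree = log8(N) = %.0f\n", log2(N) / 3.0);
    printf("ring:          diameter N/2 = %.0f, degree 2\n", N / 2.0);
    return 0;
}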