The Future of Parallel Computing
Special Purpose Mesh Architectures
Heiko Schröder, 1998
P A R C
SA ISA PIPS RM OH
Heiko Schröder, 1998 Slide 2
P A R C Contents
• Why meshes ???
• Application specific parallel mesh architectures
Fine grain 1983
Coarse grain 1997
- Systolic Arrays
- Instruction Systolic Arrays
- PIPS
- Reconfigurable mesh
- Optical Highway
Heiko Schröder, 1998 Slide 3
P A R C Physical limits
c = 300 000 km/sec
• $10^{12}$ OPS -- 0.3 mm/OP
• 1000 PEs with $10^{9}$ OPS -- 30 cm/OP
• massive parallelism
• distributed memory
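The distances on this slide follow from dividing the speed of light by the operation rate. A minimal Python sketch of that arithmetic, assuming the two rates are 10^12 and 10^9 operations per second (the function name is illustrative, not from the talk):

```python
C_KM_PER_S = 300_000  # speed of light as given on the slide

def distance_per_op_mm(ops_per_second: float) -> float:
    """Distance light travels during one operation, in millimetres."""
    metres_per_second = C_KM_PER_S * 1_000
    return metres_per_second / ops_per_second * 1_000

print(distance_per_op_mm(1e12))  # one processor at 10^12 OPS
print(distance_per_op_mm(1e9))   # each of 1000 PEs at 10^9 OPS
```

The second case reaches the same total rate with 1000 slower processing elements, which is the slide's argument for massive parallelism with distributed memory.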
Heiko Schröder, 1998 Slide 4
P A R C Processor power
[Figure: processor performance in MIPS (logarithmic axis, 0.01 to 1000) versus year (1970 to 1995) for the 4004, 8080, 80286, 80386, 80486, Pentium and Pentium 2.]
Heiko Schröder, 1998 Slide 5
P A R C
Scaling factor 2 (0.5 µ to 0.25 µ):
• 1/2 width
• 1/2 height
• 1/2 switching time
8 x performance!
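The factor of 8 can be read as the product of three gains. A small sketch of that reading, under the assumption that performance scales with transistors per unit area times switching speed (the function name is illustrative):

```python
def performance_gain(scale: float) -> float:
    """Gain when width, height and switching time all shrink by `scale`."""
    density_gain = scale * scale  # 1/scale width and 1/scale height per transistor
    speed_gain = scale            # 1/scale switching time
    return density_gain * speed_gain

print(performance_gain(2))
```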
Heiko Schröder, 1998 Slide 6
P A R C CMOS transistors
[Figure: size of minimal transistor (logarithmic axis, 0.01 to 10) versus year (1960 to 2030); annotated value: ca. 0.03.]
Heiko Schröder, 1998 Slide 7
P A R C Mesh/Torus
2D mesh: diameter $\sqrt{n}$, bisection width $\sqrt{n}$
Heiko Schröder, 1998 Slide 8
P A R C Hypercube
0-D: single node
1-D: 0, 1
2-D: 00, 01, 10, 11
3-D: 000, 001, 010, 011, 100, 101, 110, 111
4-D: two 3-D cubes, labelled 0 and 1
diameter log n, bisection width n
Heiko Schröder, 1998 Slide 9
P A R C VLSI
Very Large Scale Integration
• simple cells
• few types
• regular architecture
• short connections
mesh -- torus
Heiko Schröder, 1998 Slide 10
P A R C Pin limitations
16x16 pins
diameter 256
16 pins
diameter 16
16x12 pins
Heiko Schröder, 1998 Slide 11
P A R C Bisection width
Bisection width 256
25 cm
Bisection width 32K
32 m
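The figures 256 and 32K are what a 2D mesh and a hypercube give for the same number of processors, n = 256 x 256. A hedged Python sketch, using the exact textbook values (the slides quote orders of magnitude) and an assumed spacing of 1 mm per bisection wire, which is not stated in the talk but is consistent with 25 cm and 32 m:

```python
import math

def mesh_2d(n: int) -> tuple[int, int]:
    """(diameter, bisection width) of a sqrt(n) x sqrt(n) mesh without wrap-around."""
    side = math.isqrt(n)
    if side * side != n:
        raise ValueError("n must be a perfect square")
    return 2 * (side - 1), side

def hypercube(n: int) -> tuple[int, int]:
    """(diameter, bisection width) of a hypercube with n = 2^d nodes."""
    d = n.bit_length() - 1
    if 1 << d != n:
        raise ValueError("n must be a power of two")
    return d, n // 2

WIRE_PITCH_MM = 1.0  # assumed spacing per bisection wire, not from the slides

n = 256 * 256
for name, (diameter, bisection) in (("2D mesh", mesh_2d(n)), ("hypercube", hypercube(n))):
    width_m = bisection * WIRE_PITCH_MM / 1000
    print(f"{name}: diameter {diameter}, bisection width {bisection}, "
          f"about {width_m:.2f} m of wires side by side")
```

The hypercube buys its short diameter with a bisection that grows linearly in n, which is the wiring cost the slide illustrates.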
Heiko Schröder, 1998 Slide 12
P A R C Programming
• SA --- Systolic Array
• SIMD --- Single Instruction Multiple Data
• ISA --- Instruction Systolic Array
• MIMD --- Multiple Instruction Multiple Data
Heiko Schröder, 1998 Slide 13
P A R C parallel merge
initial situation: [Figure: the sequences x1 ... x18 and y1 ... y18 stored in the mesh, six elements per row]
1.) sort columns (odd-even-transposition sort)
2.) sort rows (odd-even-transposition sort)
sorted !!!!
Heiko Schröder, 1998 Slide 14
P A R C 0-1 principle
• The 0-1 principle states that if a sorter sorts all sequences of 0s and 1s correctly, then it is a correct sorter for arbitrary input.
• The sorter must be based on moving data.
[Figure labels] initially: 0s, 0s, 1s; after vertical sort: 0s, 1s; after horizontal sort: 0s, 1s
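The 0-1 principle makes the two-step merge easy to check by brute force. The sketch below assumes the layout of the 4 x 4 example on the systolic merge slides: the upper half holds one sequence in ascending row-major order and the lower half holds the other in descending row-major order. Python's `sorted()` stands in for the odd-even-transposition sort, and all function names are illustrative.

```python
from itertools import product

ROWS, COLS = 4, 4
HALF = ROWS * COLS // 2

def merge_on_mesh(grid):
    """Sort every column, then every row."""
    cols = [sorted(col) for col in zip(*grid)]
    after_columns = [list(row) for row in zip(*cols)]
    return [sorted(row) for row in after_columns]

def to_grid(values):
    return [values[r * COLS:(r + 1) * COLS] for r in range(ROWS)]

def all_zero_one_inputs():
    """Upper half ascending, lower half descending, both in row-major order."""
    for zeros_top, ones_bottom in product(range(HALF + 1), repeat=2):
        top = [0] * zeros_top + [1] * (HALF - zeros_top)
        bottom = [1] * ones_bottom + [0] * (HALF - ones_bottom)
        yield to_grid(top + bottom)

def is_sorted_row_major(grid):
    flat = [v for row in grid for v in row]
    return flat == sorted(flat)

ok = all(is_sorted_row_major(merge_on_mesh(g)) for g in all_zero_one_inputs())
print("all 0-1 inputs merged correctly:", ok)
```

Only 81 inputs need checking here, because a sorted 0-1 sequence of length 8 is determined by its number of zeros.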
Heiko Schröder, 1998 Slide 15
P A R C MIMD-mesh (clocked)
min max Time: 2n
Heiko Schröder, 1998 Slide 16
P A R C systolic merge
Animation frames (Slides 16 to 36). Each frame lists the 4 x 4 array row by row, with rows separated by "/"; consecutive frames showing the same state are merged.
Slides 16-20: 1 3 3 4 / 5 5 6 7 / 9 8 8 7 / 4 4 3 2
Slides 20-21: 1 3 3 4 / 5 5 6 7 / 4 4 3 2 / 9 8 8 7
Slides 22-23: 1 3 3 4 / 4 4 3 2 / 5 5 6 7 / 9 8 8 7
Slides 24-26: 1 3 3 2 / 4 4 3 4 / 5 5 6 7 / 9 8 8 7
Slides 27-28: 1 3 2 3 / 4 3 4 4 / 5 5 6 7 / 9 8 8 7
Slides 29-31: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 9 8 8 7
Slides 32-33: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 8 9 7 8
Slides 34-35: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 8 7 9 8
Slide 36: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 7 8 8 9
• sorted !!!
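The states after the column phase and after the row phase can be reproduced with a plain odd-even-transposition sort. This is a sequential sketch: it runs the compare-exchange phases one after another rather than pipelining them as a systolic array would, and the helper names are illustrative.

```python
def odd_even_transposition_sort(values):
    """Sort a list with alternating compare-exchange phases on neighbouring pairs."""
    values = list(values)
    n = len(values)
    for phase in range(n):
        # Even phases compare pairs (0,1), (2,3), ...; odd phases (1,2), (3,4), ...
        for i in range(phase % 2, n - 1, 2):
            if values[i] > values[i + 1]:
                values[i], values[i + 1] = values[i + 1], values[i]
    return values

def transpose(grid):
    return [list(line) for line in zip(*grid)]

def merge(grid):
    """Step 1: sort columns. Step 2: sort rows."""
    after_columns = transpose([odd_even_transposition_sort(c) for c in transpose(grid)])
    after_rows = [odd_even_transposition_sort(r) for r in after_columns]
    return after_columns, after_rows

start = [[1, 3, 3, 4], [5, 5, 6, 7], [9, 8, 8, 7], [4, 4, 3, 2]]
after_columns, after_rows = merge(start)
for row in after_columns:
    print(*row)
print()
for row in after_rows:
    print(*row)
```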
Heiko Schröder, 1998 Slide 37
P A R C Characteristics of SAs
• Extremely high cost-performance
• no flexibility -- long development time
Suitable for special signal processing tasks ???
Heiko Schröder, 1998 Slide 38
P A R C Systolic architectures I
1. Lang, H.W., Schimmler, M., Schmeck, H., Schröder, H., “A Fast Sorting Algorithm for VLSI”, Proc. 10th ICALP, Barcelona, July 1983, Lecture Notes in Computer Science, 154, pp408--419, 1983
2. Lang, H.W., Schimmler, H., Schröder, H., “Pattern Matching in Binary Trees on a Mesh-Connected Processor Array”, VLSI: Algorithms and Architectures, Bertolazzi and Luccio (eds.), North Holland, pp113--124, 1984
3. Schmeck, H., Schröder, H., “Dictionary Machines for Different Models of VLSI”, IEEE Transactions on Computers, C-34, pp472--475, 1985
4. Lang, H.W., Schimmler, M., Schmeck, H., Schröder, H., “Realistic Comparisons of Sorting Algorithms for VLSI”, Foundations of Data Organisation, Ghosh, Kambayashi and Tanaka (eds.), Plenum Press, pp309--318, 1987
5. Schröder, H., “VLSI-Sorting Evaluated under the Linear Model”, Journal of Complexity 4, pp 330-355, December 1988
6. Schmeck, H., Schröder, H., Starke, C., “Systolic s²-Way Merge Sort is Nearly Optimal”, in IEEE Transactions on Computers, Vol 38, Nr 7, pp 1052-1056, 1989
7. Murthy, V.K., Schröder, H., “Systolic Arrays for Parallel G-Inversion and Finding Petri Net Invariants”, in Parallel Computing, Vol. 11, Nr. 3, pp 349-359, 1989
8. E.V. Krishnamurthy, M. Kunde, M. Schimmler, H. Schröder, “Systolic algorithm for tensor products of matrices --- implementation and applications”, Parallel Computing 13, 301-308, 1989
Heiko Schröder, 1998 Slide 39
P A R C Systolic architectures II
9. Schimmler, M., Schröder, H., “A Simple Systolic Method to Find all Bridges on an undirected Graph”, Parallel Computing 12, 107-111, 1989
10. P. Lenders, H. Schröder, “A programmable systolic device for Image Processing --- based on Mathematical Morphology”, Parallel Computing, 13, pp 337-344, 1990
11. Schröder, H., Krishnamurthy, E.V., “Systolic Algorithms for Polynomial Interpolation and Related Problems”, Parallel Computing, 17, pp 493-504, 1991
12. Schröder, H., Krishnamurthy, E.V., “Systolic Algorithms for Multivariate Approximation Using Tensor Products of Basis Functions”, Parallel Computing, 17, pp 483-492, 1991
13. B. Beresford-Smith, J. Breckling, H. Schröder, “Systolic Codebook Generation”, Transactions on Acoustics, Speech, & Signal Processing, pp 144-149, Vol. 1, Nr. 2, April 1993
14. Schröder, H., “Partition Sorts for VLSI”, Proceedings GI-13, Jahrestagung, Hamburg, October 1983, Informatik-Fachberichte 73, pp101--116, 1983
15. Schröder, H., “VLSI-gerechte Sortierverfahren”, PARS-Workshop, Erlangen, April 1984, Mitteilungen GI-PARS Nr. 2, pp134--143, 1984
16. Schröder, H. “VLSI-Sorting under the Linear Model”, Proceedings of the Tenth Australian Computer Science Conference, pp330--340, Deakin University, February 1987
17. B. Beresford-Smith, J. Breckling, H. Schröder, “Systolic Devices for Speech Processing”, CompEuro 89, Hamburg, Germany, May 1989
Heiko Schröder, 1998 Slide 40
P A R C ISA merge
Instructions: C:=min{C, CE} and C:=max{C, CW}
Animation frames (Slides 40 to 70). Each frame lists the 4 x 4 array row by row, with rows separated by "/"; consecutive frames showing the same state are merged.
Slides 40-44: 1 3 3 4 / 5 5 6 7 / 9 8 8 7 / 4 4 3 2
Slides 45-46: 1 3 3 4 / 5 5 6 7 / 4 8 8 7 / 9 4 3 2
Slides 47-48: 1 3 3 4 / 4 5 6 7 / 5 4 8 7 / 9 8 3 2
Slides 49-50: 1 3 3 4 / 4 4 6 7 / 5 5 3 7 / 9 8 8 2
Slides 51-52: 1 3 3 4 / 4 4 3 7 / 5 5 6 2 / 9 8 8 7
Slides 53-54: 1 3 3 4 / 4 4 3 2 / 5 5 6 7 / 9 8 8 7
Slides 55-56: 1 3 3 2 / 4 4 3 4 / 5 5 6 7 / 9 8 8 7
Slides 57-58: 1 3 2 3 / 4 3 4 4 / 5 5 6 7 / 9 8 8 7
Slides 59-61: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 9 8 8 7
Slides 62-63: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 8 9 7 8
Slides 64-65: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 8 7 9 8
Slides 66-70: 1 2 3 3 / 3 4 4 4 / 5 5 6 7 / 7 8 8 9
Heiko Schröder, 1998 Slide 71
P A R C Hough transform on the ISA
• good line detection method
shear
Fast tomography
Heiko Schröder, 1998 Slide 72
P A R C robot vision
[Figure labels: projector, CCD, CCD]
• stereo vision
Heiko Schröder, 1998 Slide 73
P A R C Use of the ISA
Areas of application for ISA:
• automatic optical quality control
• real time signal processing
• computer graphics / visualization
• linear equations
• Cryptography --> Tele-medicine ?
Special features:
• fast aggregate functions (sum, carry)
• fast local communication
• no local memory
• typical improvement over PC: Factor 20-30
Heiko Schröder, 1998 Slide 74
P A R C Instruction Systolic Array
1. Schröder, H., “The Instruction Systolic Array --- A Tradeoff between Flexibility and Speed”, Computer Systems Science and Engineering, Vol 3 No 2, April 1988
2. Schröder, H., “Top-Down Designs of Instruction Systolic Arrays for Polynomial Interpolation and Evaluation”, in Journal of Parallel and Distributed Computing, Vol. 6, pp 692-703, 1989
3. Kunde, M., Lang, H.W., Schimmler, M., Schmeck, H., Schröder, H., “The Instruction Systolic Array and its Relation to other Models of Parallel Computers”, Parallel Computing, Vol 7, pp 25-39, 1988
4. Schröder, H., Krishnamurthy, E.V., “Instruction Systolic Array Computation of the Characteristic Polynomial of a Hessenberg Matrix”, Parallel Computing, 17, pp 273-278, 1991
5. H. Schröder, P. Strazdins, “Program compression on the ISA”, Parallel Computing 17, 207-219, 1991
6. B. Pham, H. Schröder, “An Instruction Systolic Device for Quadratic Surface Generation”, CompEuro 89, Hamburg, West Germany, May 1989
7. P. Lenders, H. Schröder, P. Strazdins, “Microprogramming Instruction Systolic Arrays”, MICRO 22, Dublin, August 1989
8. B. Schmidt, M. Schimmler, H. Schröder, “Morphological Hough Transform on the Instruction Systolic Array”, Euro-Par ’97 – Parallel Computing, LNCS 1300, Springer Verlag, pp. 798-806, 1997.
Heiko Schröder, 1998 Slide 75
P A R C PIPS (1990-94)
• 32x32 torus
• 16 bit parallel communication
• 16 bit add
• prefetch
[Figure labels: 1 M bit, 1 M bit, memory control]
BHP -- CSIRO -- NU -- ADFA 1.4 M
Heiko Schröder, 1998 Slide 76
P A R C
Special features:
• local memory
• SIMD-torus
• memory pre-fetch
Applications:
• visualization
• 3D-simulation (CFD, FEM)
Heiko Schröder, 1998 Slide 77
P A R C
Heiko Schröder, 1998 Slide 78
P A R C PIPS
1. A. Spray, K.T. Lie, H. Schröder, “Test Strategies Employed in a Massively Parallel Visualization Engine”, PRFTS, Melbourne, December 1993
2. A. Spray, H. Schröder, K.T. Lie, “A Low-Cost Machine for Real-Time Visualization”, 5th International Symposium on IC Technology, Singapore, 1993
3. H. Schröder et al, “PIPADS: A Vertically Integrated Parallel Image Processing and Display System”, 5th Australian Supercomputing Conference, Melbourne, 1992.
4. R. Lang, E. Plesner, H. Schröder, A. Spray, “An efficient systolic architecture for the one-dimensional wavelet transform”, in Wavelet Applications, Harold H. Szu, Editor, Proc. SPIE 2242, p925-935, 1994.
5. R. Lang, A. Spray, H. Schröder, “2D wavelet transform on a SIMD torus of scanline processors”, Australian Computer Science Communications, Vol 17, Nr 17, 271-277, 1995.
6. P. Bray, S.W. Chan, K.T. Lie, Meiyun, H. Schröder, A. Spray, “A torus for scan-line based image processing”, Presented at the Post-ISCA Special Purpose Architectures Workshop, May 1992.
7. A. Spray, H. Schröder, K.T. Lie, E. Plesner, P. Bray, “PIPADS --- A Low-Cost Real-Time Visualization Tool”, Supercomputing, Melbourne, 1993.
8. H. Schröder, A. Spray, “PIPS a massively parallel image processing system”, Workshop PARAGRAPH'94, Linz, March 1994.
9. H. Schröder, “3D-Visualisation based on Scan-line Image Processing”, invited speaker, Workshop MWTAI’97 in Missen-Wilhams, March 1997
Heiko Schröder, 1998 Slide 79
P A R C Use in industry ?
[Figure: performance [Gflops] (axis 500 to 3000) for Research and Industry, 1993 to 1996; data labels as extracted: 1327, 2121, 648, 3675, 126, 248, 693, 1168.]
Heiko Schröder, 1998 Slide 80
P A R C Investments
[Figure: investments into parallel computers [M$] (axis 0 to 3500) for Research and Industry, 1993 to 1996.]
Heiko Schröder, 1998 Slide 81
P A R C Concentration
[Figure: number of manufacturers (axis 10 to 60), 1993 to 1996; data labels as extracted: 11, 21, 19, 49.]
Heiko Schröder, 1998 Slide 82
P A R C Degree of Parallelism
[Figure: number of new systems (axis 0 to 450) per half year from May-93 to Nov-97, in the groups 1 to 63, 64 to 255, 256 to 1023, and 1024 and more.]
Heiko Schröder, 1998 Slide 83
P A R C Evaluation
Cost * computation time
• Parallel computers with standard components
• Embedded parallel systems
Heiko Schröder, 1998 Slide 84
P A R C reconfigurable mesh
reconfigurable mesh = mesh + interior connections
15 positions
low cost
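One common reading of "15 positions", assumed here rather than stated on the slide, is that each processor can connect its four ports (N, E, S, W) internally in any of the 15 ways of partitioning a four-element set into groups of mutually connected ports. A short sketch that enumerates them (the function name is illustrative):

```python
def set_partitions(items):
    """Yield every partition of `items` into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing group in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a group of its own.
        yield [[first]] + partition

configurations = list(set_partitions(["N", "E", "S", "W"]))
print(len(configurations))
for config in configurations:
    print(config)
```

Each printed configuration is one switch setting; for example `[['N', 'S'], ['E', 'W']]` passes vertical and horizontal buses straight through without joining them.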
Heiko Schröder, 1998 Slide 85
P A R C global OR and modulo 3
Global OR (“V”) of 1 0 0 0 0 1 0: log n on EREW-PRAM
Number of 1s modulo 3 in 1 0 1 1 1 0 (result: 1 mod 3): log n / log log n on CRCW-PRAM
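Both problems have well-known bus-based solutions on a reconfigurable mesh. The slide's figure is not recoverable, so the sketch below follows the standard textbook constructions as an assumption: a shared bus for the OR, and three parallel buses with a cyclic shift at every 1 for the count modulo 3. The loop only models where the signal travels; on the mesh the signal crosses the configured buses in a single bus cycle.

```python
def global_or(bits):
    """One bus shared by all PEs: every PE holding a 1 drives it, all PEs read it."""
    bus_driven = any(bit == 1 for bit in bits)
    return int(bus_driven)

def count_mod_3(bits):
    """Three parallel buses; a PE holding a 1 shifts the signal to the next bus."""
    bus = 0  # the signal is injected on bus 0 at the left border
    for bit in bits:
        if bit == 1:
            bus = (bus + 1) % 3  # diagonal connection instead of straight-through
    return bus  # index of the bus on which the signal leaves the array

print(global_or([1, 0, 0, 0, 0, 1, 0]))
print(count_mod_3([1, 0, 1, 1, 1, 0]))
```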
Heiko Schröder, 1998 Slide 86
P A R C sorting with all-to-all mapping
Sorting:
1.) sort blocks
2.) all-to-all (columns)
3.) sort blocks
4.) all-to-all (rows)
5.) o-e-sort blocks
Heiko Schröder, 1998 Slide 87
P A R C all-to-all mapping
[Figure: $n \times n$ mesh divided into blocks; label $n^{2/3}$]
Heiko Schröder, 1998 Slide 88
P A R C vertical all-to-all
Heiko Schröder, 1998 Slide 89
P A R C horizontal all-to-all
Heiko Schröder, 1998 Slide 90
P A R C
1 step
2 steps
3 steps
k/2 steps
3 steps
2 steps
1 step
$(k/2)^2$ steps
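The total follows from summing the step counts 1, 2, ..., k/2, ..., 2, 1. A quick numerical check of that identity, assuming k is even (the function name is illustrative):

```python
def all_to_all_steps(k: int) -> int:
    """Total of the step counts 1, 2, ..., k/2, ..., 2, 1 (k assumed even)."""
    half = k // 2
    rising = list(range(1, half + 1))      # 1, 2, ..., k/2
    falling = list(range(half - 1, 0, -1))  # k/2 - 1, ..., 2, 1
    return sum(rising + falling)

for k in (2, 4, 8, 16):
    print(k, all_to_all_steps(k), (k // 2) ** 2)
```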
Heiko Schröder, 1998 Slide 91
P A R C sorting in optimal time
$(k/2)^2$ steps, $k = n^{1/3}$
each step takes $n^{1/3}$ time --> $T = n/4$
x 2
x 2
/2
$T = n/2$ all-to-all
Sorting:
1.) sort blocks ($O(n^{2/3})$)
2.) all-to-all ($n/2$)
3.) sort blocks ($O(n^{2/3})$)
4.) all-to-all ($n/2$)
5.) sort blocks ($O(n^{2/3})$)
time: $n + o(n)$
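The time bound can be checked numerically by following the slide's factors. In this sketch the constant hidden in the O(n^(2/3)) block sort is set to 1 as an assumption, and the function names are illustrative:

```python
def all_to_all_time(n: float) -> float:
    """Time of one all-to-all mapping, following the slide's factors."""
    k = n ** (1 / 3)             # k = n^(1/3)
    steps = (k / 2) ** 2         # (k/2)^2 steps
    base = steps * n ** (1 / 3)  # each step takes n^(1/3) time, giving n/4
    return base * 2 * 2 / 2      # the slide's factors x 2, x 2, /2, giving n/2

def sorting_time(n: float, block_sort_constant: float = 1.0) -> float:
    """Three block sorts of O(n^(2/3)) plus two all-to-all mappings."""
    block_sort = block_sort_constant * n ** (2 / 3)  # constant is an assumption
    return 3 * block_sort + 2 * all_to_all_time(n)

for n in (10**3, 10**6, 10**9):
    print(n, sorting_time(n) / n)  # the ratio approaches 1 as n grows
```

The block sorts contribute only the lower-order term, so the two all-to-all mappings of n/2 each dominate and give n + o(n).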
Heiko Schröder, 1998 Slide 92
P A R C Reconfigurable mesh
Special features
• SIMD
• constant diameter
• faster than PRAM ?
Suitable applications
• routing/sorting/load balancing
• sparse matrix multiplication
• segmentation / component labeling
• feature extraction
• image database ?
Heiko Schröder, 1998 Slide 93
P A R C Reconfigurable mesh
1. Kapoor, H. Schröder, B. Beresford-Smith, “Connected Component Labelling on 3D Reconfigurable Mesh Architecture”, Parallel Processing, 1994
2. M. Kaufmann, H. Schröder, J. Sibeyn, “Routing and Sorting on Reconfigurable Meshes”, Parallel Processing Letters, Vol 5, Nr 1, pp 81-96, 1995
3. G. Turner, H. Schröder, “Token Distribution and Load Balancing on reconfigurable, d-D. Meshes”, Journal of Parallel Architectures and Algorithms, 1996
4. M. Middendorf, H. Schmeck, H. Schröder, G. Turner. “Multiplication of matrices with different sparseness properties on dynamically reconfigurable meshes”, to appear in VLSI Design.
5. Kapoor A., Schröder H., Beresford-Smith B., “Constant Time sorting on a Reconfigurable Mesh”, Australian Computer Science Conference, February 1993.
6. D. Yu, H. Schröder, “Parallel Border following, Connecting and Labeling for Binary Image with 8-neighborhood Coding on a Mesh with Bi-reconfigurable Buses”, 3rd International Workshop on Parallel Image Analysis, Maryland, 1994.
7. Kapoor, H. Schröder, B. Beresford-Smith, “Deterministic Permutation Routing on the Reconfigurable Mesh”, Proceedings 8th IPPS, pp 536-540, Mexico, 1994.
8. G. Turner, H. Schröder, “Fast Token Distribution on Reconfigurable Meshes”, in Proceedings of the 2nd Australasian Conference on Parallel and Real-Time Systems (PART), vol 2, pp 127-133, 1995
9. H. Schmeck, H. Schröder, G. Turner, “Efficient Sparse Matrix Multiplication on a Reconfigurable Mesh”, PARS-Workshop, Stuttgart, Germany, October 1995.
10. H. Schröder, “Mathematical Morphology for Robot Vision on a Reconfigurable Mesh Architecture”, ISCA Special Purpose Architectures Workshop, May 1992.
11. M. Kaufmann, H. Schröder, J. Sibeyn, “Asymptotically Optimal and Practical Routing on the Reconfigurable Mesh”, GI/ITG-Workshop “Architekturen für hochintegrierte Schaltungen”, Schloß Dagstuhl, July 1994.
Heiko Schröder, 1998 Slide 94
P A R C Optical Highway
#C = $10^{6}$
width W
processors P
$C = W \cdot P^{3}$: W=1; P=100 or W=100; P=22
All-to-all connection
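The two example pairs fit a relation of the form C = W · P^3 with C = 10^6. That reading of the garbled formula is an assumption, as is the function name below; the sketch only shows that both pairs are consistent with it.

```python
def processors_for(total: float, width: int) -> int:
    """P (rounded to the nearest integer) with width * P^3 = total."""
    return round((total / width) ** (1 / 3))

TOTAL = 10**6  # the value read as 10^6 on the slide
for width in (1, 100):
    print(f"W={width}: P={processors_for(TOTAL, width)}")
```

Under this reading, widening each connection by a factor of 100 reduces the number of processors that can be fully connected from 100 to about 22.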
Heiko Schröder, 1998 Slide 95
P A R C
Horizontal all-to-all
Vertical all-to-all
Heiko Schröder, 1998 Slide 96
P A R C
Features of optically connected meshes
• SIMD/SPMD/MIMD
• implement all major architectures
• all-to-all communication in 2 steps
• Bulk synchronous processing (BSP)
• no latency hiding
• no pin-limitation
Applications
• coarse grain parallel computing only?
• ray-tracing ????
Heiko Schröder, 1998 Slide 97
P A R C Optical Highway
1. H. Schröder et al, “RMB --- A Reconfigurable Multiple Bus Network”, HPCA 96, San Jose, 1996
2. H. Schröder, O. Sykora, I. Vrto, “Optical All-to-All Communication for some Product Graphs”, SOFSEM '97, Milovy, Czech Republic, 1997
Heiko Schröder, 1998 Slide 98
P A R C Bisection-width / Diameter
Hypercube: diameter log n, bisection width n
SA, ISA, PIPS: diameter $\sqrt{n}$, bisection width $\sqrt{n}$
RM: diameter 1, bisection width n
OH: diameter 1, bisection width n
Heiko Schröder, 1998 Slide 99
P A R C Suitable problems ?
Hypercube: diameter log n, bisection width n
SA: suitable applications?
ISA: 2D-problems, aggregate functions, local communication
PIPS: 3D-problems, local communication
RM: diameter-bound > bisection-width-bound
OH: PRAM equivalent?
[Figure labels: SA, ISA, PIPS; RM; OH; ??; ??]