1
Trends and options in parallel computing
© Heiko Schröder, 2003
2
need
[Figure: processor performance in MIPS (log scale, 0.01 to 1000) versus year (1970 to 1995) for the 4004, 8080, 80286, 80386, 80486, Pentium and Pentium 2]
trends
massively parallel computing
hybrid computing
reconfigurable mesh
optical highway
limitations
3
Understand the “world”
Create a model
Simulate
4
low cost!
5
10^10 light years (10^26 m) away
• This is the limit of the known universe.
We can either see nothing beyond this or there is nothing beyond this.
6
100k light years (10^21 m)
• Moving out from the plane of our own spiral galaxy, the Milky Way, we again encounter nothing.
• Any alien life existing at this distance would be unaware that, lost in the light arriving from Sol, our indistinguishable sun, it was witnessing the dawn of Homo sapiens.
7
1 Pm (10^15 m)
• The Sun (Sol) is now just another star in space. This distance represents the farthest reaches of the comets of our Solar System.
• Comets have extremely eccentric orbits.
• Perhaps the most famous comet, Halley’s, has been continuously observed since records began in 240 BC, returning to the Sun and passing through the vicinity of the Earth’s orbit every 74 to 78 years.
8
1 Gm (10^9 m)
• We are beyond the Moon’s orbit, and the Earth is now just a speck in a dark sky.
• The orbit of the Moon around the Earth is, of course, not visible but has been added to show its relative size.
• Automatic image analysis might find the meteor that is likely to hit our planet.
9
100,000 km (10^8 m)
• The Earth from the Moon as first witnessed by the crew of Apollo 8 (Lovell, Borman and Anders) on appearing from the far side of the Moon on their first lunar orbit, Dec 24 1968.
• Borman was so inspired by this view of “the Good Earth” from space that he read a sermon, to a listening world back on Earth, on Christmas Day.
10
10,000 km (10^7 m)
• A satellite view of the continent of North America, showing Earth’s turbulent weather patterns and the prominent hurricane off the West Indies.
• Weather satellites on near-polar orbits provide regular coverage of the Earth’s weather system.
• Communications satellites in distant, geostationary orbits provide a web of continuous cross-continental communications.
11
1,000 km (10^6 m)
• Chicago, its environs and Lake Michigan, as pictured by the astronauts of Skylab, launched May 14, 1973.
• Apart from the military use of space, global monitoring and reconnaissance have become a vital part of the technological age.
• Using the differing optical properties of water and vegetation, Earth’s vital resources can be surveyed from space.
12
100 km (10^5 m)
• Metropolitan Chicago and Lake Michigan, an area covering 10,000 square kilometres, is visible only from extreme heights.
• Some balloons and military spy planes are capable of flying at such altitudes.
• Environmental studies based on satellite images, using supercomputing for simulation.
13
10 km (10^4 m)
• From a passing aircraft, at 33,000 ft, whole cities come into view.
• Downtown Chicago.
• The Lake Michigan waterfront, piers and the city streets are clearly visible.
14
1 km (10^3 m)
• 1 kilometre spans the football stadium and the marina.
• Current limits of remote sensing from satellites.
15
100 m (10^2 m)
• A small recreational area.
• From a helicopter hovering over the picnic site, we span an area roughly the size of a running track.
16
10 m (10^1 m)
• The Picnic Site.
• Visually, we are now moving away from our subject, in order that our field of view can span a region 10m by 10m.
• The field of view of our eyes is about 50°, although this really defines our sharp vision. Evolution has equipped our eyes to spot movement right out to the periphery of our vision, almost 180°.
17
1 metre
• One metre is the order of size of the average human torso.
18
10 cm = 100 mm (10^-1 m)
• The width of an adult human hand.
19
1 cm = 10 mm (10^-2 m)
• The reticulated pattern of our skin.
20
1 mm (10^-3 m)
• The cell structure and folds of our skin.
• This is the view that we would get if we used a magnifying glass of power 10x.
21
100 µm (10^-4 m)
• The semi-translucent cells of our own skin, as seen through a microscope at 100x.
• At 1/10th of a millimeter (100 microns), it is beyond the limit of human acuity (the resolving power of our eyes).
• At best we can resolve about 15 lines per mm at a distance of one meter.
22
10 nm (10^-8 m)
• The inter-linked nucleotides form a polynucleotide thread. Two such threads are coiled around each other to form the DNA molecule, from which our chromosomes are built.
• Magnification 1,000,000x.
• This is approaching the limit of electron microscopy.
• Genomics and protein folding are current major applications for supercomputing
23
1 nm (10^-9 m)
• The molecular make-up of DNA. Numerous atoms can be seen, each group representing a different nucleotide.
• Magnification 10,000,000x.
• The ability of the Atomic Force Microscope to create three-dimensional micro-graphs with resolution down to the nanometer scale has made it an essential tool for imaging surfaces in applications ranging from semiconductor processing to cell biology.
24
100 pm (10^-10 m) = 1 Å
• The Carbon Atom and its surrounding electron cloud, the probability volume occupied by Carbon’s 6 electrons as defined by Heisenberg’s uncertainty principle.
• This was an elaboration of the understanding of Physical Chemistry first announced by Niels Bohr in 1913.
• Bohr was awarded the Nobel Prize for Physics in 1922.
25
1 pm (10^-12 m)
• The heart of the Carbon Atom - the Nucleus, is just visible.
• The view we have of the Nucleus of the atom is very stylised.
• No instrument exists to ‘see’ the nucleus. Experiments and theories indicate what it is made of.
• Particle Physics and the use of Accelerators (atom-smashers) have gradually revealed the subatomic world.
26
666 or 10
• From the smallest known entity, at 10^-15 meters, to the furthest reaches of the known Universe, at 10^25 meters, there are no more than 40 orders of magnitude (factors of 10).
• In 3 dimensions this means that the Universe is 10^120 times bigger than the smallest known particle.
• So who could possibly need a computer that could handle numbers up to or beyond 9.999999999 x 10^120?
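The arithmetic behind these bullets can be checked in a few lines. This is a minimal sketch; the comparison with the range of a Python float (an IEEE 754 double) is an added illustration and not part of the slide.

```python
import sys

smallest_exp = -15   # smallest known entity: 10^-15 m (from the slide)
largest_exp = 25     # known Universe: 10^25 m (from the slide)

# Orders of magnitude along one dimension
linear_orders = largest_exp - smallest_exp
# Cubing a ratio triples its exponent, giving the 3-dimensional figure
volume_orders = 3 * linear_orders

print(linear_orders, volume_orders)
# Largest power of ten a Python float can represent (assumption: IEEE 754 double)
print(sys.float_info.max_10_exp)
```

A standard double already covers the exponent range the slide asks about, although with only about 16 significant decimal digits.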
27
Moore’s Law
[Figure: processor performance in MIPS (log scale, 0.01 to 1000) versus year (1970 to 1995) for the 4004, 8080, 80286, 80386, 80486, Pentium and Pentium 2]
28
0.5 µ
0.25 µ
Scaling factor 2:
• 1/2 width
• 1/2 height
• 1/2 switching time
8 x performance!
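The factor of 8 follows from multiplying the three halvings. A minimal sketch, assuming performance per unit chip area is taken as transistor density times switching speed (the function name `scaling_gain` is made up for this illustration):

```python
def scaling_gain(s):
    """Performance gain per unit chip area when all dimensions shrink by a factor s."""
    density_gain = s * s   # 1/s width times 1/s height: s^2 more transistors per area
    speed_gain = s         # 1/s switching time: each transistor switches s times faster
    return density_gain * speed_gain

print(scaling_gain(2))
```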
29
The end of Moore’s Law
[Figure: size of minimal transistor (log scale, 0.01 to 10) versus year (1960 to 2030), with a marked value of ca. 0.03]
30
[Figure: number of systems (0 to 450) by processor count (1 to 63, 64 to 255, 256 to 1023, 1024 and more), May 1993 to November 2002]
31
Switzerland    5   8
Luxembourg     0   6
Scandinavia   12   8
Australia      5   3
New Zealand    1   1
Mexico         1   4
Brazil         0   1
Canada         6   9
Korea          3   4
Taiwan         0   2
China          0   2
Singapore      0   1

        Industry  Research  Academic  MIN
1998         180       180        98   17
2000         260       116        71   43
231 new computers entered/left the list within 6 months: May to November 2000
44
Pentium 2
Pentium
Pentium 4
45
aerospace engineering,
artificial intelligence and knowledge processing,
astrophysics,
atmospheric research and meteorological forecasting,
automotive design and production,
computational aerodynamics,
computer graphics and imaging,
cryptographic analysis,
economic modeling,
implementation techniques and pragmatic software and architectural considerations,
integrated circuit design,
molecular biology,
motion-picture graphics,
nuclear fusion research,
performance studies,
petroleum reservoir engineering and hydrology simulations,
pharmaceutical research, structural analysis and computer-aided design,
and theoretical and experimental physics
46
Application area Count Share Rmax Rpeak Procs
N/A 240 48 % 192826 288895 129074
Telecomm 59 11.8 % 14235 21423 8508
Finance 29 5.8 % 7394 10943 4952
Automotive 28 5.6 % 6998 10827 4152
Weather and Climate Research 27 5.4 % 20455 32573 13464
Database 27 5.4 % 7021 11136 4712
Geophysics 23 4.6 % 7737 23153 18467
Energy 10 2 % 12766 21453 17692
Information Processing Service 9 1.8 % 2271 3477 1532
Aerospace 8 1.6 % 6836 11424 5688
Manufacturing 8 1.6 % 1701 2558 1248
Information Service 6 1.2 % 1299 1844 784
WWW 5 1 % 1431 2258 2192
Benchmarking 3 0.6 % 2351 2731 1152
Life Science 3 0.6 % 1083 1600 928
Electronics 3 0.6 % 677 1013 384
Weather Forecasting 2 0.4 % 3028 5316 1808
Defense 2 0.4 % 931 1457 656
Chemistry 2 0.4 % 462 790 926
Pharmaceutics 1 0.2 % 536 765 510
Consulting 1 0.2 % 213 336 96
Biology 1 0.2 % 205 614 512
Mechanics 1 0.2 % 199 230 16
Software 1 0.2 % 197 259 144
Transportation 1 0.2 % 196 282 128
Total 500 100 % 293058 457365 219725
47
Physical limits
c = 300 000 km/sec
10^12 OPS -- 0.3 mm/OP
1000 PEs with 10^9 OPS -- 30 cm/OP
massive parallelism
distributed memory
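The distances on this slide are simply the speed of light divided by the operation rate. A minimal sketch (the function name `distance_per_op` is hypothetical):

```python
C_M_PER_S = 3.0e8   # speed of light, 300 000 km/sec as given on the slide

def distance_per_op(ops_per_second):
    """Distance in metres that light can travel during one operation."""
    return C_M_PER_S / ops_per_second

print(distance_per_op(1e12))   # one processor at 10^12 OPS
print(distance_per_op(1e9))    # each of 1000 PEs at 10^9 OPS
```

At 10^12 OPS a signal cannot cross more than a fraction of a millimetre per operation, which is why the slide spreads the same total rate over 1000 slower PEs with distributed memory.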
48
The internet: 10^8 idle computers, let’s use them!
Limits through network speed
10^-9 sec instruction cycle
10^-1 sec signal runtime
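The two times on this slide imply a lower bound on useful task size. A minimal sketch, using only the slide's two figures; reading the ratio as the work needed to hide one message is an interpretation, not something the slide states:

```python
instruction_cycle = 1e-9   # seconds per instruction (from the slide)
signal_runtime = 1e-1      # seconds per network signal (from the slide)

# Instructions a processor could execute while a single message is in flight
instructions_per_message = signal_runtime / instruction_cycle
print(f"{instructions_per_message:.0e}")
```

A task farmed out over such a network has to contain far more than this many instructions per message exchanged before the communication cost stops dominating.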
49
Suitable problems
• Parallelism
• No parallelism ?
• Pipelining
50
Amdahl’s Law
[Figure: a run time of 1 divided into a sequential part (seq) and a parallel part (par)]
speedup < 1/seq
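The bound can be seen numerically. A minimal sketch of the usual statement of Amdahl's law, assuming seq + par = 1 and that the parallel part scales perfectly over the processors (the function name `amdahl_speedup` is chosen for this illustration):

```python
def amdahl_speedup(seq, processors):
    """Speedup for a program whose sequential fraction is seq (0 < seq <= 1)."""
    par = 1.0 - seq
    return 1.0 / (seq + par / processors)

# With 5 % sequential code the speedup approaches, but never reaches, 1/0.05 = 20
for p in (10, 1000, 1_000_000):
    print(p, amdahl_speedup(0.05, p))
```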
51
Predictions
• Massive parallel special purpose
• Cost * computation time
• Ease of use: time to program
• Time to develop hardware
• Parallel computers with standard components
• Embedded massively parallel systems
52
• Slowdown of sequential speedup (Moore)
Up!
Demand for parallel systems?
What kind of systems?
53
VLSI
Very Large Scale Integration
• simple cells
• few types
• regular architecture
• short connections
mesh -- torus
54
Mesh/Torus
Diameter Θ(√n), bisection width Θ(√n)
2D mesh
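Both quantities can be computed exactly for a square array. A minimal sketch, assuming n is the number of processors arranged as a √n x √n grid with an even side length (the slide does not define n); the function names are hypothetical:

```python
import math

def _side(n):
    """Side length of a square array of n processors (even side assumed)."""
    k = math.isqrt(n)
    if k * k != n or k % 2 != 0:
        raise ValueError("n must be the square of an even number")
    return k

def mesh_2d(n):
    """(diameter, bisection width) of a 2D mesh without wrap-around links."""
    k = _side(n)
    return 2 * (k - 1), k

def torus_2d(n):
    """(diameter, bisection width) of a 2D torus (mesh with wrap-around links)."""
    k = _side(n)
    return 2 * (k // 2), 2 * k

print(mesh_2d(1024))
print(torus_2d(1024))
```

The wrap-around links of the torus roughly halve the diameter and double the bisection width, but both still grow with √n.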
55
Architecture of Systola 1024
Interface processors
ISA
RAM NORTH
host computer bus
Controller
RAM WEST
program memory
M. Kunde, H.W. Lang, M. Schimmler, H. Schmeck, H. Schröder
Special features of the ISA:
• fast local communication
• aggregate functions with constant period
• fast integer arithmetic
56
C := C + C_W
C := C + C_N
sum
“don’t”
“don’t”
65
SUM(C_ij)
aggregate functions: constant period!
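The sum over all C_ij can be formed from the two instructions shown on slide 56. The sketch below is a plain sequential simulation, not ISA code, and rests on two assumptions: that the west and north terms denote the C register of the western and northern neighbour, and that the "don't" entries mask processors that have no such neighbour. It says nothing about the timing of the real array.

```python
def aggregate_sum(c):
    """Simulate adding the western, then the northern neighbour's value in a grid."""
    rows, cols = len(c), len(c[0])
    c = [row[:] for row in c]        # work on a copy of the input grid
    for i in range(rows):
        for j in range(1, cols):     # column 0 has no western neighbour
            c[i][j] += c[i][j - 1]   # row prefix sums accumulate eastwards
    for j in range(cols):
        for i in range(1, rows):     # row 0 has no northern neighbour
            c[i][j] += c[i - 1][j]   # column prefix sums accumulate southwards
    return c[rows - 1][cols - 1]     # bottom-right cell now holds the total

print(aggregate_sum([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```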
66
Areas of application for ISA:
• automatic optical quality control
• real time signal processing
• computer graphics / visualization?
• linear equations
• Cryptography --> Tele-medicine?
Special features of the ISA:
• fast aggregate functions (sum, carry)
• fast local communication
• no local memory
• typical improvement over PC: Factor 20-30
67
Implementation of Backprojection
g(x, y) = Σ_t g(x cos t + y sin t, t)
Tomography
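Backprojection accumulates, for every pixel (x, y), the projection value found at position x cos t + y sin t for each angle t. The following is a minimal unfiltered sketch in plain Python, not the Systola implementation; the sinogram layout, nearest-neighbour lookup and the name `backproject` are all assumptions made for illustration.

```python
import math

def backproject(sinogram, angles, size):
    """Unfiltered backprojection onto a size x size grid centred on the origin.

    sinogram[i][k] is the projection value at angle angles[i], detector bin k.
    """
    n_det = len(sinogram[0])
    centre = (size - 1) / 2.0          # image centre in pixel coordinates
    det_centre = (n_det - 1) / 2.0     # detector centre in bin coordinates
    image = [[0.0] * size for _ in range(size)]
    for proj, t in zip(sinogram, angles):
        cos_t, sin_t = math.cos(t), math.sin(t)
        for row in range(size):
            y = row - centre
            for col in range(size):
                x = col - centre
                s = x * cos_t + y * sin_t        # position along the detector
                k = int(round(s + det_centre))   # nearest detector bin
                if 0 <= k < n_det:
                    image[row][col] += proj[k]
    return image

# A single projection at angle 0 smears its values along the image columns
image = backproject([[0.0, 0.0, 1.0, 0.0, 0.0]], [0.0], 5)
```

Every pixel performs the same lookup-and-add for each angle, which is the kind of regular, data-parallel work a mesh of simple processors handles well.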
68
Robot vision
[Figure: projector and two CCD cameras]
Scan in objects
Scan in bodies?
Robot vision, medical applications
69
ISA: Image classification
70
Spiral (Rein Warmels)
Wavelet transform
71
Change viewpoint
Change transparency
Non-uniform
72
1 µ: Systola 1024, 50 MHz
0.09 µ: Systola 2003, 1 GHz
Next generation Systola, performance prediction:
• Factor 120 (scaling the area)
• Factor 20 (scaling the speed)
• Factor 6 (chip area)
Factor 14,400/chip
limit? 0.03 µ
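The predicted factors can be retraced from the feature sizes and clock rates on the slide. A minimal sketch; the chip-area factor of 6 is taken from the slide as given, since nothing on the slide shows how it is derived.

```python
feature_old, feature_new = 1.0, 0.09   # feature size in µ (from the slide)
clock_old, clock_new = 50e6, 1e9       # clock rate in Hz (from the slide)

# Processors per unit area grow with the square of the linear shrink
area_factor = (feature_old / feature_new) ** 2   # about 123; the slide rounds to 120
speed_factor = clock_new / clock_old
chip_area_factor = 6                             # taken from the slide as given

print(round(area_factor), speed_factor)
print(120 * 20 * chip_area_factor)               # the slide's rounded product
```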
73
Disadvantages of the mesh:
large diameter!
low bisection width!
74
Hybrid computing
[Figure: cluster of nodes labelled 1024, each a massively parallel SYSTOLA 1024]
Topology?
75
reconfigurable mesh
reconfigurable mesh = mesh + interior connections
15 positions
low cost
diameter = 1
76
modulo 3 counter
10 11 10
*1 mod 3
Constant time on RM but log n / log log n on CRCW-PRAM
Configurational computing!
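A common way to count modulo 3 on a reconfigurable mesh uses three rows and one column per input bit: a column holding a 1 bends the bus down by one row (wrapping around), a column holding a 0 passes it straight through, and the row on which a signal leaves gives the count. The sketch below simulates that bus sequentially; it is an illustration under these assumptions, and the slide itself does not spell out the construction.

```python
def mod3_count(bits):
    """Trace a signal across a 3-row mesh with one column per input bit."""
    # Every column sets its switches from its own bit only:
    # bit 0 keeps the bus on the same row, bit 1 shifts it one row down (mod 3).
    columns = [{r: (r + b) % 3 for r in range(3)} for b in bits]
    row = 0                    # the signal enters on row 0
    for switch in columns:     # on the mesh this is one connected bus, not n steps
        row = switch[row]
    return row                 # exit row = number of 1 bits modulo 3

print(mod3_count([1, 0, 1, 1, 1, 0]))
```

If the slide's "10 11 10" is read as these six input bits (an assumption), the four ones give 1 mod 3. All switches are set simultaneously and the signal then travels along a single bus, which is where the constant time on the RM comes from.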
77
Reconfigurable mesh
Special features
• SIMD
• constant diameter
• faster than PRAM?
Suitable applications
• routing/sorting/load balancing
• sparse matrix multiplication
• segmentation / component labeling
• feature extraction
• image database?
78
Optical Highway
6
3
10
#
C
width W
processors P
CWP: W=1; P=100
W=32; P=32
C
All-to-all connection
79
Horizontal all-to-all
Vertical all-to-all
80
Features of optically connected meshes
• SIMD/SPMD/MIMD?
• implement all major architectures
• all-to-all communication in 2 steps
• Bulk synchronous processing (BSP)
• no latency hiding
• no pin-limitation
Applications
• coarse grain parallel computing only?
• ray-tracing ????
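The two-step all-to-all can be sketched as a routing rule. This assumes that the horizontal and vertical all-to-all of slide 79 mean every row and every column is fully connected; the function names are hypothetical.

```python
def two_step_route(src, dst):
    """Return the hops from src to dst, each given as (row, col)."""
    (r1, c1), (r2, c2) = src, dst
    via = (r1, c2)            # step 1: horizontal all-to-all within row r1
    return [src, via, dst]    # step 2: vertical all-to-all within column c2

def count_routed_pairs(p):
    """Route every ordered pair of nodes in a p x p array and count them."""
    nodes = [(r, c) for r in range(p) for c in range(p)]
    routed = 0
    for src in nodes:
        for dst in nodes:
            _, via, _ = two_step_route(src, dst)
            # The intermediate node shares a row with src and a column with dst
            assert via[0] == src[0] and via[1] == dst[1]
            routed += 1
    return routed

print(two_step_route((0, 0), (3, 5)))
print(count_routed_pairs(4))
```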
81
[Figure: cluster of nodes labelled 1024]
Hybrid: 3D-problems
OH: PRAM equivalent? High bisection width bound
ISA: 2D-problems, local communication
RM: low cost; diameter-bound > bisection-width-bound
Future?
82
Content
• Parallel computing (not distributed)
• Supercomputing
• Systolic arrays, embedded systems
• Fault tolerant parallel systems
• Standard architectures – standard control
• Future architecture – future control ??
• NP-hard problems
• Develop the skills to design embedded systems
83
??
??