EEL5708/BölöniLec 2.1
Fall 2004
August 27, 2004
Lotzi Bölöni
EEL 5708 High Performance Computer Architecture
Lecture 2
Introduction: the big picture
Acknowledgements
• All the lecture slides were adapted from the slides of David Patterson (1998, 2001) and David E. Culler (2001), Copyright 1998-2002, University of California, Berkeley
Research Paper Reading
• As graduate students, you are now researchers.
• Most information of importance to you will be in research papers.
• Ability to rapidly scan and understand research papers is key to your success.
• So: about 1 paper / week in this course
  – Quick 1-paragraph summaries will be due as homework
  – Important supplement to the book
  – Will discuss papers in class
• Links to the papers will be posted on the course webpage
First reading
• G. Amdahl, G. A. Blaauw, F. P. Brooks, Jr.
  – Architecture of the IBM System/360
• Link from the course website
• A good paper to improve your skills in reading papers.
Why take EEL5708?
• To design the next great instruction set? ...well...
  – instruction set architecture has largely converged
  – especially in the desktop / server / laptop space
  – dictated by powerful market forces
• Tremendous organizational innovation relative to established ISA abstractions
• Many new instruction sets or equivalent
  – embedded space, controllers, specialized devices, ...
• Design, analysis, implementation concepts vital to all aspects of EE & CS
  – systems, PL, theory, circuit design, VLSI, comm.
• Equip you with an intellectual toolbox for dealing with a host of systems design challenges
Example Hot Developments ca. 2002
• Manipulating the instruction set abstraction
  – Itanium: translate ISA64 -> micro-op sequences
  – Pentium 4: hyperthreading
  – Transmeta: continuous dynamic translation of IA32
  – Tensilica: synthesize the ISA from the application
  – reconfigurable HW
• Virtualization
  – VMware: emulate full virtual machine
  – JIT: compile to abstract virtual machine, dynamically compile to host
• Parallelism
  – wide issue, dynamic instruction scheduling, EPIC
  – multithreading (SMT)
  – chip multiprocessors
• Communication
  – network processors, network interfaces
• Exotic explorations
  – nanotechnology, quantum computing
Forces on Computer Architecture
[Figure: Computer Architecture at the center, acted on by five forces: Technology, Programming Languages, Operating Systems, History, and Applications. Annotation: (A = F / M)]
Amazing Underlying Technology Change
[Figure: the original food chain, Big Fishes Eating Little Fishes]
1988 Computer Food Chain
[Figure: food chain of computer classes: PC, Workstation, Minicomputer, Mainframe, Mini-supercomputer, Supercomputer, Massively Parallel Processors]
1998 Computer Food Chain
[Figure: food chain of computer classes: PC, Workstation, Server, Mainframe, Supercomputer, Mini-supercomputer, Massively Parallel Processors, Minicomputer]
Now who is eating whom?
Why Such Change in 10 years?
• Performance
  – Technology advances
    » CMOS VLSI dominates older technologies (TTL, ECL) in cost AND performance
  – Computer architecture advances improve the low end
    » RISC, superscalar, RAID, …
• Price: lower costs due to …
  – Simpler development
    » CMOS VLSI: smaller systems, fewer components
  – Higher volumes
    » CMOS VLSI: same dev. cost for 10,000 vs. 10,000,000 units
  – Lower margins by class of computer, due to fewer services
• Function
  – Rise of networking/local interconnection technology
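The "higher volumes" point can be illustrated with a small amortization sketch. Only the two volumes (10,000 and 10,000,000 units) come from the slide; the development cost and the per-unit manufacturing cost are hypothetical numbers chosen purely for illustration.

```python
def cost_per_unit(dev_cost, unit_cost, volume):
    """Per-unit cost when a fixed development cost is spread over `volume` units."""
    return dev_cost / volume + unit_cost

# Hypothetical figures, NOT from the lecture:
DEV_COST = 100_000_000   # one-time development cost in dollars
UNIT_COST = 50           # manufacturing cost per chip in dollars

# The two volumes mentioned on the slide.
low_volume = cost_per_unit(DEV_COST, UNIT_COST, 10_000)
high_volume = cost_per_unit(DEV_COST, UNIT_COST, 10_000_000)

print(f"10,000 units:     ${low_volume:,.0f} per unit")
print(f"10,000,000 units: ${high_volume:,.0f} per unit")
```

With the same development cost, the high-volume part carries a far smaller share of that cost per unit, which is the mechanism the slide credits for the price advantage of the low end.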
Technology Trends: Microprocessor Capacity
[Figure: transistors per chip (log scale, 1,000 to 100,000,000) versus year (1970 to 2000), labeled Moore’s Law. Plotted processors: i4004, i8080, i8086, i80286, i80386, i80486, Pentium. A “Graduation Window” is marked on the plot.]
CMOS improvements:
• Die size: 2X every 3 yrs
• Line width: halve / 7 yrs
Transistor counts:
• ATI Radeon 9700 (graphics processor): 110 million
• Pentium 4: 55 million
• Athlon XP: 37.5 million
• Alpha 21264: 15 million
• Pentium Pro: 5.5 million
• PowerPC 620: 6.9 million
• Alpha 21164: 9.3 million
• Sparc Ultra: 5.2 million
Processor Performance Trends
[Figure: performance (log scale, 0.1 to 1000) versus year (1965 to 2000), with one curve each for Supercomputers, Mainframes, Minicomputers, and Microprocessors]
Memory Capacity (Single Chip DRAM)
[Figure: DRAM chip size in bits (log scale, 1,000 to 1,000,000,000) versus year (1970 to 2000)]
year    size (Mb)    cycle time
1980    0.0625       250 ns
1983    0.25         220 ns
1986    1            190 ns
1989    4            165 ns
1992    16           145 ns
1996    64           120 ns
2000    256          100 ns
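The gap between capacity growth and cycle-time improvement can be read off the DRAM figures listed above. The sketch below is only an illustration: the data rows are copied from the slide, while the helper name `annual_factor` and the choice to compare the first and last rows are assumptions made here.

```python
import math

# (year, capacity in Mb, cycle time in ns), copied from the DRAM figures above.
DRAM = [
    (1980, 0.0625, 250),
    (1983, 0.25, 220),
    (1986, 1, 190),
    (1989, 4, 165),
    (1992, 16, 145),
    (1996, 64, 120),
    (2000, 256, 100),
]

def annual_factor(start, end, years):
    """Constant yearly multiplier that turns `start` into `end` over `years` years."""
    return (end / start) ** (1 / years)

first, last = DRAM[0], DRAM[-1]
years = last[0] - first[0]

# Capacity grows, so compare last against first.
capacity_growth = annual_factor(first[1], last[1], years)
# Cycle time shrinks, so the speedup is old time divided by new time.
speedup = annual_factor(last[2], first[2], years)
# Years needed for capacity to double at the observed overall rate.
doubling_years = years / math.log2(last[1] / first[1])

print(f"capacity:   x{capacity_growth:.2f} per year, doubling every {doubling_years:.2f} years")
print(f"cycle time: x{speedup:.3f} faster per year")
```

Over these 20 years capacity grows by a factor of 4096 while cycle time improves by only 2.5x, which is the capacity versus latency imbalance the lecture returns to in its summary of technology trends.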
Technology Trends (Summary)
         Capacity         Speed (latency)
Logic    2x in 3 years    2x in 3 years
DRAM     4x in 3 years    2x in 10 years
Disk     4x in 3 years    2x in 10 years
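The "N x in Y years" rules of thumb in this summary can be converted to yearly rates and projected over a common horizon. This is a minimal sketch: the factor and period pairs come from the summary, while the function name `growth_over` and the 10-year horizon are assumptions chosen for illustration.

```python
def growth_over(factor, period_years, horizon_years):
    """Total growth over `horizon_years`, given `factor`x growth every `period_years`."""
    return factor ** (horizon_years / period_years)

# (factor, period in years) pairs taken from the summary.
TRENDS = {
    "Logic": {"capacity": (2, 3), "speed": (2, 3)},
    "DRAM": {"capacity": (4, 3), "speed": (2, 10)},
    "Disk": {"capacity": (4, 3), "speed": (2, 10)},
}

HORIZON = 10  # years; an arbitrary horizon, not from the lecture

for name, trend in TRENDS.items():
    # Yearly rate: growth over a 1-year horizon, minus 1.
    cap_rate = growth_over(*trend["capacity"], 1) - 1
    speed_rate = growth_over(*trend["speed"], 1) - 1
    cap_total = growth_over(*trend["capacity"], HORIZON)
    speed_total = growth_over(*trend["speed"], HORIZON)
    print(f"{name:5s} capacity {cap_rate:5.1%}/yr ({cap_total:6.1f}x in {HORIZON} yrs), "
          f"speed {speed_rate:5.1%}/yr ({speed_total:4.1f}x in {HORIZON} yrs)")
```

Under these assumptions DRAM and disk capacity grow roughly a hundredfold in ten years while their latency improves only twofold, whereas logic improves by about 10x on both axes.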
Technology Trends
• Clock Rate: ~30% per year
• Transistor Density: ~35%
• Chip Area: ~15%
• Transistors per chip: ~55%
• Total Performance Capability: ~100%
• by the time you graduate...
  – 3x clock rate (3-4 GHz)
  – 10x transistor count (1 Billion transistors)
  – 30x raw capability
• plus 16x DRAM density, 32x disk density
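The "by the time you graduate" multipliers follow from compounding the rates listed above. The sketch below treats all five percentages as annual rates and assumes a graduation window of 4 to 5 years; neither the window length nor the helper name `compound` appears in the slides, so both are assumptions.

```python
# Rates from the list above, treated here as annual growth rates (an assumption
# for every entry except clock rate, which the slide states as "per year").
ANNUAL_RATES = {
    "clock rate": 0.30,
    "transistor density": 0.35,
    "chip area": 0.15,
    "transistors per chip": 0.55,
    "total performance capability": 1.00,
}

def compound(rate, years):
    """Growth multiplier after `years` years at a constant annual `rate`."""
    return (1 + rate) ** years

# Density and area growth multiply, which reproduces the transistors-per-chip rate.
combined = (1 + ANNUAL_RATES["transistor density"]) * (1 + ANNUAL_RATES["chip area"]) - 1
print(f"density x area: {combined:.1%} per year")

# Assumed graduation window of 4 to 5 years.
for years in (4, 5):
    clock = compound(ANNUAL_RATES["clock rate"], years)
    transistors = compound(ANNUAL_RATES["transistors per chip"], years)
    capability = compound(ANNUAL_RATES["total performance capability"], years)
    print(f"{years} years: clock {clock:.1f}x, transistors {transistors:.1f}x, "
          f"capability {capability:.0f}x")
```

Under these assumptions, 30% per year over 4 years gives about 2.9x clock rate, and 55% and 100% per year over 5 years give about 9x transistors and 32x capability, close to the rounded 3x, 10x, and 30x on the slide.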
Newest trends (Fall 2004)
• Moore’s law is probably over.
• Future VLSI improvements will probably be linear (as opposed to exponential).
• Multi-core chips will be the new standard, from as early as 2005.
• Parallel programs will become much more important, even for the mainstream.
• And many developments which we cannot foresee at this moment.
What is “Computer Architecture”?
[Figure: levels of abstraction, from top to bottom: Application, Operating System, Compiler, Firmware, Instruction Set Architecture, Instr. Set Proc. and I/O system, Datapath & Control, Digital Design, Circuit Design, Layout]
• Coordination of many levels of abstraction
• Under a rapidly changing set of forces
• Design, Measurement, and Evaluation