The Once and Future SciDAC
Thom H. Dunning, Jr.National Center for Supercomputing Applications
and Department of ChemistryUniversity of Illinois at Urbana-Champaign
with apologies to T. H. White
SciDAC: The Program
“Advances in the simulation of complex scientific and engineering systems provide an unparalleled opportunity for solving major problems that face the nation in the 21st Century.”
SciDAC Goals
Create a Scientific Computing Software Infrastructure that bridges the gap between applied mathematics & computer science and computational science in the physical, chemical, biological, and environmental sciences:

• Scientific Application Codes
  – Develop mathematical models, computational methods, and scientific codes to take full advantage of the capabilities of terascale computers
• Computing Systems and Mathematical Software
  – Develop software infrastructure to accelerate the development of scientific codes, achieve maximum efficiency on high-end computers, and enable a broad range of scientists to use simulation in their research
• Collaboratory Software
  – Develop network technologies and collaboration tools to link geographically separated researchers, to facilitate movement of large (petabyte) data sets, and to ensure that academic scientists can fully participate in these activities
SciDAC Goals II
Create a Scientific Computing Hardware Infrastructure that is robust, agile, and flexible:

• Flagship Computing Facility
  – To provide computing resources to address a broad range of scientific problems
• Topical Computing Facilities
  – To ensure that the most effective and efficient resources are used to solve each class of problems
• Experimental Computing Facilities
  – To guide advances in computer technology to ensure that scientific computing has the resources that it needs in the future
• ESNet
  – To support research in a connected world
SciDAC: Circa 2001
[Diagram: SciDAC program structure. Sponsoring DOE offices (BES, BER, FES, HENP, ASCR) fund a hardware infrastructure and a software infrastructure. Scientific simulation codes sit atop the operating system and draw on data analysis & visualization, scientific data management, problem-solving environments, programming environments, data grids, collaboratories, mathematics, and computing systems software.]
SciDAC Score Card
• Scientific Challenge Codes: Excellent progress in selected areas, but many areas poorly supported or even neglected
• Computing & Math Software: Excellent progress, but some areas need additional support
• Collaboratory Software: Good progress, but little used
• Flagship Computing Facility: Two facilities established, NERSC and NLCF, but …
• Topical Computing Facilities: QCDOC and MSCF, but many opportunities still unexplored
• Experimental Computing Facilities: Little progress
Central Dogma
The central dogma of SciDAC is the close coupling between computer hardware and computer software.

Changes in computer hardware require changes, often major changes, in computer software: porting, revision, or outright rewriting. Responding to such changes in a timely manner requires a multidisciplinary approach; SciDAC's multidisciplinary teams exist to provide it, and the payoff is enhanced performance, which can be dramatic.
The Coming Revolution in Computing
“The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software”
Herb Sutter in Dr. Dobb’s Journal 30(3), March 2005
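Sutter's "fundamental turn" means performance now comes from restructuring programs into independent pieces of work. A minimal sketch of that restructuring, using a thread pool to split a reduction into chunks (function names are illustrative; pure-Python CPU-bound code is limited by the GIL, so a real HPC code would use processes, MPI, or a compiled language, but the structure is the same):

```python
# Sketch: restructuring a serial sum into independent chunks of work
# that a pool of workers can execute concurrently.
from concurrent.futures import ThreadPoolExecutor

def partial_sum(lo, hi):
    """Sum the integers in [lo, hi): one independent chunk of work."""
    return sum(range(lo, hi))

def parallel_sum(n, workers=4):
    """Split [0, n) into `workers` chunks and sum them concurrently."""
    step = (n + workers - 1) // workers
    chunks = [(i, min(i + step, n)) for i in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda c: partial_sum(*c), chunks))

print(parallel_sum(1_000_000))  # same answer as sum(range(1_000_000))
```

The point is not this particular pool API but the decomposition: once the work is expressed as independent chunks, adding cores (or nodes) is a scheduling problem rather than a rewrite.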
The GHz Race
At the 2000 IEEE International Electron Devices Meeting, Intel announced that it expected to produce a 10 GHz microprocessor by 2005.
The fastest Intel microprocessor today runs at 3.8 GHz (Intel Pentium 4). It was introduced six months ago.
At its presentation of the Prescott 6XX series, Intel stated that it is committed to “adding value beyond GHz.”
Increasing Computer Performance
• Increasing Clock Frequency
  – Pentium: 60 MHz to 3,800 MHz in 12 years
  – Resulted in ~80% of performance increase
The Heat Problem
Courtesy of Bob Colwell
[Chart: power density (W/cm², log scale from 1 to 1000) vs. process technology (1.5 µm down to 0.07 µm) for the i386, i486, Pentium, Pentium Pro, Pentium II, Pentium III, Pentium 4 (Willamette), and Pentium 4 (Prescott), with reference levels for a hot plate, a nuclear reactor, and a rocket nozzle.]
Managing the Heat Load
Liquid cooling system in Apple G5s
Heat sinks in 6XX series Pentium 4s
Leakage Current
From Minor Nuisance to Chip Killer
Dissipated Power ~ CV²f

[Chart: dissipated power (W, 0 to 300) vs. process technology (250 nm down to 70 nm), split into dynamic power and leakage power; the leakage share grows from a minor fraction to a dominant one as feature sizes shrink.]
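The CV²f relation is why many slow, low-voltage cores can beat one fast core on power. A worked example of the scaling (relative units, illustrative numbers):

```python
def dynamic_power(c, v, f):
    """Dynamic (switching) power ~ C * V^2 * f, in relative units."""
    return c * v**2 * f

# Baseline chip: C = V = f = 1.
base = dynamic_power(1.0, 1.0, 1.0)

# Halving the clock usually permits a lower supply voltage, so power
# falls superlinearly: half the frequency at 80% voltage gives
# 0.8^2 * 0.5 = 0.32, roughly one third the power.
low = dynamic_power(1.0, 0.8, 0.5)
print(low / base)
```

This models only the dynamic term; leakage power, the subject of the chart above, scales differently and is not captured by CV²f.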
Means of Increasing Performance
• Increasing Clock Frequency
  – From 60 MHz to 3,800 MHz in 12 years
  – Has resulted in ~80% of performance increase
• Execution Optimization
  – More powerful instructions
  – Execution optimization (pipelining, branch prediction, execution of multiple instructions, reordering the instruction stream, etc.)
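Of these optimizations, pipelining is the easiest to quantify: overlapping instructions in stages turns a per-instruction cost into a per-cycle throughput. A toy cycle-count model (idealized: no stalls, hazards, or branch mispredictions):

```python
def unpipelined_cycles(n_instructions, stages):
    """Each instruction occupies the whole datapath for `stages` cycles."""
    return n_instructions * stages

def pipelined_cycles(n_instructions, stages):
    """The first instruction takes `stages` cycles to drain the pipeline;
    every later instruction retires one cycle behind its predecessor."""
    return stages + (n_instructions - 1)

# 100 instructions on a 5-stage pipeline:
print(unpipelined_cycles(100, 5))  # 500 cycles without pipelining
print(pipelined_cycles(100, 5))    # 104 cycles with pipelining
```

Real pipelines fall short of this ideal, which is exactly why the other items on the list (branch prediction, reordering) exist: they keep the pipeline full.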
Microarchitecture Trends
Adapted from Johan De Gelas, Quest for More Processing Power, AnandTech, Feb. 8, 2005.

[Chart: MIPS (log scale, 10¹ to 10⁶) vs. year (1980–2010). The Pentium architecture (superscalar), Pentium Pro architecture (speculative, out-of-order), Pentium 4 architecture (trace cache), and Pentium 4/Xeon architecture with HT (multi-threaded) mark the era of instruction parallelism; multi-threaded, multi-core designs open the era of thread parallelism.]
Means of Increasing Performance
• Increasing Clock Frequency
  – From 60 MHz to 3,800 MHz in 12 years
  – Has resulted in ~80% of performance increase
• Execution Optimization
  – More powerful instructions
  – Execution optimization (pipelining, branch prediction, execution of multiple instructions, reordering the instruction stream, etc.)
• Larger Caches
  – On-chip caches to ameliorate the growing disparity between processor speed and memory latency and bandwidth
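Why caches close that disparity is easiest to see in miss counts: a cache fetches whole lines, so nearby accesses are nearly free, while scattered accesses pay full memory latency every time. A toy direct-mapped cache model (illustrative parameters: 64-byte lines, 512 lines, no associativity):

```python
def count_misses(addresses, line_bytes=64, n_lines=512):
    """Count misses in a toy direct-mapped cache: each address maps to
    exactly one slot, which holds one line at a time."""
    cache = [None] * n_lines
    misses = 0
    for addr in addresses:
        line = addr // line_bytes
        slot = line % n_lines
        if cache[slot] != line:  # miss: fetch the line from memory
            cache[slot] = line
            misses += 1
    return misses

n = 8192  # accesses to 8-byte elements
seq = [8 * i for i in range(n)]           # unit stride
strided = [8 * 64 * i for i in range(n)]  # jumps a full line each access

print(count_misses(seq))      # one miss per 64-byte line: n/8 misses
print(count_misses(strided))  # every access misses: n misses
```

The eightfold difference in misses is the cache's contribution; code that lacks this locality sees raw memory latency no matter how large the cache grows.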
Moore’s Law Still Holds
[Chart (source: Intel): transistors per die (log scale, 10⁰ to 10¹¹) vs. year (1960–2010), tracking memory parts from 1K through 4K, 16K, 64K, 256K, 1M, 4M, 16M, 64M, 128M, 256M, 512M, 1G, 2G, and 4G, and microprocessors from the 4004 through the 8080, 8086, 80286, i386™, i486™, Pentium®, Pentium® II, Pentium® III, Pentium® 4, and Itanium®.]
Means of Increasing Performance
• Increasing Clock Frequency
  – From 60 MHz to 3,800 MHz in 12 years
  – Has resulted in ~80% of performance increase
• Execution Optimization
  – More powerful instructions
  – Execution optimization (pipelining, branch prediction, execution of multiple instructions, reordering the instruction stream, etc.)
• Larger Caches
  – On-chip caches will continue to increase in size and help mitigate disparities in computer subsystem performance
New Technologies for Computers
• Low power processors
IBM Blue Gene Systems
• LLNL BG/L
  – 360 teraflops
  – 64 racks
    • 65,536 nodes
    • 131,072 processors
• Node
  – Two 2.8 Gflops processors
    • System-on-a-Chip design
    • 700 MHz
    • Two fused multiply-adds per cycle
  – Up to 512 Mbytes of memory
  – 27 Watts
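The 2.8 Gflops per processor follows directly from the clock rate and the FMA count: a fused multiply-add counts as two floating-point operations. A sketch of the arithmetic (function name is illustrative):

```python
def peak_flops(clock_hz, fma_units, flops_per_fma=2):
    """Peak rate = clock * FMA units * 2 flops (one multiply + one add)
    per fused multiply-add, assuming every unit issues every cycle."""
    return clock_hz * fma_units * flops_per_fma

# BG/L processor: 700 MHz clock, two FMA units.
print(peak_flops(700e6, 2) / 1e9)  # 2.8 Gflops, matching the slide
```

As always with peak numbers, this assumes every FMA unit retires a result every cycle; sustained application performance is typically far lower.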
Technologies for Petascale Computers
• Low Power Processors
  – Need unprecedented application software scalability
    • Application codes must scale to 100,000s of processors
  – Need ability to recover from continual processor loss
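Amdahl's law makes the scalability requirement concrete: at 131,072 processors, even a tiny serial fraction caps the achievable speedup. A worked calculation:

```python
def amdahl_speedup(serial_fraction, processors):
    """Upper bound on speedup when `serial_fraction` of the work cannot
    be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

# Even 0.1% serial work caps a 131,072-processor BG/L-class machine
# below a 1000x speedup:
print(amdahl_speedup(0.001, 131_072))    # roughly 992x
print(amdahl_speedup(0.00001, 131_072))  # far better at 0.001% serial
```

This is why "unprecedented scalability" is not hyperbole: the serial fraction, including communication and I/O that does not shrink with processor count, must be driven to nearly zero.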
New Technologies for Computers
• Low Power Processors
  – Need unprecedented scalability
    • Application codes must scale to 100,000s of processors
  – Need ability to recover from processor loss
• Multicore Chips
Architecture of Dual-Core Chips
• IBM Power5
  – Shared 1.92 Mbyte L2 cache
• AMD Opteron
  – Separate 1 Mbyte L2 caches
  – CPU0 and CPU1 communicate through the SRQ
• Intel Pentium 4
  – “Glued” two processors together
New Technologies for Computers
• Low Power Processors
  – Need unprecedented scalability
    • Application codes must scale to 100,000s of processors
  – Need ability to recover from processor loss
• Multicore Chips
  – Need to better understand a number of architectural issues
    • Memory bandwidth
    • Cache contention
    • …
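The memory-bandwidth issue can be captured in a toy model: cores on one chip share a single memory interface, so a bandwidth-bound code stops scaling once the cores' combined demand saturates it (illustrative numbers; real chips have caches and prefetchers that soften this):

```python
def bandwidth_bound_speedup(cores, bw_per_core_needed, total_bw):
    """Speedup of a bandwidth-bound code on a chip whose cores share one
    memory interface (toy model, no caches)."""
    demanded = cores * bw_per_core_needed
    if demanded <= total_bw:
        return float(cores)  # compute-limited: scales with core count
    return total_bw / bw_per_core_needed  # memory-limited: saturates

# Suppose each core wants 4 GB/s and the shared interface delivers 8 GB/s:
print(bandwidth_bound_speedup(1, 4, 8))  # 1.0
print(bandwidth_bound_speedup(2, 4, 8))  # 2.0
print(bandwidth_bound_speedup(4, 4, 8))  # 2.0: extra cores add nothing
```

Cache contention compounds this: cores sharing an L2 (as on the Power5 above) can evict each other's working sets, raising the per-core bandwidth demand in this model rather than lowering it.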
Other Promising Technologies
• Field Programmable Gate Arrays (FPGAs)
  – Capabilities increasing rapidly (riding the silicon technology curve)
  – Need efficient software development tools
• Heterogeneous Computer Systems
  – Different types of processors in a single system
    • Vector processors, superscalar processors, FPGAs
  – High-speed interconnect linking all processors
  – May be especially advantageous for some applications, e.g., multiphysics applications
• Many Other New Ideas
  – DARPA: High Productivity Computing Systems program
  – Universities: Sterling, Dally, …