View
8
Download
0
Category
Preview:
Citation preview
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company,for the United States Department of Energyʼs National Nuclear Security Administration
under contract DE-AC04-94AL85000.
The State of SimulationWhere We AreWhat is Wrong
Where We Want to BeArun Rodrigues
Tuesday, August 14, 2012
Where We Are
Tuesday, August 14, 2012
Sandia Unclassified Unlimited Release
•Multiple Objectives–Performance used to be
only criteria–Now, Energy, cost, power,
reliability, etc...•Scale & Complexity
–Many system characteristics require detail to measure
–Detailed simulation takes too long
–Application complexity increasing
•Accuracy–Systems more complex–Vendors don’t reveal
necessary details
Major Architectural Simulation Challenges Are Increasing
Tuesday, August 14, 2012
Sandia Unclassified Unlimited Release
•Multiple Objectives–Performance used to be
only criteria–Now, Energy, cost, power,
reliability, etc...•Scale & Complexity
–Many system characteristics require detail to measure
–Detailed simulation takes too long
–Application complexity increasing
•Accuracy–Systems more complex–Vendors don’t reveal
necessary details
Major Architectural Simulation Challenges Are Increasing
Tuesday, August 14, 2012
View of the Simulation Problem
Application writerspurchasersdesigners
system procurementalgorithm co-design
architecture researchlanguage research
Multiple Audiences.....Network
ProcessorSystem
present systemsfuture systemsX X X
Scale..... ManyCores
+Memory
ManyManyNodes
ManyManyMany
ThreadsX X
Multi-Physics AppsInformatics Apps
Complexity.....Communication Libraries
Run-TimesOS Effects
Existing LanguagesNew LanguagesX X
Constraints.....Performance
CostPower
ReliabilityCoolingUsability
RiskSize
Tuesday, August 14, 2012
Sandia Unclassified Unlimited Release
Simulation & CoDesign
•Multi-Level Design Strategy–Allows exploration of multiple
types of abstract machine models
–Multiple Levels of Scale–“Clearinghouse” of ideas
Simulation
Applications Architectures
Abstract Machine ModelsMini Apps
Testbeds
(EC or Classified) Proprietary
Prototypes
Back of the Envelope
Analysis Analytical Models
High Level State Machines
Behavioral Cycle-Approximate, Cycle-AccurateHardware
PrototypePrototypes
Tuesday, August 14, 2012
A (Very) Incomplete List...•Sniper•Graphite•CMPSim•SIMICS•SESC•ASPEN•OMNet++•Phoenix•RAMP•SimpleScalar•BigSim•gem5
•FAST•SST•Manifold•Zesto•DRAMSim•MacSim•TwinCAM•CAPA
•MARSx86•HotSpot•Orion•IntSim•McPAT•PIN• (at this point, I got tired)
Crudeguess Roughidea Causeandeffect Verygoodes4mates Exacthwmodel
100
101
102
103
104
105
106
107
Simula4
onsc
ope/parallelism
SpreadsheetModels
Coarse‐GrainedSoHwareSimula4on
Cycle‐AccurateSimula4on FPGAEmula4on
Infeasible
Actualhardware
HardwarePrototypes
Testbeds
Tuesday, August 14, 2012
Simulation Process
•Problem & Software•Execution vs. Trace vs. Stochastic vs. State Machine
•Model•Emulation vs. Simulation vs. FPGA•Cycle-accurate/Cycle-Approximate vs. CPI=X
•Hardware Representation•Exact & specific to very general
•Hybrid (any or all of the above)
Tuesday, August 14, 2012
Simulation Process
•Problem & Software•Execution vs. Trace vs. Stochastic vs. State Machine
•Model•Emulation vs. Simulation vs. FPGA•Cycle-accurate/Cycle-Approximate vs. CPI=X
•Hardware Representation•Exact & specific to very general
•Hybrid (any or all of the above)
Tuesday, August 14, 2012
What is Wrong
Tuesday, August 14, 2012
Best & Worst Aspects of Today’s Infrastructure
Best: Diversity
Tuesday, August 14, 2012
Best & Worst Aspects of Today’s Infrastructure
Best: Diversity
Worst: DiversityLittle Interoperability / Reuse
Poor MaintainabilityPoor Documentation“Black Box” Effect
Little Trust
Tuesday, August 14, 2012
Software Engineering•Mike Kistler (IBM)
•Ali Saidi & Steve Reihardt (ARM/AMD)
•David Wood
© 2012 IBM Corporation15 NSF CSA Workshop, May 31 - June 1, 2012Mike Kistler, IBM
Implications
!Reuse is essential
!The simulation infrastructure must be modular, with well
designed and carefully constructed interfaces and core
services
–Allow alternative processor / network / component modules to be
plugged in
–Allow caching / reuse of data / prior computations
!The simulation infrastructure must co-exist and interoperate
with a wide variety of supporting tools
!Simulator construction is primarily a software engineering
activity
–Working knowledge of hardware architecture is also valuable but
secondary
© 2012 IBM Corporation18 NSF CSA Workshop, May 31 - June 1, 2012Mike Kistler, IBM
Modular Infrastructure Design
! Interface Design
–Allow composition of models with wide variety of performance /
functionality characteristics
–Allow models to be constructed in a variety of languages
• C, C++, SystemC, Java, Python, etc.
• Also hardware-based models (FGPA)
–Allow models to be built as self-contained components
• Avoid the “build the world” approach
–Support for “binary-only” components
• Allows industry to contribute while protecting IP
!Core Services
–Must provide the “right” abstractions
–Must be high-performing
–Must enable parallel execution© 2012 IBM Corporation
15 NSF CSA Workshop, May 31 - June 1, 2012Mike Kistler, IBM
Implications
!Reuse is essential
!The simulation infrastructure must be modular, with well
designed and carefully constructed interfaces and core
services
–Allow alternative processor / network / component modules to be
plugged in
–Allow caching / reuse of data / prior computations
!The simulation infrastructure must co-exist and interoperate
with a wide variety of supporting tools
!Simulator construction is primarily a software engineering
activity
–Working knowledge of hardware architecture is also valuable but
secondary
Tuesday, August 14, 2012
Sandia Unclassified Unlimited Release
Challenge of Validation•We don’t have enough details for “fully” accurate simulation
–Vendors IP consideration–Complexity vs. Flexibility
•Standards for “how accurate is enough” vary
–Models validated at different levels, with different methodologies
•Assumptions poorly communicated
•Lot of Validation in Isolation
•Fundamental questionIs Simulation Useful?
Component Method Error
DRAMSim RTL Level validation against Micron Cycle
Generic Proc
Simplescalar SPEC92 Validation ~5%
NMSU Comparison vs. existing processors on SPEC <7%
RS Network
Latency/BW against SeaStar 1.2, 2.1 <5%
MacSim Comparison vs. Existing GPUs
Ongoing<10%
expected
Zesto Comparison vs several processors, benchmarks 4-5%
McPAT Comparisons against existing processors
10- 23%
GeM5 Comparisons against existing processors
Ongoing5-20%
Tuesday, August 14, 2012
Where We Want To Be
Tuesday, August 14, 2012
HPCAS Survey•How much do you currently use simulators? how much would you like to?
Very MuchSomewhatNo
Currently Would Like To
Very MuchSomewhat
Probably NotSomeVery Much
•Would your simulator benefit from a common Framework?
•Would there be major technical integration challenges?
Tuesday, August 14, 2012
Ideal World•Common Simulation Environment with dozens of interoperable component models
•Include models for power, energy, temperature, reliability, and cost
•Open non-restrictive license•PARALLEL & Fast•Long Term support
–Documentation–Reuse
•Accepted by community–Validated to known standard with uniform methodology–Easy to replicate results
•Multi-level: Analytical to Behavioral to Cycle-Level to Hardware in the Loop
Tuesday, August 14, 2012
Sandia Unclassified Unlimited Release
SST Simulation Project Overview
Technical Approach
Goals•Become the standard architectural simulation framework for HPC
•Be able to evaluate future systems on DOE workloads
•Use supercomputers to design supercomputers
Consortium•“Best of Breed” simulation suite•Combine Lab, academic, & industry
Status•Includes parallel simulation core, configuration, power models, basic network and processor models, and interface to detailed memory model
•http://code.google.com/p/sst-simulator/
•Parallel•Parallel Discrete Event core with conservative optimization over MPI
•Holistic•Integrated Tech. Models for power•McPAT, Sim-Panalyzer
•Multiscale•Detailed and simple models for processor, network, and memory
•Open•Open Core, non viral, modular
Tuesday, August 14, 2012
Sandia Unclassified Unlimited Release
•Component & System validation is difficult, too many unknowns for high accuracy
•Many ‘consumers’•Is simulation still useful?•Historical Analog
–US Navy would develop “Spring Style” design sketches of warships for use in war games (simulations) and as specification to “vendor” shipyards
–Combination of abstract design & simulation allow users & vendors to perform co-design
•“Software Team”: Naval War College developing tactics
•“Hardware Vendor”: Shipyards building the ship
US Navy “Spring Style”
What Can We Accomplish for Exascale?
Tuesday, August 14, 2012
Sandia Unclassified Unlimited Release
Role of Simulation for Exascale•Required for effective co-design, but need to address more audiences
•It is not our role to simulate exactly what an Exascale System will look like today
•We should be developing “Spring Styles”–Abstract models which inform the hardware designers of our
requirements–Models that the application teams can use to understand how future
architectures will impact their applications
•We need to harness diversity, not fight it•We need long-term support•We need consensus•How do we get there?
Tuesday, August 14, 2012
How do we get there?HPCAS Possible Plan of Action: Standard Simulation Interface
•Workshops to define interface
–Limited mandate
•Define minimal subset of interfaces to promote adoption
•Phased approach (core + optional chapters)
•Three phases
–Define interface sets: priority/required-or-not, then define interfaces
–Define Core interfaces
–Define Component Interfaces
•User Group to guide towards industrial strength
•Encourage multiple existing projects to adopt interface
•Calls for... / Follow-on projects for...
–Implementations of interface
–Porting components to plug into interface
•Need to start with clear, limited, attainable goals•Study examples we already have•Long-Term Multi-Agency Funding•Analogy to MPI
Tuesday, August 14, 2012
Bonus Slides
Tuesday, August 14, 2012
Sandia Unclassified Unlimited Release
Simulation is the Nexus of CoDesign•CoDesign needs a meeting point between applications and architectures
•Full Apps & Archs. are too complex to easily reason about, so we create proxies
•Simulation provides a way...–...to combine and test AMMs
and MiniApps–...for application writers to
test ideas on machines that don’t exist
–...for architects to understand evolving application proxies
Simulation
Applications Architectures
Abstract Machine ModelsMini Apps
Testbeds
(EC or Classified) Proprietary
Prototypes
Tuesday, August 14, 2012
Sandia Unclassified Unlimited Release
Who Can Save Us?•Why Not Just Industry?
–Industry focused on specific products, often no system view–Labs are ‘neutral’ - able to work with everyone, don’t compete with
anyone, able to keep a secret–We can work with Industry and academia to provide a system &
application view they might not have•Why Not Just Academia?
–Labs can provide long-term support, software engineering (i.e. we can work on something even if it doesn’t become a paper)
•Labs bring–Neutrality–Focus: HPC Long-Term Development System-Level–Application Knowledge–Look for big changes - out the box ideas, long term
Tuesday, August 14, 2012
Sandia Unclassified Unlimited Release
Component Validation•Strategy: component validation in parallel with system-level validation
•Current components validated at different levels, with different methodologies
•Validation in isolation
•What is needed–Uniform validation
methodology (apps)–System (multi-component)
level validation–Learn from multi-level
physics apps
Component Method Error
DRAMSim RTL Level validation against Micron Cycle
Generic Proc
Simplescalar SPEC92 Validation ~5%
NMSU Comparison vs. existing processors on SPEC <7%
RS Network
Latency/BW against SeaStar 1.2, 2.1 <5%
MacSim Comparison vs. Existing GPUs
Ongoing<10%
expected
Zesto Comparison vs several processors, benchmarks 4-5%
McPAT Comparisons against existing processors
10- 23%
GeM5 Comparisons against existing processors
Ongoing5-20%
Tuesday, August 14, 2012
Recommended