Albert-Einstein-Institut www.aei-potsdam.mpg.de
Using Supercomputers to Collide Black Holes: Solving Einstein’s Equations on the Grid
• Solving Einstein’s Equations, Black Holes, and Gravitational Wave Astronomy
• Cactus, a new community simulation-code framework
– Toolkit for any PDE system, ray tracing, etc...
– Suite of solvers for Einstein and astrophysics systems
• Recent simulations using Cactus
– Black hole collisions, neutron star collisions
– Collapse of gravitational waves
• Grid computing and remote collaborative tools: what a scientist really wants and needs
Ed Seidel, Albert-Einstein-Institut (MPI Gravitationsphysik) & NCSA/U. of Illinois
What will we be able to do with this new technology?
Einstein’s Equations and Gravitational Waves: This community owes a lot to Einstein...
• Einstein’s General Relativity
– Fundamental theory of physics (gravity)
– Among the most complex equations in physics
• Dozens of coupled, nonlinear hyperbolic-elliptic equations with thousands of terms
– Predicts black holes, gravitational waves, etc.
– We have barely had the capability to solve them after a century
• This is about to change...
• An exciting new field is about to be born: gravitational wave astronomy
– Fundamentally new information about the Universe
– What are gravitational waves? Ripples in spacetime curvature, caused by the motion of matter, causing distances to change
• A last major test of Einstein’s theory: do they exist?
– Eddington: “Gravitational waves propagate at the speed of thought”
– This is about to change...
Multi-Teraflop Computation, AMR, Elliptic-Hyperbolic
Numerical Relativity
Waveforms: we want to compute what actually happens in Nature. We can’t do this now, but this is about to change...
Computational Needs for 3D Numerical Relativity: Can’t fulfill them now, but this is about to change...
• Initial data: 4 coupled nonlinear elliptic equations
• Evolution
– hyperbolic evolution
– coupled with elliptic equations
[Figure: snapshots of an evolution at t=0 and t=100]
A multi-Tflop, multi-TByte machine is essential
• Explicit finite-difference codes
– ~10^4 flops/zone/time step
– ~100 3D arrays
• Require 1000^3 zones or more
– ~1000 GBytes
– Double resolution: 8x memory, 16x flops
• Parallel AMR and I/O essential
• A code that can do this could be useful to other projects (we said this in all our grant proposals)!
– The last few years have been devoted to making it useful across disciplines…
– All tools used for these complex simulations are available to other branches of science and engineering…
• The scientist or engineer wants to think only about the application!
– But what algorithm? What architecture? What parallelism? Etc...
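The memory and flop figures quoted above can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch; the 8-byte double-precision word size is an assumption not stated on the slide.

```python
# Resource estimate for a 3D explicit finite-difference evolution code,
# using the slide's figures: ~10^4 flops/zone/step, ~100 3D arrays,
# 1000^3 zones, and (assumed) 8-byte double-precision words.

def memory_gbytes(zones_per_dim, n_arrays, bytes_per_word=8):
    """Total memory for n_arrays double-precision 3D grid functions."""
    return n_arrays * zones_per_dim**3 * bytes_per_word / 1e9

def flops_per_step(zones_per_dim, flops_per_zone=1e4):
    """Floating-point operations for one full time step."""
    return zones_per_dim**3 * flops_per_zone

base_mem = memory_gbytes(1000, 100)   # 800 GB, i.e. ~1000 GBytes as quoted
base_flops = flops_per_step(1000)     # 10^13 flops per step

# Doubling the resolution: 2^3 = 8x the zones, hence 8x the memory.
# The time step halves too, so a fixed evolution time costs 16x the flops.
mem_ratio = memory_gbytes(2000, 100) / base_mem        # 8.0
flops_ratio = (flops_per_step(2000) * 2) / base_flops  # 16.0
```

This confirms the "8x memory, 16x flops" rule of thumb on the slide.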
Grand Challenges: NSF Black Hole and NASA Neutron Star Projects. Creating the momentum for the future...
• NSF Black Hole Grand Challenge:
– University of Texas (Matzner, Browne, Choptuik)
– NCSA/Illinois/AEI (Seidel, Saylor, Smarr, Shapiro, Saied)
– North Carolina (Evans, York)
– Syracuse (G. Fox)
– Cornell (Teukolsky)
– Pittsburgh (Winicour)
– Penn State (Laguna, Finn)
• NASA Neutron Star Grand Challenge:
– NCSA/Illinois/AEI (Saylor, Seidel, Swesty, Norman)
– Argonne (Foster)
– Washington U (Suen)
– Livermore (Ashby)
– Stony Brook (Lattimer)
NEW! EU Network. The entire community is about to become Grid-enabled...
Cactus: A new concept in community-developed simulation code infrastructure
• Developed in response to the needs of these projects
• Numerical/computational infrastructure to solve PDEs
• Freely available, open-source community framework: the spirit of GNU/Linux
– Many communities contributing to Cactus
• Cactus is divided into “Flesh” (core) and “Thorns” (modules, or collections of subroutines)
– The Flesh, written in C, glues together the various components
– Multilingual: user applications can be in Fortran, C, or C++, with an automated interface between them
• Abstraction: the Cactus Flesh provides an API for virtually all CS-type operations
– Driver functions (storage, communication between processors, etc.)
– Interpolation
– Reduction
– I/O (traditional, socket-based, remote viz and steering…)
– Checkpointing, coordinates
– Etc., etc.…
• Cactus is Grid-enabling application middleware...
How to use Cactus Features
• The application scientist usually concentrates on the application...
– Performance
– Algorithms
– Logically: operations on a grid (structured, or unstructured (coming…))
• ...then takes advantage of the parallel API features enabled by Cactus
– I/O, data streaming, remote visualization/steering, AMR, MPI, checkpointing, Grid computing, etc.…
– Abstraction allows one to switch between different MPI or PVM layers, different I/O layers, etc., with no or minimal changes to the application!
• (Nearly) all architectures supported and autoconfigured
– Common to develop on a laptop (no MPI required), then run on anything: Compaq / SGI Origin 2000 / T3E / Linux clusters and laptops / Hitachi / NEC / HP / Windows NT / SP2 / Sun
• Metacode concept
– Very, very lightweight; not a huge framework (not Microsoft Office)
– User specifies desired code modules in configuration files
– Desired code generated: automatic routine calling sequences, syntax checking, etc.…
– You can actually read the code it creates...
• http://www.cactuscode.org
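The flesh/thorn abstraction described above can be illustrated with a toy sketch. This is not the actual Cactus API; the `Flesh` class, the capability names, and the two "I/O thorns" are all invented here to show the general pattern: the application calls an abstract interface, and which concrete layer provides it is decided at configuration time.

```python
# Toy sketch (not real Cactus code) of the flesh/thorn pattern:
# a registry maps abstract capabilities to interchangeable providers.

class Flesh:
    """Minimal 'flesh': routes capability calls to registered thorns."""
    def __init__(self):
        self._providers = {}

    def register(self, capability, provider):
        self._providers[capability] = provider

    def call(self, capability, *args):
        return self._providers[capability](*args)

# Two interchangeable "I/O thorns" exposing the same interface.
def io_ascii(data):
    return "ascii:" + ",".join(map(str, data))

def io_binary(data):
    return b"".join(int(x).to_bytes(4, "little") for x in data)

flesh = Flesh()
flesh.register("output", io_ascii)        # chosen in a config file, say
result = flesh.call("output", [1, 2, 3])  # application call site

# Swapping the I/O layer requires no change to the application code:
flesh.register("output", io_binary)
result2 = flesh.call("output", [1, 2, 3])
```

The application only ever says "output this data"; whether that means ASCII, binary, HDF5, or a socket stream is a configuration decision, which is the sense in which switching layers needs "no or minimal changes to the application".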
Modularity of Cactus...
[Figure: architecture diagram. Applications (Application 1, Application 2, legacy apps, sub-apps) sit on top of the Cactus Flesh; beneath it, interchangeable abstraction layers (AMR via GrACE etc., MPI layers, I/O layers, unstructured meshes, remote steering, MDS/remote spawning) rest on Globus metacomputing services. The user selects the desired functionality and the code is created.]
Computational Toolkit: provides parallel utilities (thorns) for the computational scientist
• Cactus is a framework, or middleware, for unifying and incorporating code from thorns developed by the community
– Choice of parallel library layers (native MPI, MPICH, MPICH-G(2), LAM, WMPI, PACX, and HPVM)
– Various AMR schemes: Nested Boxes, GrACE, Coming: HLL, Chombo, Samrai, ???
– Parallel I/O (Panda, FlexIO, HDF5, etc…)
– Parameter Parsing
– Elliptic solvers (Petsc, Multigrid, SOR, etc…)
– Visualization Tools, Remote steering tools, etc…
– Globus (metacomputing/resource management)
– Performance analysis tools (Autopilot, PAPI, etc…)
– Remote visualization and steering
– INSERT YOUR CS MODULE HERE...
High performance: Full 3D Einstein Equations solved on NCSA NT Supercluster, Origin 2000, T3E
[Figure: Cactus scaling on the T3E-600. Log-log plot of total Mflop/s vs. number of processors (1 to 1000); measured aggregate throughputs of 192, 760, 5980, and 47,900 Mflop/s.]
• Excellent scaling on many architectures
– Origin up to 256 processors
– T3E up to 1024
– NCSA NT cluster up to 128 processors
• Achieved 142 Gflop/s on a 1024-node T3E-1200 (benchmarked for the NASA NS Grand Challenge)
• Scaling to thousands of processors is possible, and necessary...
• But, of course, we want much more… Grid Computing
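The arithmetic behind the 142 Gflop/s figure is worth a quick check. The nominal per-PE peak of 1200 Mflop/s for the T3E-1200 is an assumption used here only to put the sustained rate in context.

```python
# Per-processor throughput implied by the benchmark quoted above:
# 142 Gflop/s aggregate on a 1024-node T3E-1200.

total_mflops = 142_000        # 142 Gflop/s aggregate
n_pes = 1024

per_pe_mflops = total_mflops / n_pes            # ~138.7 Mflop/s per PE
peak_per_pe = 1200                              # assumed nominal peak, Mflop/s
fraction_of_peak = per_pe_mflops / peak_per_pe  # ~0.12
```

A sustained fraction of peak on the order of 10-15% is unremarkable for a memory-bandwidth-bound finite-difference code; the notable result is that the per-PE rate holds up at 1024 processors.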
Cactus Community Development Projects
[Figure: diagram of the projects and groups around Cactus: AEI Cactus Group (Allen), NSF KDI (Suen), EU Network (Seidel), DFN Gigabit (Seidel), NASA NS Grand Challenge, “Egrid” (NCSA, ANL, SDSC), “GrADS” (Kennedy, Foster, Dongarra, et al.), US Grid Forum, the numerical relativity community (Cornell), astrophysics (Zeus), geophysics (Bosl), crack propagation, chemical engineering (Bishop), SDSS (Szalay), Livermore, Intel, Microsoft, Clemson, DLR, San Diego, GMD, Cornell, and Berkeley.]
The Relativity Community Takes a Step Forward: Great progress, but computers are still too small. Biggest computations ever: 256-processor Origin 2000 at NCSA, 225,000 SUs, 1 TByte of output data in a few weeks.
• Neutron stars
– Developing the capability to do full GR hydrodynamics
– Can now follow full orbits!
• Black holes (prime source for GWs)
– Increasingly complex collisions: now doing full 3D grazing collisions
• Gravitational waves
– Study linear waves as testbeds
– Move on to fully nonlinear waves
– Interesting physics: BH formation in full 3D!
Evolving Pure Gravitational Waves
• Einstein’s equations are nonlinear, so low-amplitude waves just propagate away, but large-amplitude waves may…
– collapse under their own self-gravity and actually form black holes
• Use numerical relativity to probe GR in the highly nonlinear regime
– Do BHs form? Critical phenomena in 3D? Naked singularities?
– … Little is known about generic 3D behavior
• Take a “lump of waves” and evolve it
– Large amplitude: a BH forms!
– Below a critical value: the waves disperse and can evolve “forever” as the system returns to flat space
• We are seeing hints of critical phenomena, known from nonlinear dynamics
• But we need much more power to explore the details and discover new physics...
Comparison: subcritical vs. supercritical solutions
Newman-Penrose Ψ₄ (showing gravitational waves), with the lapse underneath
Subcritical: no BH forms
Supercritical: BH forms!
Numerical Black Hole Evolutions
• Binary IVP: Multiple Wormhole Model, other models
• Black holes are good candidates for gravitational wave astronomy
– ~3 events per year within 200 Mpc
– Very strong sources
– But what are the waveforms?
• GW astronomers want to know!
[Figure: two-black-hole configuration, labeled S1, S2, P1, P2.]
First Step: Full 3D Numerical Evolution. Head-on, equal-mass BH collisions (Misner data) in 3D: 512-node CM-5
Event horizon shown in green; the field representing gravitational waves shown in blue-yellow.
First 3D “grazing collision” of two black holes. A big step: spinning, “orbiting”, unequal-mass BHs merging.
Evolution of waves
Horizon merger
Alcubierre et al. results
384³ grid, 100 GB simulation: the largest production relativity simulation. 256-processor Origin 2000 at NCSA, ~500 GB of output data.
Future view: much of it is here already...
• Scale of computations much larger
– Complexity approaching that of Nature
– Simulations of the Universe and its constituents
• Black holes, neutron stars, supernovae
• Airflow around advanced planes and spacecraft
• Human genome, human behavior
• Teams of computational scientists working together
– Must support efficient, high-level problem description
– Must support collaborative computational science
– Must support all the different languages
• Ubiquitous Grid computing
– Very dynamic simulations, deciding their own future
– Apps find the resources themselves: distributed, spawned, etc...
– Must be tolerant of a dynamic infrastructure (variable networks, processor availability, etc.…)
– Monitored, visualized, and controlled from anywhere, with colleagues anywhere else...
Our Team Requires Grid Technologies, Big Machines for Big Runs
[Figure: world map of collaborating sites: WashU, NCSA, Hong Kong, AEI, ZIB, Thessaloniki, Paris.]
How do we:
• Maintain and develop the code?
• Manage computer resources?
• Carry out and monitor simulations?
What we need and want in simulation science: a higher-level portal providing the following...
• Got an idea? Configuration manager: write a Cactus module, link to other modules, and…
• Find resources
– Where? NCSA, SDSC, Garching, Boeing…???
– How many computers? Distribute the simulation?
– Big jobs: a “Fermilab” at our disposal: we must get it right while the beam is on!
• Launch the simulation
– How do we get the executable there?
– How do we store the data?
– What are the local queue structures and OS idiosyncrasies?
• Monitor the simulation
– Remote visualization, live while running
• Limited bandwidth: compute viz inline with the simulation
• High bandwidth: ship data to be visualized locally
– Visualization server: all privileged users can log in and check status, adjusting if necessary
• Are the parameters screwed up? Very complex!
• Call in an expert colleague… let her watch it too
– Performance: how efficient is my simulation? Should something be adjusted?
• Steer the simulation
– Is memory running low? AMR! What to do? Refine selectively, or acquire additional resources via Globus? Delete unnecessary grids? Performance steering...
• Postprocessing and analysis
– 1 TByte of output at NCSA, research groups in St. Louis and Berlin… how do we deal with this?
A Portal to Computational Science: The Cactus Collaboratory
[Figure: portal workflow on top of the Cactus Computational Toolkit (science, Autopilot, AMR, PETSc, HDF, MPI, GrACE, Globus, remote steering...):
1. User has a science idea...
2. Composes/builds code components with the interface...
3. Selects appropriate resources...
4. Steers the simulation, monitors performance...
5. Collaborators log in to monitor...]
We want to integrate and migrate this technology to the generic user…
Grid-Enabled Cactus (static version)
• Cactus and its ancestor codes have been using Grid infrastructure since 1993 (including the famous I-WAY at SC’95)
• Support for Grid computing was part of the design requirements
• Cactus compiles “out of the box” with Globus (using the globus device of MPICH-G(2))
• The design of Cactus means that applications are unaware of the underlying machine(s) the simulation is running on… applications become trivially Grid-enabled
• Infrastructure thorns (I/O, driver layers) can be enhanced to make most effective use of the underlying Grid architecture
Grid Applications so far...
• SC’93 - SC2000
• Typical scenario
– Find a remote resource (often using multiple computers)
– Launch the job (usually static, tightly coupled)
– Visualize the results (usually inline, fixed)
• Need to go far beyond this
– Make it much, much easier
• Portals, Globus, standards
– Make it much more dynamic, adaptive, fault-tolerant
– Migrate this technology to the general user
Metacomputing the Einstein equations: connecting T3Es in Berlin, Garching, and San Diego
Dynamic Distributed Computing: The static Grid model works only in special cases; we must make apps able to respond to a changing Grid environment...
• Make use of:
– Management tools such as Condor, Globus, etc.
– Service providers (Entropia, etc.)
• The code as an information server and manager
– Scripting thorns (management, launching new jobs, etc.)
– Dynamic use of MDS for finding available resources: the code decides where to go and what to do next!
• Applications
– Portal for simulation launching and management
– Intelligent parameter surveys (Cactus control thorn)
– Spawning off independent jobs, e.g. analysis tasks, to new machines
– Dynamic staging… seeking out and moving to faster/larger/cheaper machines as they become available (the “Cactus worm”)
– Dynamic load balancing (e.g. inhomogeneous loads, multiple grids)
– Etc.… many new computing paradigms
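The "Cactus worm" cycle described above (run, checkpoint, find the next resource, migrate, restart) can be sketched in a few lines. This is a hypothetical illustration, not the real implementation: the real system queries the Globus MDS, while here the resource directory, the site names, and the greedy selection rule are all invented.

```python
# Toy sketch of a worm-style migration cycle. A real worm would
# checkpoint its state and transfer it before every hop; here we only
# model the resource-selection decision.

def pick_next_site(sites, current):
    """Choose the candidate site with the most free processors
    (a stand-in for a real MDS query and scheduling policy)."""
    candidates = [s for s in sites if s["name"] != current]
    return max(candidates, key=lambda s: s["free_procs"])

def worm_itinerary(sites, start, hops):
    """Simulate `hops` migrations and return the visited sites."""
    route = [start]
    current = start
    for _ in range(hops):
        # checkpoint() and state transfer would happen here
        current = pick_next_site(sites, current)["name"]
        route.append(current)
    return route

# Invented resource directory for illustration only.
sites = [
    {"name": "AEI", "free_procs": 32},
    {"name": "NCSA", "free_procs": 256},
    {"name": "ZIB", "free_procs": 128},
]
route = worm_itinerary(sites, "AEI", 2)   # ["AEI", "NCSA", "ZIB"]
```

The point of the pattern is that the migration decision lives in the code itself (a "scripting thorn" in Cactus terms), not in a human operator or a static batch script.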
Remote Visualization and Steering
[Figure: a running simulation streams remote viz data over HTTP and streaming HDF5 to viz clients such as Amira and OpenDX; any steerable parameter (physics, algorithms, performance) can be changed live.]
• Isosurfaces and geodesics computed inline with the simulation
• Only geometry is sent across the network
• Arbitrary grid functions via streaming HDF5
Remote Offline Visualization
[Figure: a viz client (Amira) in Berlin reads remote data through an HDF5 VFD and the Globus DataGrid, from DPSS, FTP, and Web servers; 4 TB are distributed across NCSA/ANL/Garching. Downsampling and hyperslab selection ensure that only what is needed crosses the network.]
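The two server-side reductions mentioned above can be shown on a 1D signal for simplicity; the real system applies them to HDF5 datasets via hyperslab selections, but the idea is the same. The function names here are illustrative, not part of any real API.

```python
# Sketch of the two reductions that keep remote-visualization traffic
# small: a hyperslab (contiguous sub-block) and downsampling (strided
# selection). Shown on a 1D list; HDF5 applies the same selections
# to N-dimensional datasets on the server side.

def hyperslab(data, start, count):
    """Select a contiguous sub-block, as an HDF5 hyperslab would."""
    return data[start:start + count]

def downsample(data, stride):
    """Keep every `stride`-th sample."""
    return data[::stride]

full = list(range(20))           # stand-in for a large remote dataset
sub = hyperslab(full, 5, 4)      # [5, 6, 7, 8]
coarse = downsample(full, 5)     # [0, 5, 10, 15]
```

With a 4 TB dataset distributed across three sites, shipping a strided, cropped selection instead of the full grid function is the difference between interactive and impossible.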
Grand Picture
[Figure: simulations launched from the Cactus Portal; Grid-enabled Cactus runs on distributed machines (Origin at NCSA, T3E at Garching); remote viz in St. Louis; remote viz and steering from Berlin; remote steering and monitoring from an airport; viz of data from previous simulations in an SF café; tied together by Globus, HTTP, HDF5, isosurfacing, and DataGrid/DPSS downsampling.]
Egrid and Grid Forum Activities
• Grid Forum: http://www.gridforum.org
– Developed in the US over the last 18 months
– ~200 members (?)
– Meets every 3-4 months
– Many working groups discussing Grid software, standards, techniques, scheduling, applications, etc.
• Egrid: http://www.egrid.org
– European initiative, now 6 months old
– About two dozen sites in Europe
– Similar goals, but with a European identity
• Next meeting: Oct 15-17 in Boston
• We hope to enlist many more application groups to drive Grid development
– Cactus Grid Application Development Toolkit
Present Testbeds (a sampling…)
• Cactus Virtual Machine Room
– A small version of the Alliance VMR with European sites (NCSA, ANL, UNM, AEI, ZIB)
– The portal gives users access to all machines, queues, etc., without knowing local passwords, batch systems, file systems, OSes, etc...
– Developed in collaboration with the NSF KDI project, AEI, and DFN-Verein
– Built on Globus services
– Will copy and develop an Egrid version, hopefully tomorrow...
• Egrid demos
– Developing the Cactus worm demo for SC2000!
– A Cactus simulation runs, queries MDS, finds the next resource, migrates itself to the next site, runs, and continues around Europe, with continuous remote viz and control...
• Big distributed simulation
– Old static model: harness as many supercomputers as possible
– Go for a Tflop, even with a tightly coupled simulation distributed across continents
– Developing techniques to make simulations bandwidth- and latency-tolerant...
Further details...
• Cactus
– http://www.cactuscode.org
– http://www.computer.org/computer/articles/einstein_1299_1.htm
• Movies, research overview (needs major updating)– http://jean-luc.ncsa.uiuc.edu
• Simulation Collaboratory/Portal Work: – http://wugrav.wustl.edu/ASC/mainFrame.html
• Remote Steering, high speed networking– http://www.zib.de/Visual/projects/TIKSL/
– http://jean-luc.ncsa.uiuc.edu/Projects/Gigabit/
• EU Astrophysics Network– http://www.aei-potsdam.mpg.de/research/astro/eu_network/index.html