Computer Science Issues for Large Scale Applications
Gabrielle Allen
Department of Computer Science
Center for Computation & Technology
Louisiana State University
http://www.cct.lsu.edu/~gallen
2
Summary
• My research at LSU
  – Theoretical: algorithms, abstractions, scalability, interoperability
  – Applications: astrophysics, coastal, biology, CFD, petroleum, …
  – Real implementations: Cactus, GAT, SAGA
• Building infrastructure
  – Creating and building CCT, LONI, etc.
• Education & training
  – IGERT interdisciplinary program
  – Student programs at CCT
• See web page http://www.cct.lsu.edu/~gallen for publications/details.
3
Outline
• Motivating scenario from Astrophysics
• Computational requirements
• Infrastructure
  – People, Hardware, Collaboration
• Frameworks for HPC and Grids
  – Cactus Code
  – Grid Application Toolkit
  – Grid portals, remote visualization, data archives
• Coastal modeling and other applications
• Education
4
Large Scale Computational Needs
5
• Cosmology
• Black Holes & Neutron Stars
• Supernovae
• Astronomical Databases
• Gravitational Wave Data Analysis
• Drive HEC & Grids
GDA, T. Goodale, G. Lanfermann, T. Radke, E. Seidel, W. Benger, H. Hege, A. Merzky, J. Masso and J. Shalf, Solving Einstein's Equations on Supercomputers, IEEE Computer, 32, (1999).
GDA and Ed Seidel, Collaborative Science: Astrophysics Requirements and Experiences, in The Grid: Blueprint for a New Computing Infrastructure (2nd Edition), Ed: Ian Foster and Carl Kesselman, p. 201-213, (2004).
Astrophysics Challenge Problems
6
Gravitational Wave Physics
Observations, Models, Analysis & Insight (interlinked, from diagram):
– Complex algorithms, data structures, large scale needs, community needs to share results
– Real time and historical data analysis, algorithms for template matching, data locations, metadata, etc.
– Large scale data needs, networks, on-demand models to steer optical telescopes
7
• Einstein Equations: Gµν(γij) = 8πTµν
  – Constraint equations
    • 4 coupled elliptic equations for initial data and beyond
    • Familiar from Newton: ∇²φ = 4πρ
  – 12 fully 2nd order evolution equations for γij, Kij (∼ ∂γij/∂t)
    • Like a “wave equation”: ∂²φ/∂t² − ∇²φ = Source(φ, φ², φ′)
    • Thousands of terms in RHS (automatic code generation)
  – 4 gauge conditions
    • Elliptic, hyperbolic, whatever …
  – GR hydrodynamics for Tµν
• Analytically can only study trivial solutions or approximations… full numerical 3D models needed
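For orientation, a schematic of the standard 3+1 (ADM-type) split that these bullets refer to, written in LaTeX; gauge and matter terms are abbreviated, and the evolution systems actually used in production (e.g. the formulations in the testbed paper below) differ in detail:

    % Constraints (elliptic, Hamiltonian and momentum):
    R + K^2 - K_{ij} K^{ij} = 16\pi\rho , \qquad
    \nabla_j \left( K^{ij} - \gamma^{ij} K \right) = 8\pi S^i
    % Evolution equations for the 3-metric and extrinsic curvature:
    \partial_t \gamma_{ij} = -2\alpha K_{ij} + \nabla_i \beta_j + \nabla_j \beta_i
    \partial_t K_{ij} = -\nabla_i \nabla_j \alpha
        + \alpha \left( R_{ij} + K K_{ij} - 2 K_{ik} K^{k}{}_{j} \right)
        + \text{(shift and matter terms)}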
Solving Einstein’s Equation
PhD Thesis: Numerical Techniques for Solving Einstein's Equations, Cardiff (1993).
M. Alcubierre, GDA, B. Bruegmann, E. Seidel and W. Suen, Towards an understanding of the stability properties of the 3+1 evolution equations in general relativity, Physical Review D62, 124011, (2000).
M. Alcubierre, GDA, C. Bona, D. Fiske, T. Goodale, F. S. Guzman, I. Hawke, S. Hawley, S. Husa, M. Koppitz, C. Lechner, D. Pollney, D. Rideout, M. Salgado, E. Schnetter, E. Seidel, H. Shinkai, D. Shoemaker, B. Szilagyi, R. Takahashi, J. Winicour, Towards standard testbeds for numerical relativity, Class. Quantum Grav., 21(2), (2004).
8
Single Term (Axisymmetry)
9
• Previous unigrid runs scaled to thousands of processors
• Current runs use FMR (around 6 levels, ~80x80x80 nested grids)
• Scale to 64 procs
• Take 2 weeks to run
• (Boundary conditions, gauge conditions, initial data, elliptic solves)
Solving Einstein's Equations
(Movie from P. Diener, CCT)
10
• Many scientific/engineering components
  – Physics, Mathematics
  – Astrophysics, CFD, engineering, ...
• Many numerical algorithm components
  – Finite difference, finite volume, spectral methods
  – Structured or unstructured meshes
  – Elliptic equations: multigrid, Krylov subspace, ...
  – Mesh refinement: fixed, adaptive
  – Multipatch and multimodel
• Many different computational components
  – Parallelism (HPF, MPI, PVM, ???)
  – Architecture (MPP, DSM, Vector, PC Clusters, FPGA, ...)
  – I/O (generate TBs/simulation, checkpointing, ...)
  – Visualization
• New technologies
  – Distributed/Grid computing
  – High speed networks
  – Interactive steering, data archives, data mining, tangibles
• Cuts across many disciplines, areas of CS …
My interests: How to define the right abstractions to bring all these together in a unified scalable framework to enable science?
Key Challenges
11
Computer Science Issues
Research areas driven by numerical relativity scenarios:
• Highly scalable algorithms
  – Petascale computing (ORNL, NERSC, UNM, AEI, NCSA use our numrel benchmarks, scenarios)
• Storage & networks
• Visualization (interactive, remote, AMR)
• Software engineering (code generation, verification & validation, interfaces, interoperability, etc.)
• Grid computing (e.g. GridLab driver)
12
• Understanding Gamma Ray Bursts (GRBs)
  – Rapidly rotating star collapse to BH
  – Binary NS coalescence
  – Needed for Grav Wave Astronomy
• Better physics: fully relativistic, MHD, neutrino, radiation transport
• Computationally:
  – High order (>4th) adaptive finite difference schemes
  – 16 levels of refinement
  – Several weeks with 1 PFLOP/s sustained performance
  – (At least 4 PFLOP/s peak, >100K procs)
  – 100 TB memory
  – PBytes storage for full analysis of output
• Part of NCSA/LSU $200M petascale prop
Computer science needs (from diagram): Parallelism; Parallel/Fast IO, Data Management, Visualization; Optimization; Interactive monitoring, steering, visualization, portals; Checkpointing
Petascale Driver
13
Other Application Areas (Funded)
• Computational Chemistry
  – PI for NSF GridChem: Cyberinfrastructure for Computational Chemistry ($2.7M, LSU, UK, OSC, NCSA, TACC)
  – R. Dooley, K. Milfield, C. Guiang, S. Parmidighantum, GDA, From Proposal to Production: Lessons Learned Developing the Computational Chemistry Grid Cyberinfrastructure, Journal of Grid Computing, Jan 2006, Pages 1-14, 2006
• Petroleum Engineering
  – PI for DOE/BOR UCOMS: Ubiquitous Computing & Monitoring System for Discovery & Management of Energy Resources ($2.4M, ULL, LSU, SUBR)
  – Z. Lei, D. Huang, A. Kulshrestha, S. Pena, GDA, X. Li, C. White, R. Duff, J. R. Smith, Subhash Kalla, ResGrid: A Grid-aware Toolkit For Reservoir Uncertainty Analysis, in proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid06), May 16-19, 2006, Singapore, 2006.
• Computational Fluid Dynamics
  – Co-PI for NSF IGERT: IGERT on Multi-scale Computations of Fluid Dynamics ($3.2M, LSU)
  – Cactus CFD Toolkit
• Biological Computing
  – Steering committee for NIH INBRE Louisiana Biomedical Research Network (LBRN) ($17M, LSU + All LA)
  – Cactus Biology Toolkit
14
Other Application Areas (Funded)
• Coastal and Environmental Modeling
  – PI for NOAA/ONR SURA SCOOP: SURA Coastal Ocean Observing Program (~$7M, SURA, LSU, UAH, TAMU, BIO, UNC, RENCI, VIMS, UFL, Miami, GOMOOS)
  – PI for NSF DDDAS: DynaCode: A General DDDAS Framework with Coast and Environment Modeling Applications ($220K, LSU, ND)
  – Army Corps of Engineers, Lake Pontchartrain Hurricane Forecast System (UNC, LSU)
• High Speed Networks
  – PI for NSF CNS EnLIGHTened Computing: Highly-dynamic Grid E-science Applications Driving Adaptive Optical Control Plane and Compute Resources ($50K, MCNC, UNC, LSU, CISCO, AT&T)
  – PI for EU Phosphorus: Lambda User Controlled Infrastructure For European Research (3,000,000 Euro, many sites in Europe)
  – A. Hutanu, GDA, S. D. Beck, P. Holub, H. Kaiser, A. Kulshrestha, M. Liska, J. MacLaren, L. Matyska, R. Paruchuri, S. Prohaska, E. Seidel, B. Ullmer, S. Venkataraman, Distributed and collaborative visualization of large data sets using high-speed networks, Future Generation Computer Systems, Volume 22, Issue 8, p 1004-1010, 2006
• Programming Models
  – PI for DOE Center for Programming Models for Scalable Parallel Computing ($176,745 to LSU)
• Data
  – Co-PI for NSF MRI: Development of PetaShare: A Distributed Data Archival, Analysis and Visualization System for Data Intensive Collaborative Research ($993K)
15
Building Infrastructure
16
• To solve complex large scale problems we need multidisciplinary teams and a high end environment
• Work at the Max Planck Institute for Gravitational Physics had outgrown the center
• Governor Foster's 20/20 initiative to advance Louisiana economy through I.T. initiatives
  – $9M/yr at LSU called LSU CAPITAL
  – Wrote plan for "CCT" with Seidel, Towns
  – Arrived August 2003, LSU CAPITAL was then a couple of offices in Frey, a few staff, SuperMike, campus expectations
  – Built everything at the center with Seidel, J. Williams, others
  – Recruited many faculty & researchers
  – Now have over 100 faculty, researchers, staff, students; competitive for big federal grants (LBRN, Big Iron, NSF EPSCOR)
  – Plans closely tied with CS
GDA, E. Seidel, J. Towns, LSU CAPITAL: Immediate Plans, (May, 2003).
Center for Computation & Technology
17
32 faculty from 12 departments
Computer Science: Sitharama Iyengar, Edward Seidel, Thomas Sterling, Gabrielle Allen (Adjunct Physics), Bijaya Karki, Tevfik Kosar, Seung-Jong Park, Ian Taylor (Visiting Faculty), Brygg Ullmer
Mathematics: Susanne Brenner, Burak Aksoylu, Paul Saylor (Visiting Faculty)
Petroleum Engineering: Chris White (Associate Faculty)
Mechanical Engineering: Sumanta Acharya (Associate Faculty)
Music: Stephen D. Beck
Mass Communications: Lance Porter
Electrical Engineering: J. Ramanujam, Dan Katz (Research Faculty), Bingqing Wei, Theda Daniels-Race
Oceanography & Coastal Science: Robert Twilley (Associate Faculty)
Physics: Joel Tohline (Associate Faculty), Jorge Pullin (Two year), Edward Seidel (Also CS), Manuel Tiglio, Peter Diener (Research Faculty)
Finance: Gary Sanger
ISDS: Rudy Hirschheim, Edward Watson, Sonja Wiley-Patton
Accounting: Ron Daigle, David Hayes
CCT Interdisciplinary Faculty
18
CCT Focus Areas
Core Computational Science: Gabrielle Allen
  – Scientific Computing (Katz, ECE)
  – Distributed Systems (Allen, CS)
  – Computational Mathematics (Brenner, Math)
  – Visualization (Ullmer, CS)
Coast to Cosmos (C2C): Jorge Pullin
  – Numerical Relativity & Astrophysics (Tohline, Physics)
  – Engineering Applications (White, PetEng)
  – Coast & Environment (Twilley, SCE)
Human & Social World (HSW): Steve Beck
  – Emerging Applications (Beck, Music)
  – Technology Adoption (Hirschheim, ISDS)
Material World: Jorge Pullin
  – Materials / Chemistry / Biology (Jha, CCT)
Projects, Programs, Initiatives
Ed Seidel, GDA, Stephen Beck, Rudy Hirschheim, Jorge Pullin, Joel Tohline, Joel Williams, CCT Faculty Plan 2006, (2006)
Assistant Director for Computing Applications: G. Allen
19
State initiative ($40M) to support research (2004):
• 40 Gbps optical network + NLR
• Connects 7 sites
• Now around 100 TFs at sites
• LIGO/CAMD
New possibilities:
• Dynamical provisioning and scheduling of network bandwidth
• Network dependent scenarios
• "EnLIGHTened Computing" (NSF)
Organizer for LONI Forum (2004)
GDA, Charles MacMahon, Ed Seidel, Tom Tierney, LONI Concept Paper, (2003)
GDA, Jarek Nabrzyski, Ed Seidel, Expression of Intent for a European Distributed Supercomputer Network, (2002)
Louisiana Optical Network (LONI)
20
International Cooperation
• Demonstrated automated simultaneous in-advance reservation of network bandwidth and computing resources between US/Japan (GLIF meeting, Tokyo, Sept 2006)
• World's first inter-domain coordination of resource managers for in-advance reservation (CCT HARC Coscheduler)
• Resource managers have different infrastructure and are independently developed
NSF EnLIGHTened Computing. LSU PIs: GDA, Park, Seidel
21
Cactus black hole simulations spawned apparent horizon finding tasks across the grid.
Supercomputing 2002: Prizes for most heterogeneous and most distributed testbed
Global Grid Testbed Collaboration
• 5 continents and over 14 countries
• Around 70 machines, 7500+ processors
• Many hardware types, including PS2, IA32, IA64, MIPS
• Many OSs, including Linux, Irix, AIX, OSF, Tru64, Solaris, Hitachi
• Many organizations: DOE, NSF, MPG, universities, vendors
• All ran same Grid infrastructure, and used for different applications
22
Frameworks for HPC and Grids
23
GDA, T. Goodale and E. Seidel, The Cactus Computational Collaboratory: Enabling Technologies for Relativistic Astrophysics, and a Toolkit for Solving PDEs by Communities in Science and Engineering, Proceedings of 7th Symposium on the Frontiers of Massively Parallel Computation (Frontiers99), IEEE, New York (1999).
Programming Frameworks
• Complex apps need software development tools
• Separate computational physics from infrastructure and HPC issues:
  – Make system, portability, parallelization, I/O, interpolators, elliptic solvers, check-pointing, optimization, petascale, Grids, new paradigms, …
• Collaboration, enable code sharing and reuse
• Community building
• Leveraging other fields and tools
24
PITAC Report (2005)
Reports there is a "crisis in software", the US is at a "tipping point", and recommends initiatives to:
• create a new generation of well-engineered, scalable, easy-to-use software suitable for computational science that can reduce the complexity and time to solution for today's challenging scientific applications and can create accurate simulations that answer new questions;
• design, prototype, and evaluate new hardware architectures that can deliver larger fractions of peak hardware performance on scientific applications; and
• focus on sensor- and data-intensive computational science applications in light of the explosive growth of data.
25
26
Lead of the Cactus Code
• Freely available, modular, portable and manageable environment for collaboratively developing parallel, high-performance multi-dimensional simulations (component-based)
• Developed for Numerical Relativity, but now a general framework for parallel computing (CFD, astro, climate, chem eng, quantum gravity, …)
• Finite difference, AMR, FE/FV, multipatch
• Active user and developer communities, main development now at LSU and AEI
• Science driven design issues
• Open source, documentation, etc.
• Over $10M funding, will be 10 yrs old
Funding from NSF, DOE, DFN, DFG, EU, Microsoft, MPG, LSU
T. Goodale, GDA, G. Lanfermann, J. Masso, T. Radke, E. Seidel, J. Shalf, The Cactus Framework and Toolkit: Design and Applications, Vector and Parallel Processing --- VECPAR'2002, 5th International Conference, Springer, (2003).
GDA, T.Goodale, G.Lanfermann, T.Radke, D.Rideout, J.Thornburg, Cactus Users Guide, (2005).
27
Cactus Structure
Core “Flesh” (ANSI C): parameters, grid variables, error handling, scheduling, extensible APIs, make system
Plug-In “Thorns” (components, Fortran/C/C++): driver, input/output, interpolation, SOR solver, coordinates, boundary conditions, black holes, equations of state, remote steering, wave evolvers, multigrid, …
  – Your Physics !!
  – Computational Tools !!
28
Cactus Flesh
• Written in ANSI C
• Independent of all thorns
• Contains flexible build system, parameter parsing, rule based scheduler, …
• After initialization acts as utility/service library which thorns call for information or to request some action (e.g. parameter steering)
• Contains abstracted APIs for:
  – Parallel operations, IO and checkpointing, reduction operations, interpolation operations, timers (APIs designed for science needs)
• Functionality provided by (swappable) thorns
29
• Written in C, C++, Fortran 77, Fortran 90, (Java, Perl, Python)
• Separate swappable libraries encapsulating some functionality ("implementations")
• Each thorn contains configuration files which specify its interface with the Flesh and other thorns
• Configuration files are parsed into a set of routines which provide thorn information
  – Scheduling, variables, functions, parameters, configuration
• Configuration files have a well defined language, used as basis for framework interoperability
Cactus Thorns (Components)
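To make the Flesh/thorn split concrete, a small self-contained C++ sketch of the pattern (this is not the real Cactus CCTK API; the Flesh struct, register_routine and the "wavetoy" thorn names are invented for illustration): the core owns parameters, grid variables and the schedule, while plug-in components only contribute routines bound to schedule points.

    // Illustrative flesh/thorn pattern (invented names; not the Cactus CCTK API).
    #include <cstdio>
    #include <functional>
    #include <map>
    #include <string>
    #include <vector>

    // The "flesh": owns parameters, grid variables and the rule-based schedule.
    struct Flesh {
        std::map<std::string, double> parameters;
        std::map<std::string, std::vector<double>> grid_variables;
        std::map<std::string, std::vector<std::function<void(Flesh&)>>> schedule;

        void register_routine(const std::string& bin, std::function<void(Flesh&)> f) {
            schedule[bin].push_back(std::move(f));
        }
        void run_bin(const std::string& bin) {
            for (auto& f : schedule[bin]) f(*this);
        }
    };

    // A "thorn": contributes variables, parameters and scheduled routines only.
    namespace wavetoy {
        void startup(Flesh& cctk) {
            cctk.parameters["wavetoy::amplitude"] = 1.0;
            cctk.grid_variables["wavetoy::phi"].assign(100, 0.0);
            std::printf("wavetoy: registered grid function phi\n");
        }
        void evolve(Flesh& cctk) {
            auto& phi = cctk.grid_variables["wavetoy::phi"];
            phi[50] += cctk.parameters["wavetoy::amplitude"];  // stand-in for a real update
        }
    }

    int main() {
        Flesh cctk;
        // "Activating the thorn": its routines are bound to schedule bins.
        cctk.register_routine("STARTUP", wavetoy::startup);
        cctk.register_routine("EVOL", wavetoy::evolve);

        cctk.run_bin("STARTUP");
        for (int it = 0; it < 3; ++it) cctk.run_bin("EVOL");
        std::printf("phi[50] = %g\n", cctk.grid_variables["wavetoy::phi"][50]);
    }

In the real framework the thorn's CCL configuration files, not C++ calls, declare its variables, parameters and schedule entries; the sketch only shows why the flesh can remain independent of any particular thorn.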
30
Numerical Methods
Basis for scalable algorithm development!
• Most application codes using Cactus use finite differences on structured meshes
• Parallel driver thorns: unigrid (PUGH), FMR (Carpet), AMR (PARAMESH, Grace, SAMRAI), finite volume/element on structured meshes
• Method of lines thorn
• Elliptic solver interface (PETSc, SOR, Multigrid, Trilinos)
• Multipatch with Carpet driver
• Unstructured mesh support being added
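As an illustration of the method-of-lines idea mentioned above (finite differencing in space produces a right-hand side du/dt = L(u), which a generic, swappable time integrator advances), a minimal self-contained C++ sketch; this is not the Cactus MoL thorn's interface, just the pattern it encapsulates, shown for the 1D heat equation:

    // Method-of-lines sketch (illustrative; not the Cactus MoL thorn interface).
    #include <cstdio>
    #include <vector>

    using State = std::vector<double>;

    // RHS for the 1D heat equation u_t = u_xx: second-order central differences,
    // boundaries held fixed at zero.
    State heat_rhs(const State& u, double dx) {
        State dudt(u.size(), 0.0);
        for (size_t i = 1; i + 1 < u.size(); ++i)
            dudt[i] = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / (dx * dx);
        return dudt;
    }

    // Generic RK2 (midpoint) step: knows nothing about the spatial discretization.
    template <class RHS>
    State rk2_step(const State& u, double dt, RHS rhs) {
        const State k1 = rhs(u);
        State mid(u.size());
        for (size_t i = 0; i < u.size(); ++i) mid[i] = u[i] + 0.5 * dt * k1[i];
        const State k2 = rhs(mid);
        State next(u.size());
        for (size_t i = 0; i < u.size(); ++i) next[i] = u[i] + dt * k2[i];
        return next;
    }

    int main() {
        const double dx = 1.0, dt = 0.25;   // dt <= dx*dx/2 keeps this explicit scheme stable
        State u(101, 0.0);
        u[50] = 1.0;                        // initial spike diffuses outwards
        for (int step = 0; step < 400; ++step)
            u = rk2_step(u, dt, [&](const State& v) { return heat_rhs(v, dx); });
        std::printf("u[50]=%.4f u[55]=%.4f\n", u[50], u[55]);
    }

The same integrator would advance any RHS handed to it, which is exactly what lets one thorn own time integration while physics thorns only supply spatial right-hand sides.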
31
Software Connections
Work with developers of:
• Grid: Globus, MPICH-G2, GridLab, GAT, GridSphere, MicroGrid, GRADS, D-Grid
• Solvers: PETSc, Trilinos
• Driver/AMR: MPI, Carpet, Grace, SAMRAI, PARAMESH
• I/O: FlexIO, HDF5, NetCDF, PandaIO
• Viz: Amira, OpenDX, Visapult
• Other: PAPI, Triana, MOAB, Zoltan, Babel
32
Users and Toolkits
• Many numerical relativity groups around the world
  – Over 100 publications
  – Maya, Whisky, Lazarus, …
• Others: CFD, Quantum Gravity, Chemical Engineering, Crack Propagation, Environmental modeling, Plasma physics, Computer science, Astrophysics, Cosmology, [Biology/Materials]
• Toolkits
  – Cactus Computational Toolkit
  – Einstein Toolkit
  – CFD Toolkit
  – (Biology Toolkit)
• Teaching
  – Over 30 student theses/diplomas
33
• Designed for portability: iPAQ, PS2, Xbox to Itanium '99, EarthSim, BG/L, …
• Benchmarking for CCT NSF "Big Iron" bid
  – Cactus now key application for PetaScale computing
  – DOE/NSF 4yr time scale, 500K(?) processors
  – 33K procs on BG/L, but now new complex data structures (AMR)
• Drives research issues for perf. prediction, scalability, hardware design
First Itanium App (1999)
BG/L (2006)
Performance and Optimization
34
Cactus Grid Scenarios
• Early Globus experiments
• Distributed MPI
• Dynamic, adaptive apps …
• SC2000: "Cactus Worm" thorn, turned any Cactus application into an intelligent self-migrating creature
• SC2001: Spawner thorn, any analysis method can be sent to another resource for computation
• SC2002: Cactus TaskFarm, designed for distributing MPI apps on the Grid
GDA, Tom Goodale, M. Russell, E. Seidel and J. Shalf, Classifying and Enabling Grid Applications, in Grid Computing: Making the Global Infrastructure a Reality, Ed: F. Berman, G. Fox, A. J. G. Hey, John Wiley, (2003).
GDA, W. Benger, T. Goodale, C. Hege, G. Lanfermann, A. Merzky, T. Radke, E. Seidel and J. Shalf, Cactus Tools for Grid Applications, Cluster Computing, Volume 4, Issue 3, Pages 179-188, (2001).
GDA, W. Benger, T. Goodale, H. Hege, G. Lanfermann, A. Merzky, T. Radke, E. Seidel and J. Shalf, The Cactus Code: A Problem Solving Environment for the Grid, Proceedings of the 9th IEEE International Symposium on High Performance Distributed Computing (HPDC9), August 2000, Pittsburgh, IEEE Computer Society, (2000).
35
Cactus Prizes
• HPC "Most Stellar" Challenge Award (SC1999)
• Gordon Bell Prize for Supercomputing (SC2001)
  – Supporting Efficient Execution in Heterogeneous Distributed Computing Environments with Cactus and Globus (Dramlitsch, Allen, Seidel, Foster, Toonen, Karonis, Ripeanu)
• High-performance bandwidth challenge (SC2002) [with Berkeley/NCSA, 16.8 Gbps]
  – Highest Performing Application: Wide Area Distributed Simulations Using Cactus, Globus and Visapult
• HPC Challenge Awards (SC2002)
  – Most Geographically Distributed Application and Most Heterogeneous Set of Platforms
• Other Cactus Related Prizes
  – Heinz Billing Prize for Scientific Computing (1998)
  – IEEE Sidney Fernbach (2006)
36
Current Research Issues
• Algorithms for scalable multiscale, multimodel
• Integration with grid middleware, e.g. interaction with Cactus through web services, dynamic grid scenarios
• Framework interoperability and abstraction, generalized component descriptions
• Supporting multiscale, multimodel scenarios
• New application areas
• Remote visualization, networks, tangibles
• Methodologies for petascale computing
• "XiRel": new relativity services, e.g. metadata, domain description language, community analysis service, automated optimized code generation
37
Distributed run (diagram): SDSC IBM SP (1024 procs, 5x12x17 = 1020) and NCSA Origin Array (256+128+128, 5x12x(4+2+2) = 480), connected by an OC-12 line (but only 2.5 MB/sec); GigE: 100 MB/sec.
Cactus + MPICH-G2: communications dynamically adapt to application and environment. Any Cactus application. Scaling: 15% -> 85%. "Gordon Bell Prize" (SC01).
GDA, T. Dramlitsch, I. Foster, N. Karonis, M. Ripeanu, E. Seidel and B. Toonen, Supporting efficient execution in heterogeneous distributed computing environments with Cactus and Globus, Proceedings of Supercomputing 2001, Denver, USA, (2001).
Grids 1: Dynamic, Adaptive, Distributed Computing
38
Grids 2: New Dynamic Scenarios
• Cactus Worm (SC2000)
  – Many European partners keen to do something
  – Cactus simulation starts, launched from portal
  – Migrates itself to another site
  – Registers new location
  – User tracks/steers, using HTTP, streaming data, etc. …
• These EGrid demonstrations led to GGF and GridLab
GDA, et al, Early Experiences with the Egrid Testbed, IEEE International Symposium on Cluster Computing and the Grid, Brisbane, Australia, May 16-18, pages 130-137, (2001).
GDA, D. Angulo, I. Foster, G. Lanfermann, C. Liu, T. Radke, E. Seidel and J. Shalf, The Cactus Worm: Experiments with Dynamic Resource Discovery and Allocation in a Grid Environment, International Journal of High Performance Computing Applications, 15(4), (2001).
39
PI for GridLab
Partners: PSNC, AEI, ZIB, MASARYK, SZTAKI, ISUFI, Cardiff, NTUA, VU, Chicago, ISI, Wisconsin, Sun, Compaq
• EU 5th Framework (2002-2005)
• 6 Million Euros
• Focused on tools for applications
  – Numerical Relativity (Cactus)
  – Gravitational Wave Data Analysis (Triana)
  – New dynamic scenarios
• 12 Work Packages including:
  – Grid services
  – Applications and Testbed
  – Grid Application Toolkit (GAT)
  – Grid Portals
  – Mobile users
• Reviewed as one of most successful 5th Framework projects --- came directly from Cactus work
E. Seidel, GDA, A. Merzky and J. Nabrzyski, GridLab - A Grid Application Toolkit and Testbed, Future Generation Computer Systems, 18, Issue 8, (2002).
GDA, K. Davis, N. Dolkas, N. D. Doulamis, T. Goodale, T. Kielmann, A. Merzky, J. Nabrzyski, J. Pukacki, T. Radke, Enabling applications on the grid: A Gridlab overview, International Journal of High Performance Computing Applications, 17, Number 4, (2003).
40
• Abstract programming interface between applications and Grid services
• Designed for applications (move file, run remote task, migrate, write to remote file)
Adaptors (from diagram):
  – Default Adaptors: basic functionality, will work on a single isolated machine (e.g. cp, fork/exec)
  – Globus Adaptors: core Globus functionality: GRAM, MDS, GT-RLS, GridFTP
  – GridLab Adaptors: GRMS, Mercury, Delphoi, iGrid
  – Under development: Scp, DRMAA, Condor, SGE, SRB, Curl, RFT
GDA, K. Davis, T. Goodale, A. Hutanu, H. Kaiser, T. Kielmann, A. Merzky, R. Van Nieuwpoort, A. Reinefeld, F. Schintke, T. Schuett, E. Seidel and B. Ullmer, The Grid Application Toolkit: Towards Generic and Easy Application Programming Interfaces for the Grid, Proceedings of the IEEE, 93(3), pp 534-550, (2005).
Grid Application Toolkit (GAT)
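The adaptor idea can be illustrated with a small self-contained C++ sketch (this is not the GAT or SAGA API; the class and method names below are invented for illustration): the application programs against one abstract "move a file" call, and whichever adaptor can handle the request at runtime does the work, so the same application runs on an isolated laptop or on a Grid.

    // Illustrative sketch of the adaptor pattern behind GAT-style APIs
    // (names are hypothetical; the real GAT/SAGA interfaces differ).
    #include <cstdio>
    #include <memory>
    #include <string>
    #include <vector>

    // Abstract capability the application programs against.
    class FileAdaptor {
    public:
        virtual ~FileAdaptor() = default;
        virtual bool can_handle(const std::string& url) const = 0;
        virtual void copy(const std::string& src, const std::string& dst) = 0;
    };

    // "Default adaptor": local copy, always available on an isolated machine.
    class LocalCopyAdaptor : public FileAdaptor {
    public:
        bool can_handle(const std::string& url) const override {
            return url.rfind("file://", 0) == 0;
        }
        void copy(const std::string& src, const std::string& dst) override {
            std::printf("local cp %s -> %s\n", src.c_str(), dst.c_str());
        }
    };

    // Stand-in for a Grid adaptor (a GridFTP-backed adaptor would slot in here).
    class FakeGridFtpAdaptor : public FileAdaptor {
    public:
        bool can_handle(const std::string& url) const override {
            return url.rfind("gsiftp://", 0) == 0;
        }
        void copy(const std::string& src, const std::string& dst) override {
            std::printf("gridftp transfer %s -> %s\n", src.c_str(), dst.c_str());
        }
    };

    // The "engine": the application calls move_file(); adaptors are tried in turn.
    class FileEngine {
        std::vector<std::unique_ptr<FileAdaptor>> adaptors_;
    public:
        void register_adaptor(std::unique_ptr<FileAdaptor> a) { adaptors_.push_back(std::move(a)); }
        void move_file(const std::string& src, const std::string& dst) {
            for (auto& a : adaptors_)
                if (a->can_handle(src)) { a->copy(src, dst); return; }
            std::printf("no adaptor available for %s\n", src.c_str());
        }
    };

    int main() {
        FileEngine engine;
        engine.register_adaptor(std::make_unique<LocalCopyAdaptor>());
        engine.register_adaptor(std::make_unique<FakeGridFtpAdaptor>());
        engine.move_file("file:///tmp/in.dat", "file:///tmp/out.dat");
        engine.move_file("gsiftp://remote.host/data.h5", "file:///tmp/data.h5");
    }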
41
• Question: Why are there so few grid applications used routinely?
• Answer:
  – Lack of a simple, stable and uniform high-level programming interface that integrates the most common grid programming abstractions
  – Need to hide underlying complexities, heterogeneities, and changes from applications
Programming Grid Applications
42
GAT API Subsystems
43
GAT++ = SAGA
• GAT led to Open Grid Forum Research and Working Groups to develop a "Simple API for Grid Applications" (growing community support)
  – Focus on scientific and engineering applications
  – Focus on simplicity
• Similar to GAT
  – Better thought out API (based on use cases)
  – Asynchronous calls, bulk operations, QoS
  – More functional areas (e.g. streaming)
• C++ implementation at LSU
H. Kaiser, A. Merzky, S. Hirmer and GDA, The SAGA C++ Reference Implementation, Proceedings of the Workshop on Library-Centric Software Design LCSD'06.
A. Hutanu, S. Hirmer, GDA, A. Merzky, Analysis of Remote Execution Models for Grid Middleware, Proceedings of 4th International Workshop on Middleware for Grid Computing, MPC 2006.
S. Hirmer, H. Kaiser, A. Merzky, A. Hutanu, GDA, Generic Support for Bulk Operations in Grid Applications, Proceedings of 4th International Workshop on Middleware for Grid Computing, MPC 2006.
44
GridSphere
• GridSphere portal framework, developed in GridLab at AEI
  – Motivation: intuitive web-based interface for numerical relativity (Grid, Viz, shared data, …)
  – History: NR Workbench (NCSA 1992), ASC Portal (2001), GridSphere (Sept 2003)
• Generic JSR 168 compliant portlet container
• Architecture for "pluggable" web applications
• Core portlets, Grid Portlets, Cactus Portlets, D-Grid, LSU projects
• Over 250 subscribed to user mail list, around 500 unique web visits/month
C. Zhang, C. Dekate, G. Allen, I. Kelley and J. MacLaren, An Application Portal for Collaborative Coastal Modeling, Concurrency Computat.: Pract. Exper., 18, Pages 1-11, (2006)
45
Portal architecture (diagram): GridSphere Portal, "The Grid", SMS Server, Mail Server, IM Server, Replica Catalog, Metadata Catalog; notification and information; user details, notification prefs and simulation information.
46
Remote Interactive Viz & Steering
• HTTP
• Streaming HDF5, auto-downsample
• Any viz client: LCA Vision, OpenDX
• Changing steerable parameters: parameters, physics, algorithms, performance
TiKSL & GriKSL DFN grants at AEI, now EnLIGHTened and other
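A minimal self-contained sketch of the steering idea (a running simulation polls for updated parameter values between evolution steps); this only illustrates the concept, not the Cactus HTTP/steering thorns, and the file-based mechanism and parameter names here are invented for the example.

    // Illustrative parameter-steering loop (hypothetical mechanism: a small text
    // file "steer.txt" containing "name value" lines is polled each iteration).
    #include <cstdio>
    #include <fstream>
    #include <map>
    #include <string>

    std::map<std::string, double> read_steerable(const std::string& path) {
        std::map<std::string, double> params;
        std::ifstream in(path);
        std::string name; double value;
        while (in >> name >> value) params[name] = value;   // missing file -> empty map
        return params;
    }

    int main() {
        double courant = 0.5;          // a steerable parameter with a default value
        for (int it = 0; it < 100; ++it) {
            // Check for steering requests between evolution steps.
            auto steer = read_steerable("steer.txt");
            if (steer.count("courant")) courant = steer["courant"];
            // ... one evolution step using the (possibly updated) parameter ...
            if (it % 25 == 0) std::printf("it=%d courant=%g\n", it, courant);
        }
    }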
47
iGrid 2005: Collaborative Viz
• Distributed & collaborative visualization using optical networks
• High-definition videoconference connecting multiple sites
• Central visualization server
• Interaction from all participating sites
• Data on remote machines
A. Hutanu, GDA, S. D. Beck, P. Holub, H. Kaiser, A. Kulshrestha, M. Liska, J. MacLaren, L. Matyska, R. Paruchuri, S. Prohaska, E. Seidel, B. Ullmer, S. Venkataraman, Distributed and collaborative visualization of large data sets using high-speed networks, Future Generation Computer Systems, 22, Issue 8, p 1004-1010, 2006
48
Cactus Portal
Based on GridSphere JSR 168 compliant portal framework
Portlets for parameter file preparing, comparison, managing
Simulation staging, managing, monitoring
Link to data archives, viz etc.
R. Bondarescu, GDA, G. Daues, I. Kelley, M. Russell, E. Seidel, J. Shalf and M. Tobias, The Astrophysics Simulation Collaboratory Portal: a Framework for Effective Distributed Research, Future Generation Computer Systems, 21(2), (2005).
49
C. Zhang, C. Dekate, GDA, I. Kelley and J. MacLaren, An Application Portal for Collaborative Coastal Modeling, Concurrency Computat.: Pract. Exper., 18, Pages 1-11, (2006) [Best paper GCE05]
Coastal Modeling (SCOOP)
50
C. Zhang, P. Chakraborty, J. Lewis, X. Xu, D. Huang, Z. Lei, GDA, X. Li, C. D. White, A Grid Portal for Reservoir Uncertainty Analysis, Submitted to GCE06.
Reservoir Studies (UCoMS)
51
• Adaptive, intelligent simulation codes able to adapt to environment
• Simulation data stored across geographically distributed spaces
  – Organization, access, mining issues
  – Analysis of federated data sets by virtual organizations
• Data analysis of LIGO, GEO, LISA signals
  – Interacting with simulation data
  – Managing parameter space/signal analysis
• Domain specific information and knowledge based services:
  – Gravitational physics description language
    • Schema for describing, searching, encoding simulation results
    • Automated logging of simulations: reproducibility
  – Notification and data sharing services to enable collaboration
  – Relativity services
    • Remote servers running e.g. waveform extraction, horizon finding etc.
    • Connection to publications and information
    • Automated analysis
XiRel: Next Generation Infrastructure for Numerical Relativity
52
Coastal & Environmental Modeling
53
Louisiana Coastal Area
• Rich dynamic environment for modeling: coupled models, multi-scale, realtime data (sensors, satellites)
  – Models, Data, Grids for …
  – Hurricane forecasts
  – Emergency preparedness
  – Wetland reconstruction
  – Ecological studies and fish populations
  – Oil spill behaviour
  – Levee design
  – Rescue
  – Hypoxia "Dead Zone"
  – Algae blooms
  – 24/7/365 shipping forecasts
GDA and E. Seidel, Application Frameworks for High Performance and Grid Computing, Scientific Computing, 2006 [Cover Story]
54
Hurricane Forecasting
• Starting 5 days from expected landfall, hurricane advisory from NHC provides best guess of track and intensity, along with other tracks
  – RSS & Email for advisory, ftp site for tracks
• Surge/wave models prepared (spin-up initial conditions with hindcasts or use existing data from regular forecasts)
  – Data location, transport, translation, maybe model deployment
• Locate wind forcing either from gridded products (MM5, GFDL, …) or calculate "analytic" wind fields from tracks
  – Data location
• Run ensemble of models (a sketch of this step follows the list)
  – Data transport, translation
  – Resource brokering, application deployment
• Archive model results
  – Data archive, transport
• Calculate products, e.g. MOM, MOF, MEOW and compare with observations
• Disseminate results
  – OpenGIS, visualization, notification
• Core scenario for $12M NSF EPSCOR
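The "run ensemble of models" step above can be sketched as a simple orchestration loop; everything below (the track names, the run_surge_model stub, the product calculation) is hypothetical, standing in for the real SCOOP data and model services.

    // Hypothetical sketch of the ensemble step: one surge-model run per candidate
    // storm track, results collected and summarized for dissemination.
    #include <cstdio>
    #include <string>
    #include <vector>

    struct TrackForecast { std::string name; double landfall_lon, landfall_lat; };

    // Stand-in for staging data, deploying and running one surge model instance
    // (in practice this would be brokered onto Grid resources).
    double run_surge_model(const TrackForecast& track) {
        std::printf("running surge model for track %s\n", track.name.c_str());
        return 3.0 + 0.1 * track.landfall_lat;   // fake peak surge in meters
    }

    int main() {
        // Ensemble members: the official track plus perturbed alternatives.
        std::vector<TrackForecast> ensemble = {
            {"official", -89.6, 29.5}, {"east-shift", -89.1, 29.6}, {"west-shift", -90.2, 29.4}};

        std::vector<double> peak_surge;
        for (const auto& track : ensemble)
            peak_surge.push_back(run_surge_model(track));

        // Summarize across members (e.g. a maximum-of-maximums style product).
        double worst = 0.0;
        for (double s : peak_surge) if (s > worst) worst = s;
        std::printf("ensemble worst-case peak surge: %.2f m\n", worst);
    }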
55
SURA Coastal Ocean Observing Program (SCOOP)
• Integrating data from regional observing systems for realtime coastal forecasts in SE
• Coastal modelers working with computer scientists to couple models, provide data solutions, deploy ensembles of models on the Grid, assemble realtime results with GIS technologies
J. MacLaren, GDA, C. Dekate, D. Huang, A. Hutanu and C. Zhang, Shelter from the Storm: Building a Safe Archive in a Hostile World, Lecture Notes in Computer Science Volume 3752, On the Move to Meaningful Internet Systems 2005, Ed: R. Meersman, Z. Tari, P. Herrero, p. 294, 2005.
D. Huang, GDA, C. Dekate, H. Kaiser, Z. Lei and J. MacLaren, getdata: A Grid Enabled Data Client for Coastal Modeling, in the proceedings of the High Performance Computing Symposium (HPC 2006), April 3-6, 2006, Huntsville, AL, 2006.
56
Multiscale Models
P. Bogden, GDA, G. Stone, J. MacLaren, G. Creager, L. Flournoy, W. Zhao, H. Graber, S. Graves, H. Conover, R. Luettich, W. Perrie, L. Ramakrishnan, D. Reed, P. Sheng, H. Wang, The SURA Coastal Ocean Observing & Prediction Program (SCOOP) Service-Oriented Architecture, Proceedings of IEEE/MTS Oceans 2006, Boston, MA, September, 2006.
57
Requirements
• Data translation services
• Metadata services for models and data
• Provenance information
  – Model versions, configurations
  – Data history
  – Hindcasts, nowcasts, forecasts
• Visualization & Notification
  – For different purposes: PR (e.g. evacuation), scientific insight, levee design
  – Realtime
• Validation and verification
• Scheduling
  – Dynamic prioritization based on scenario and resources
  – On-demand, co-scheduling, advanced reservation
• Workflow & model-model coupling
  – Legacy codes vs. framework
• Monitoring of applications
• Reliability
• Data archive
• Clients usable by science community
58
DDDAS New Capabilities
– Dynamically invoke more accurate models and algorithms as hurricane approaches coast
– Choose appropriate computing resources for needed confidence levels
– Compare model results with observations to feed back into running simulations
– Realtime data assimilation
– Adaptive multi-scale simulations
– Dynamic component recomposition
– Simulation needs steer sensors and data input
Requirements for numerical libraries, scheduling, policies etc. (a toy feedback-loop sketch follows the references below)
www.dddas.org
C.Douglas, GDA, Y.Efendiev and G.Qin, High PerformanceComputing Issues for Grid Based Dynamic Data-Driven Applications,
Proceedings of DCABES 2006, Hangzhou, P.R. China, 2006
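A toy illustration of the DDDAS feedback loop described above (compare a running model against incoming observations and recompose the simulation with a more accurate, more expensive component when the mismatch grows); the models, threshold and observation source below are all invented for the example.

    // Toy DDDAS-style feedback loop: observations steer which model fidelity runs.
    // Everything here (models, observation stream, threshold) is hypothetical.
    #include <cmath>
    #include <cstdio>

    // Two model fidelities: a cheap coarse model and an expensive refined one.
    double coarse_model(double t)  { return 1.0 * t; }                  // linear growth only
    double refined_model(double t) { return 1.0 * t + 0.05 * t * t; }   // extra physics

    // Stand-in for realtime sensor data arriving while the forecast runs.
    double observation(double t)   { return 1.0 * t + 0.04 * t * t; }

    int main() {
        bool use_refined = false;
        const double mismatch_threshold = 0.5;   // switch criterion (invented)

        for (double t = 0.0; t <= 10.0; t += 1.0) {
            double predicted = use_refined ? refined_model(t) : coarse_model(t);
            double observed  = observation(t);

            // Feedback: if the cheap model drifts from the data, recompose the
            // simulation with the more accurate component for subsequent steps.
            if (!use_refined && std::fabs(predicted - observed) > mismatch_threshold) {
                use_refined = true;
                std::printf("t=%.0f: mismatch %.2f, switching to refined model\n",
                            t, std::fabs(predicted - observed));
            }
            std::printf("t=%.0f predicted=%.2f observed=%.2f\n", t, predicted, observed);
        }
    }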
59
DynaCode
• Focus on scenarios:
  – Hurricane ensemble modeling
    • Coupling ocean circulation, storm surge, wave generation models for the Gulf
    • Notifications from NHC trigger customized ensemble hurricane models (surge/wind/wave), sensors verify, guide dynamic ensembles
    • Event driven, dynamic component framework with algorithm selection, optimization tools, workflow, data assimilation, result validation with sensor/satellite
  – Ecological restoration and control
    • Breton Sound diversion, control structure to allow Mississippi to flow into wetlands
    • Coupled models (hydrodynamic, salinity, geomorphic, sediment) control diversion, sensors/wind fields inject real time data
60
Viz
Submitting today: S. Venkataraman, W. Benger, A. Long, C. Dekate, GDA and S. D. Beck, Visualizing Katrina - Merging Computer Simulations with Observations, to Proceedings of PARA'06: Workshop on State-of-the-art in Scientific and Parallel Computing, Umea, Sweden, June 18-21, 2006.
61
Teaching
62
• Setting up research programs
• Competitive funding awards
• Research Students
  – 46 Grad Students, 25 Undergrads from ~13 LSU departments
• Internships 2006
  – ANL (Dylan Stark, Alex Nagelberg)
  – LBL (Elena Caraba)
• Awards & Scholarships
  – LA-STEM, Huel D. Perkins Doctoral Fellowship, Tiger Scholar Award, 5 Chancellor's List, 3 Dean's List, …
• Conference Scholarships
  – 4 to GGF (Boston, Seoul, Tokyo, Washington), many to Grid School (Brownsville), 2 Supercomputing (Seattle, Pittsburgh), 2 PASI (Argentina), etc.
CCT Students
63
CFD IGERT
• Integrative Graduate Education and Research Traineeship
• New interdisciplinary graduate program around CFD
• Around 30 faculty and 10 departments
• Preparing students for large scale computational research in CFD areas (new courses XD-1 and XD-2)
• Cactus CFD Toolkit
Co-PI NSF IGERT: IGERT on Multi-scale Computations of Fluid Dynamics ($3.2M, LSU)
64
Current Students
• Chirag Dekate (DDDAS, DynaCode)
• Dayong Huang (Data archives, SCOOP)
• Archit Kulshrestha (SCOOP)
• Santiago Pena (UCOMS)
• Dylan Stark (XiRel)
• Theresa Xu (SCOOP)
• Rakesh Yadav (CCT GDP)
• Promita Chakraborty (UCOMS)
• Andrei Hutanu (EnLIGHTened)
• Graduate Committee for 3 IGERT students (Coastal, Engineering Science, Physics)
• Undergraduate
  – Ana Beleu (ECE)
  – Elena Caraba (Math)
  – Razvan Carbanesc (CS)
  – Irina Craciun (Math)
  – Andrew Davidson (ECE)
  – John Lewis (CS)
  – Alex Nagelberg (CS)
Horst Beyer and Irina Craciun, On a new symmetry of the solutions of the wave equation in the background of a Kerr black hole, submitted to Communications in Mathematical Physics.
C. Zhang, P. Chakraborty, John Lewis, X. Xu, D. Huang, Z. Lei, GDA, X. Li, C. D. White, A Grid Portal for Reservoir Uncertainty Analysis, Submitted to GCE06.
65
• Undergraduate research conference in North Carolina
• Paper Presentations
  – "Implementation of Binary Tree driver in Cactus" - Jeff DeReus
  – "On a problem in the stability discussion of rotating black holes" - Irina Craciun
  – "Using a GPU for Feature Extraction from Turbulent Flow Datasets" - Andrew Davidson
  – "A Parallel Artificial Neural Network Implementation" - Ian Wesley Smith
  – "Integration of Trilinos into the Cactus Code Framework" - Josh Abadie
  – "Implementation of Level Set Methods in Cactus Framework" - Carbunescu Razvan Corneliu
• Poster Presentation
  – Visualization of Multi Patch Data from Astrophysics - Elena Caraba
NCUR 2006
66
The End