Impact of TeraGrid on Science and Engineering Research
Ralph Roskies,
Scientific Director
Pittsburgh Supercomputing Center
April 6, 2009
Impacts of TeraGrid on Scientific Fields
• HPC makes some fields possible as we know them- e.g. cosmology, QCD
• HPC adds essential realism to fields like biology, fluid dynamics, materials
science, earthquake and atmospheric science
• HPC is beginning to impact fields like social science and machine learning
• Beyond powerful and diverse hardware
–TeraGrid support enables users to use the hardware effectively
–Development of new algorithms also fuels the progress
Only a few examples are selected here, primarily from the annual report, to illustrate TeraGrid’s impact. There are many more fields, and many more examples.
Cosmology and Astrophysics
• Three-significant-figure predictions of the age of the universe, the fraction of dark matter, etc.
• Small (1 part in 10^5) spatial inhomogeneities 380,000 years after the Big Bang, as revealed by COBE and later WMAP satellite data, are transformed by gravitation into the pattern of severe inhomogeneities (galaxies, stars, voids, etc.) that we see today.
• Must use HPC to evolve the universe from that starting point to today, to compare with experiment.
• Is the distribution of galaxies and voids appropriate?
• Does lensing agree with observations?
Kritsuk et al. (UCSD): Turbulence in Molecular Clouds
• Reported last year on Mike Norman's work, which requires adaptive mesh refinement (AMR) to zoom in on dense regions and capture the key physical processes: gravitation, shock heating and radiative cooling of gas. Needs large shared-memory capabilities for generating initial conditions (AMR is very hard to load-balance on distributed-memory machines), then the largest distributed-memory machines for the simulation and visualization.
• Kritsuk et al. developed a new algorithm (PPML) for the MHD aspects and compared it to their older code, ZEUS, as well as to FLASH (Chicago) and RAMSES (Saclay).
• Found that turbulence obeys Kolmogorov scaling even at Mach 6.
• Long-term archival storage for configurations: the biggest run (2048^3) produced 35 TB of data (at PSC). Much data movement between sites (17 TB to SDSC).
TeraGrid helped make major improvements in the scaling and efficiency of the code (ENZO), and in the visualization tools which are being stressed at these volumes.
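To give a feel for the data volumes quoted on this slide, here is a rough back-of-the-envelope sketch in Python. The grid size and the 35 TB total come from the slide. The 4-byte single-precision value per cell and the decimal definition of a terabyte are assumptions made for illustration, not details of the actual runs.

```python
# Back-of-the-envelope storage estimate for a 2048^3 grid.
# Assumptions (not from the slides): one 4-byte float per cell per field,
# and 1 TB = 10**12 bytes.

GRID_POINTS_PER_SIDE = 2048      # from the slide
TOTAL_OUTPUT_BYTES = 35e12       # 35 TB, from the slide
BYTES_PER_VALUE = 4              # assumed single precision


def cells_in_cube(n: int) -> int:
    """Number of cells in an n x n x n uniform grid."""
    return n ** 3


def bytes_per_field_snapshot(n: int, bytes_per_value: int = BYTES_PER_VALUE) -> int:
    """Bytes needed to store one scalar field on the full grid once."""
    return cells_in_cube(n) * bytes_per_value


cells = cells_in_cube(GRID_POINTS_PER_SIDE)
snapshot_bytes = bytes_per_field_snapshot(GRID_POINTS_PER_SIDE)
field_snapshots = TOTAL_OUTPUT_BYTES / snapshot_bytes

print(f"cells: {cells:,}")
print(f"one field, one snapshot: {snapshot_bytes / 1e9:.1f} GB")
print(f"35 TB holds roughly {field_snapshots:.0f} such field snapshots")
```

Under these assumptions a single scalar field on the full grid already takes tens of gigabytes, so a run that stores several fields at many time steps plausibly reaches tens of terabytes.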
Further astrophysics insights
• FLASH group (Lamb, Chicago): Used ANL visualization to understand how turbulence wrinkles a combustive flame front (important for understanding the explosion of Type Ia supernovae). Found that turbulence behind the flame front is inhomogeneous and non-steady, in contrast to the assumptions made by many theoretical models of turbulent burning.
• Erik Schnetter (LSU)
– Black hole mergers lead to potential gravitational wave signals for LIGO (NSF’s largest single enterprise)
– Enabled by recent algorithmic advances
• Mark Krumholz (UCSC) et al.: Appear to have solved a long-standing puzzle about the formation of massive stars. Stars form as mass accretes from infalling gas. With spherical geometry, for stars >20 solar masses, outward pressure from photons should halt this infall. Including rotation in 2D raises the limit to about 40 Msun. But stars with masses as high as 120 Msun have been observed. Krumholz shows that 3D models allow instabilities, which in turn allow more massive stars to form.
3D simulations are much more demanding than 2D (compare 1000^3 to 1000^2). Only feasible with major compute resources such as DataStar at SDSC.
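The comparison of 1000^3 to 1000^2 can be made concrete with a short sketch. It only counts cells on a uniform grid with 1000 points per side and ignores everything else that affects cost, so it is an illustration rather than a cost model for the simulations described here.

```python
# Cell counts for uniform grids with the same resolution per side.
# Illustrative only: this is not a model of any specific simulation code.

def cell_count(points_per_side: int, dimensions: int) -> int:
    """Cells in a uniform grid with the given number of dimensions."""
    return points_per_side ** dimensions


n = 1000  # points per side, as in the slide's comparison
cells_2d = cell_count(n, 2)
cells_3d = cell_count(n, 3)
ratio = cells_3d // cells_2d

print(f"2D: {cells_2d:,} cells")
print(f"3D: {cells_3d:,} cells")
print(f"3D needs {ratio}x as many cells as 2D")
```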
Wide outreach
• Benjamin Brown (U. Colorado & JILA) is using a visualization tool (VAPOR), developed at NCAR in collaboration with UC Davis and Ohio State, together with TACC’s Ranger to help the Hayden Planetarium produce a movie about stars.
• The movie, which will reach an estimated one
million people each year, is slated to be released
in 2009.
• The sequences will include simulated “flybys”
through the interior of the Sun, revealing the
dynamos and convection that churn below the
surface.
Lattice QCD- MILC collaboration
• Improved precision on “standard model”,
required to uncover new physics.
• Need larger lattices, lighter quarks
• Large allocations
• Frequent algorithmic improvements
• Use TeraGrid resources at NICS, PSC, NCSA, TACC; DOE resources at Argonne, NERSC; specialized QCD machine at Brookhaven; cluster at Fermilab
Will soon store results with the International Lattice Data Grid (ILDG), an international organization that provides standards, services, methods and tools that facilitate the sharing and interchange of lattice QCD gauge configurations among scientific collaborations (US, UK, Japan, Germany, Italy, France, and Australia). http://www.usqcd.org/ildg/
Gateways- Nanoscale Electronic Structure (nanoHUB, Klimeck, Purdue)
• Challenge of designing microprocessors and other devices with nanoscale components.
• Group is creating new content for simulation tools, tutorials, and additional educational material. Gateway enables on-line simulation through a web browser without the installation of any software.
• Nanowire tool allows exploration of nanowires in circuits e.g. impact of fluctuations on robustness of circuit.
• In 2008, nanoHUB.org hosted more than 90 tools, had >6200 users, ran >300,000 simulations, and supported 44 classes. (TG: parallel jobs, 83 users, 1100 jobs; shows how TG complements private resources depending on user needs.)
• Largest codes operate at the petascale (NEMO-3D, OMEN), using 32,768 cores of Ranger, 65,536 cores of Kraken with excellent scaling.
• Communities develop the Gateways; TG helps interface them to TG resources.
TG contributions
• Improved security for Gateways
• Helped improve reliability of grid-based job submission
• Will benefit from improved metascheduling capabilities
• Uses resources at NCSA, PSC, IU, ORNL and Purdue
nanowire tool
Geosciences (SCEC)
• Goal is to understand earthquakes and to mitigate the risks of loss of life and property damage.
• Spans the gamut from the largest simulations to midsize jobs to a huge number of small jobs. (Last year's talk covered the largest simulations.)
• The largest runs (CyberShake) examine high-frequency modes (short wavelength, so higher resolution) of particular interest to civil engineers. They need large distributed-memory runs on the Track2 machines at TACC and NICS: 2,000 to 64,000 cores of Ranger and Kraken.
• Improving the velocity model that goes into the large simulations requires jobs with mid-range core counts doing full 3-D tomography (Tera3D) on DTF and other clusters (e.g. Abe), with large data available on disk (100 TB).
Output is large data sets stored at NCSA, or SDSC’s GPFS, IRODS.
Moving to DOE machine at Argonne. TG provided help with
essential data transfer.
Excellent example of coordinated ASTA support: Cui (SDSC) and Urbanic (PSC) interface with consultants at NICS, TACC, and NCSA to smooth migration of code. Improved performance 4x.
SCEC-PSHA
• Using the large-scale simulation data, estimate probabilistic seismic hazard (PSHA) curves for sites in southern California (probability that ground motion will exceed some threshold over a given time period).
• Used by hospitals, power plants etc. as part of
their risk assessment.
• Plan to replace existing phenomenological curves
with more accurate results using new
CyberShake code. (better directivity, basin
amplification)
• For each location, need a CyberShake run followed by roughly 840,000 parallel short jobs (420,000 rupture forecasts, 420,000 extractions of peak ground motion).
• Completed 40 locations to date, targeting 200 in
2009, and 2000 in 2010.
Managing these requires effective grid workflow tools for job submission, data management and error recovery, using Pegasus (ISI) and DAGMan (U of Wisconsin Condor group).
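The job counts on this slide add up quickly, which is why workflow tooling matters. The sketch below simply multiplies the figures quoted above (about 840,000 short jobs per site, with 40 sites completed and 200 and 2000 targeted). The helper name is made up for illustration and does not correspond to anything in Pegasus or the Condor tools.

```python
# Short-job totals implied by the per-site figures on the slide.
# short_jobs_for_sites is a hypothetical helper, not part of any workflow tool.

RUPTURE_FORECAST_JOBS = 420_000    # per site, from the slide
PEAK_MOTION_JOBS = 420_000         # per site, from the slide
JOBS_PER_SITE = RUPTURE_FORECAST_JOBS + PEAK_MOTION_JOBS


def short_jobs_for_sites(sites: int) -> int:
    """Total short jobs needed to process the given number of sites."""
    return sites * JOBS_PER_SITE


milestones = {"completed to date": 40, "2009 target": 200, "2010 target": 2000}
for label, sites in milestones.items():
    print(f"{label}: {sites} sites -> {short_jobs_for_sites(sites):,} short jobs")
```

At these volumes, submitting and tracking jobs by hand is not practical, and even a small failure rate produces a large number of jobs that must be retried automatically.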
CFD and Medicine (arterial flow)
George Karniadakis, Brown
• Strong relationship between blood flow
pattern and formation of arterial disease
such as atherosclerotic plaques
• Disease develops preferentially in
separated and re-circulating flow
regions such as vessel bifurcations
• 1D results feed 3D simulations,
providing flow rate and pressure for
boundary conditions
• Very clever multiscale approach
• Couples resources weakly in real time,
but requires co-scheduling
• MPIg, partly supported by TeraGrid,
used for intra-site and inter-site
communications.
[Figure: a 1D model exchanging 1D data with several 3D simulations]
Medical Impact
• Today, being used for validation
and quantification of some of the
pathologies
• With realistic geometries, part of
promise of patient specific
treatment
Biological Science
• Huge impact of TeraGrid
• Primarily large-scale molecular dynamics (MD) simulations (classical Newtonian mechanics) that elucidate how structure leads to function.
• Major effort in scaling codes (e.g. AMBER, CHARMM, NAMD) to large distributed-memory computers; very fruitful interaction between application scientists and computer scientists (e.g. Schulten and Kale)
• When breaking chemical bonds, need quantum mechanical methods (QM/MM), often best done on large shared-memory systems
• Generate very large datasets, so data analysis is now becoming a serious concern. Considerable discussion of developing a repository of MD biological simulations, but no agreement yet on formats.
Aquaporins - Schulten group, UIUC
• Aquaporins are proteins that conduct large volumes of water through cell membranes while filtering out charged particles like hydrogen ions (protons).
• Start with known crystal structure, simulate over
100,000 atoms, using NAMD
• Water moves through aquaporin channels in
single file. Oxygen leads the way in. At the most
constricted point of channel, water molecule flips.
Protons can’t do this.
Aquaporin Mechanism
Animation pointed to by 2003 Nobel chemistry prize announcement for structure of aquaporins (Peter Agre)
The simulation helped explain how the structure led to the function
Actin-Arp2/3 branch junction
Greg Voth, Utah
Actin-Arp2/3 branch junction
– key piece of the cellular cytoskeleton, helping to
confer shape and structure to most types of cells.
– cannot be crystallized to obtain high-resolution
structures.
– working with leading experimental groups, MD
simulations are helping to refine the structure of
the branch junction.
– 3M atoms, linear scaling to 4000 processors on Kraken.
The all-atom molecular dynamics simulations form the basis for developing new coarse-grained models of the branch junction to model larger scale systems.
HIV-1 Protease Inhibition:
• HIV-1 protease
– Essential enzyme in life cycle of HIV-1
– Popular drug target for antiviral therapy
– It has been hypothesized that mutations outside the
active site affect the mobility of 2 gate-keeper
molecular flaps near the active site, and this affects
inhibitor binding
Tagged two sites on the flaps and used electron paramagnetic resonance measurements to measure the distance between them; excellent agreement with MD simulations.
Provides a molecular view of how mutations affect the conformation.
Wild type (black) and 2 mutants
Gail Fanucci (U. Florida) & Carlos Simmerling (Stony Brook)
Similar slides for
• Schulten, UIUC: The Molecular Basis of Clotting
• McCammon, UCSD: Virtual Screening Led to Real Progress for African Sleeping Sickness
• Baik, Indiana U.: Investigating Alzheimer’s (copper binding behavior)
Mechanical Engineering
Numerical Studies of Primary Breakup of Liquid Jets in Stationary and Moving Gases
Madhusudan Pai & Heinz Pitsch, Stanford; Olivier Desjardins, Colorado
• Liquid jet breakup in automobile internal combustion engines and aircraft gas turbine combustors controls fuel consumption and formation of engine pollutants.
• Immense economic and environmental significance
• Predicting the drop size distribution that results from liquid jet breakup is an important unsolved problem
• Current detailed simulations (liquid Weber number ~3000 and Reynolds number ~5000) require upwards of 260M cells and typically about 2048 processors.
• Physically more realistic simulations will require liquid Weber and Reynolds numbers 10x higher (1000x in computational complexity).
• Used Queen Bee (LONI) for code development and scaling, Ranger for production. Highly accurate direct numerical simulation (DNS) to develop parameters that will be used in larger scale studies (LES) of engines.
DNS of a diesel jet (left) and a liquid jet in crossflow (right)
Materials Science
• How can spider silk be as strong as steel if the “glue” of hydrogen bonds holding it together is 1000x weaker than steel’s metallic bonds?
• Depends on the specific configuration of structural proteins as well as the hydrogen bonds.
• Used SDSC’s IBM Blue Gene (6144 processors) to simulate how spider silk compounds react at the atomic level to structural stresses.
• Discovered what governs the rupture strength of H-bond assemblies, confirmed by direct large-scale full-atomistic MD simulation studies of beta-sheet structures in explicit solvent
• This could help engineers create new materials that mimic spider silk’s lightweight robustness. Could also impact research on muscle tissue and on amyloid fibers found in the brain, which have similar beta-sheet structures composed of hierarchical assemblies of H-bonds
Spider Silk’s Strength (Markus Buehler, MIT)
Ross Walker (SDSC)
• Implemented a needed parallel version of some serial restraint codes in NAMD and LAMMPS, for efficient implementation on BG.
• Advised on how to distribute calculations across the BG
• Helped with visualization.
SIDGrid- Social Science Gateway
• SIDGrid provides access to “multimodal” data: streaming data that change over time. For example, as a human subject views a video, heart rate, eye movement, and a video of the subject’s facial expressions are captured. Data are collected many times per second, sometimes at different timescales, and synchronized for analysis, resulting in large datasets.
• The Gateway provides sophisticated analysis tools to study these datasets
• SIDGrid uses TeraGrid resources for computationally intensive tasks including media transcoding (decoding and encoding between compression formats), pitch analysis of audio tracks, and functional Magnetic Resonance Imaging (fMRI) image analysis.
• A new application framework has been developed to enable users to easily deploy new social science applications in the SIDGrid portal. SIDGrid launches thousands of jobs in a week.
• Gateway cited in publications in analysis of neuroimaging data, and in computational linguistics.
• Opens up possibilities for the wider community, which can learn by watching the pioneers.
TeraGrid staff will incorporate metascheduling capabilities, improve security models for community accounts, incorporate data-sharing capabilities, and upgrade workflow tools.
Rick Stevens et al, Argonne
New communities- Machine Learning
Tuomas Sandholm, CMU, Poker
– Poker is a game with imperfect knowledge
– developing what appears to be the best
computer poker capability
– needs large shared memory
Rob Farber & Harold Trease, PNNL Facial Recognition
• import, interpret, database millions of images per second
• Essentially realtime facial recognition
• near-linear scaling across 60,000 cores (Ranger)
New communities- Virtual Pharmacy Clean Room Environment
Steve Abel, Steve Dunlop, Purdue University
• Created a realistic, immersive 3-D virtual pharmacy clean room for training pharmacy students, pharmacists and pharmacy technicians
• Enables evaluation of clean room design and work flow by engineering researchers.
• The 3-D model can be employed in multi-walled virtual environments. Eventual incorporation of force-feedback and haptic (touch and feel) technologies
• 160 students used the room in 2008; almost unanimously, they report that the experience has given them a better understanding of, and made them more comfortable with, the clean room environment and procedures.
TG Purdue staff helped the team use TG’s distributed rendering service TeraDRE to render elements of the virtual clean room, including a fly-through movie in less than 48 hours (it would take five months on a single computer).
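The quoted turnaround implies a large speedup from distributed rendering. Here is a rough sketch of the arithmetic, assuming 30-day months of continuous rendering. The slide gives only "five months" and "less than 48 hours", so the result is approximate and, because 48 hours is an upper bound, a minimum.

```python
# Approximate speedup implied by "five months" vs "less than 48 hours".
# Assumption (not from the slide): a month is 30 days of continuous rendering.

HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30            # assumed

single_machine_hours = 5 * DAYS_PER_MONTH * HOURS_PER_DAY
distributed_hours = 48         # upper bound quoted on the slide

speedup = single_machine_hours / distributed_hours
print(f"single computer: ~{single_machine_hours} hours")
print(f"implied speedup: at least ~{speedup:.0f}x")
```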
Data-Driven Hurricane prediction
Fuqing Zhang, Penn State, with NOAA,
Texas A&M collaborators
• Tracked hurricanes Ike and Gustav in
real-time
• Used ensemble forecasting, and 40,000
cores of Ranger to update predictions.
• First time use of data streamed directly
from NOAA planes inside the storm.
Impacts of TeraGrid on Scientific Fields
• HPC makes some fields possible as we know them- e.g.
cosmology, QCD
• HPC adds essential realism to fields like biology, fluid dynamics,
materials science, earthquake and atmospheric science
• HPC is beginning to impact fields like social science and machine
learning
• TeraGrid allows increased modeling complexity (larger systems, more realism); projects increasingly use resources at multiple sites, using more of the repertoire of TeraGrid services
• Beyond powerful and diverse hardware, TeraGrid support enables
users to use the hardware effectively
Transforming How We Do Science
• TeraGrid coordination among sites, making the necessarily
heterogeneous resources into one system, leads to much higher
researcher productivity.
• Faster turnaround leads to greater researcher productivity and
changes the questions we ask in all disciplines.
• Visualization aids understanding
• Gateways open the field to many more researchers