Impact of TeraGrid on Science and Engineering Research. Ralph Roskies, Scientific Director, Pittsburgh Supercomputing Center, roskies@psc.edu. April 6, 2009


Impact of TeraGrid on Science and Engineering Research

Ralph Roskies, Scientific Director

Pittsburgh Supercomputing Center

roskies@psc.edu

April 6, 2009


Impacts of TeraGrid on Scientific Fields

• HPC makes some fields possible as we know them, e.g. cosmology, QCD

• HPC adds essential realism to fields like biology, fluid dynamics, materials science, earthquake and atmospheric science

• HPC is beginning to impact fields like social science and machine learning

• Beyond powerful and diverse hardware

– TeraGrid support enables users to use the hardware effectively

– Development of new algorithms also fuels the progress

These slides select only a few examples, primarily from the annual report, to illustrate TeraGrid’s impact. There are many more fields, and many more examples.


Cosmology and Astrophysics

• Three significant figure accuracy predictions of the age of the universe, fraction of dark matter etc.

• Small (1 part in 10^5) spatial inhomogeneities 380,000 years after the Big Bang, as revealed by COBE and later WMAP satellite data, are transformed by gravitation into the pattern of severe inhomogeneities (galaxies, stars, voids etc.) that we see today.

• Must use HPC to evolve the universe from that starting point to today, to compare with experiment.

• Is the distribution of galaxies and voids appropriate?

• Does lensing agree with observations?


Kritsuk et al., UCSD

Turbulence in Molecular Clouds

• Reported last year on Mike Norman's work, which requires adaptive mesh refinement (AMR) to zoom in on dense regions and capture the key physical processes: gravitation, shock heating and radiative cooling of gas. Needs large shared memory capabilities for generating initial conditions (AMR is very hard to load-balance on distributed memory machines), then the largest distributed memory machines for the simulation and visualization.

• Kritsuk et al. developed a new algorithm (PPML) for the MHD aspects and compared it to their older version, ZEUS, as well as to those of FLASH (Chicago) and RAMSES (Saclay).

• Found that turbulence obeys Kolmogorov scaling even at Mach 6.

• Long term archival storage for configurations: the biggest run (2048^3) produced 35 TB of data (at PSC). Much data movement between sites (17 TB to SDSC).

TeraGrid helped make major improvements in the scaling and efficiency of the code (ENZO), and in the visualization tools which are being stressed at these volumes.


Further astrophysics insights

• FLASH group (Lamb, Chicago)

– Used ANL visualization to understand how turbulence wrinkles a combustive flame front (important for understanding the explosion of Type Ia supernovae). Found that turbulence behind the flame front is inhomogeneous and non-steady, in contrast to the assumptions made by many theoretical models of turbulent burning.

• Erik Schnetter (LSU)

– Black hole mergers lead to potential gravitational wave signals for LIGO (NSF’s largest single enterprise)

– Enabled by recent algorithmic advances

• Mark Krumholz (UCSC) et al.

– Appear to have solved a long-standing puzzle about the formation of massive stars. Stars form as mass is accreted from infalling gas. With spherical geometry, for stars >20 solar masses, outward pressure from photons should halt this infall. Including rotation in 2D raises the limit to about 40 Msun. But stars with masses as high as 120 Msun have been observed. Krumholz shows that 3-D models allow instabilities, which then allow more massive stars to form.

3D simulations are much more demanding than 2D (compare 1000^3 to 1000^2). Only feasible with major compute resources such as Datastar at SDSC.
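The 3D-versus-2D comparison above can be made concrete with a few lines of arithmetic. This is only a sketch that assumes a uniform grid with 1000 points per side; it ignores adaptive refinement and any change in the number of time steps, and the function name `cell_count` is introduced here for illustration.

```python
def cell_count(points_per_side: int, dimensions: int) -> int:
    """Number of cells in a uniform grid with the given resolution per side."""
    return points_per_side ** dimensions


n = 1000  # resolution per side, as quoted on the slide

cells_2d = cell_count(n, 2)
cells_3d = cell_count(n, 3)
ratio = cells_3d // cells_2d  # how many times more cells the 3D grid holds

print(f"2D cells: {cells_2d:,}")
print(f"3D cells: {cells_3d:,}")
print(f"3D/2D ratio: {ratio:,}x")
```

Under these assumptions the 3D grid holds a factor of `n` more cells than the 2D grid, before any extra cost per cell is counted.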


Wide outreach

• Benjamin Brown (U. Colorado & JILA) is using a visualization tool (VAPOR), developed at NCAR in collaboration with UC Davis and Ohio State, and TACC’s Ranger to help the Hayden Planetarium produce a movie about stars.

• The movie, which will reach an estimated one million people each year, is slated to be released in 2009.

• The sequences will include simulated “flybys” through the interior of the Sun, revealing the dynamos and convection that churn below the surface.


Lattice QCD- MILC collaboration

• Improved precision on the “standard model”, required to uncover new physics.

• Need larger lattices, lighter quarks

• Large allocations

• Frequent algorithmic improvements

• Use TeraGrid resources at NICS, PSC, NCSA, TACC; DOE resources at Argonne, NERSC; specialized QCD machine at Brookhaven; cluster at Fermilab

Will soon store results with the International Lattice Data Grid (ILDG), an international organization which provides standards, services, methods and tools that facilitate the sharing and interchange of lattice QCD gauge configurations among scientific collaborations (US, UK, Japan, Germany, Italy, France, and Australia). http://www.usqcd.org/ildg/


Gateways - Nanoscale Electronic Structure (nanoHUB, Klimeck, Purdue)

• Challenge of designing microprocessors and other devices with nanoscale components.

• Group is creating new content for simulation tools, tutorials, and additional educational material. Gateway enables on-line simulation through a web browser without the installation of any software.

• Nanowire tool allows exploration of nanowires in circuits, e.g. impact of fluctuations on robustness of circuit.

• nanoHUB.org hosted more than 90 tools, had >6200 users, ran >300,000 simulations, and supported 44 classes in 2008. (TG: parallel jobs, 83 users, 1100 jobs; shows how TG complements private resources depending on user needs.)

• Largest codes operate at the petascale (NEMO-3D, OMEN), using 32,768 cores of Ranger and 65,536 cores of Kraken with excellent scaling.

• Communities develop the Gateways; TG helps interface them to TG resources.

TG contributions

• Improved security for Gateways

• Helped improve reliability of grid-based job submission

• Will benefit from improved metascheduling capabilities

• Uses resources at NCSA, PSC, IU, ORNL and Purdue

nanowire tool


Geosciences (SCEC)

• Goal is to understand earthquakes and to mitigate risks of loss of life and property damage.

• Spans the gamut from largest simulations to midsize jobs to huge numbers of small jobs. (Last year talked about largest simulations.)

• For largest runs (CyberShake), where they examine high frequency modes (short wavelength, so higher resolution) of particular interest to civil engineers, need large distributed memory runs using the Track2 machines at TACC, NICS: 2000-64,000 cores of Ranger, Kraken.

• To improve the velocity model that goes into the large simulations, need mid-range core-count jobs doing full 3-D tomography (Tera3D) on DTF and other clusters (e.g. Abe); need large data available on disk (100 TB).

Output is large data sets stored at NCSA, or SDSC’s GPFS, iRODS. Moving to DOE machine at Argonne. TG provided help with essential data transfer.

Excellent example of coordinated ASTA support: Cui (SDSC) and Urbanic (PSC) interface with consultants at NICS, TACC, & NCSA to smooth migration of code. Improved performance 4x.


SCEC-PSHA

• Using the large scale simulation data, estimate probabilistic seismic hazard (PSHA) curves for sites in southern California (probability that ground motion will exceed some threshold over a given time period).

• Used by hospitals, power plants etc. as part of their risk assessment.

• Plan to replace existing phenomenological curves with more accurate results using the new CyberShake code (better directivity, basin amplification).

• For each location, need a CyberShake run followed by roughly 840,000 parallel short jobs (420,000 rupture forecasts, 420,000 extractions of peak ground motion).

• Completed 40 locations to date, targeting 200 in 2009, and 2000 in 2010.

Managing these requires effective grid workflow tools for job submission, data management and error recovery, using Pegasus (ISI) and Dagman (U of Wisconsin Condor group).
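To give a feel for why workflow tooling matters here, the short-job counts implied by the figures above can be tallied. This is a rough sketch: the per-location count and the location targets come from the slide, while the dictionary labels and the helper `short_jobs` are illustrative names, and each target is treated as a standalone number of locations.

```python
# Short jobs per location: rupture forecasts plus peak-ground-motion extractions.
JOBS_PER_LOCATION = 420_000 + 420_000

# Location counts quoted on the slide; the labels are for illustration only.
locations = {
    "completed to date": 40,
    "2009 target": 200,
    "2010 target": 2000,
}


def short_jobs(n_locations: int) -> int:
    """Total short jobs needed for the given number of locations."""
    return n_locations * JOBS_PER_LOCATION


totals = {label: short_jobs(n) for label, n in locations.items()}

for label, total in totals.items():
    print(f"{label}: {total:,} short jobs")
```

Even the locations already completed correspond to tens of millions of short jobs, which is beyond what manual submission and error handling could manage.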


CFD and Medicine (arterial flow)

George Karniadakis, Brown

• Strong relationship between blood flow pattern and formation of arterial disease such as atherosclerotic plaques

• Disease develops preferentially in separated and re-circulating flow regions such as vessel bifurcations

• 1D results feed 3D simulations, providing flow rate and pressure for boundary conditions

• Very clever multiscale approach

• Couples resources weakly in real time, but requires co-scheduling

• MPIg, partly supported by TeraGrid, used for intra-site and inter-site communications.

Figure: a 1D model exchanging 1D data with several 3D simulations.


Medical Impact

• Today, being used for validation and quantification of some of the pathologies

• With realistic geometries, part of the promise of patient-specific treatment


Biological Science

• Huge impact of TeraGrid

• Primarily large-scale molecular dynamics (MD) simulations (classical Newton's laws) that elucidate how structure leads to function.

• Major effort in scaling codes (e.g. AMBER, CHARMM, NAMD) to large distributed memory computers; very fruitful interaction between applications scientists and computer scientists (e.g. Schulten and Kale)

• When breaking chemical bonds, need quantum mechanical methods (QM/MM), often best done on large shared-memory systems

• Generate very large datasets, so data analysis is now becoming a serious concern. Considerable discussion of developing a repository of MD biological simulations, but no agreement yet on formats.
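As a reminder of what "classical Newton's laws" means in an MD code, here is a minimal sketch of one widely used time-stepping scheme, velocity Verlet, applied to a single particle on a spring. It is a toy under stated assumptions: one dimension, one particle, a harmonic force, and made-up parameter values. Production codes add realistic force fields, millions of atoms, and parallel decomposition, none of which is shown here.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate m * a = F(x) for one particle in 1D with velocity Verlet."""
    a = force(x) / mass
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt   # advance position
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # advance velocity with averaged acceleration
        a = a_new
    return x, v


# Toy system (illustrative values): a harmonic spring, F = -k * x.
k, mass = 1.0, 1.0


def spring(x):
    return -k * x


def energy(x, v):
    """Kinetic plus potential energy of the spring system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


x0, v0 = 1.0, 0.0
x1, v1 = velocity_verlet(x0, v0, spring, mass, dt=0.01, steps=10_000)

print("initial energy:", energy(x0, v0))
print("final energy:  ", energy(x1, v1))
```

The total energy should stay close to its starting value over many steps, which is one reason schemes of this family are popular for long MD runs.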


Aquaporins - Schulten group, UIUC

• Aquaporins are proteins which conduct large volumes of water through cell membranes while filtering out charged particles like hydrogen ions (protons).

• Start with known crystal structure, simulate over 100,000 atoms, using NAMD

• Water moves through aquaporin channels in single file. Oxygen leads the way in. At the most constricted point of the channel, the water molecule flips. Protons can’t do this.


Aquaporin Mechanism

Animation pointed to by 2003 Nobel chemistry prize announcement for structure of aquaporins (Peter Agre)

The simulation helped explain how the structure led to the function


Actin-Arp2/3 branch junction

Greg Voth-Utah

Actin-Arp2/3 branch junction

– key piece of the cellular cytoskeleton, helping to confer shape and structure to most types of cells.

– cannot be crystallized to obtain high-resolution structures.

– working with leading experimental groups, MD simulations are helping to refine the structure of the branch junction.

– 3M atoms, linear scaling to 4000 processors on Kraken.

The all-atom molecular dynamics simulations form the basis for developing new coarse-grained models of the branch junction to model larger scale systems.


HIV-1 Protease Inhibition:

• HIV-1 protease

– Essential enzyme in life cycle of HIV-1

– Popular drug target for antiviral therapy

– It has been hypothesized that mutations outside the active site affect the mobility of 2 gate-keeper molecular flaps near the active site, and this affects inhibitor binding

Tagged two sites on flaps, and used electron paramagnetic resonance measurements to measure the distance between them; excellent agreement with MD simulations.

Provides a molecular view of how mutations affect the conformation.

Wild type (black) & 2 mutants

Gail Fanucci (U. Florida) & Carlos Simmerling (Stony Brook)


Similar slides for

• Schulten, UIUC: The Molecular Basis of Clotting

• McCammon, UCSD: Virtual Screening Led to Real Progress for African Sleeping Sickness

• Baik, Indiana U.: Investigating Alzheimer’s (copper binding behavior)


Mechanical Engineering

Numerical Studies of Primary Breakup of Liquid Jets in Stationary and Moving Gases

Madhusudan Pai & Heinz Pitsch, Stanford; Olivier Desjardins, Colorado

• Liquid jet breakup in automobile internal combustion engines and aircraft gas turbine combustors controls fuel consumption and formation of engine pollutants.

• Immense economic and environmental significance

• Predicting the drop size distribution that results from liquid jet breakup is an important unsolved problem

• Current simulations (liquid Weber number ~3000 and Reynolds number ~5000) require upwards of 260M cells and typically about 2048 processors for detailed simulations.

• Physically more realistic simulations will require liquid Weber and Reynolds numbers 10x higher (1000x in computational complexity).

• Used Queen Bee (LONI) for code development and scaling, Ranger for production. Highly accurate direct numerical simulation (DNS) to develop parameters that will be used in larger scale studies (LES) of engines.

DNS of a diesel jet (left) and a liquid jet in crossflow (right)
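The slide does not say how the 1000x figure was derived. One common textbook estimate for DNS of turbulence, used here purely as an assumption, is that the number of grid cells grows like Re^(9/4) and the number of time steps like Re^(3/4), so total cost grows roughly like Re^3. The sketch below applies that estimate to a 10x increase; the function name and default exponents are illustrative, not taken from the authors' work.

```python
def dns_cost_factor(re_factor: float,
                    grid_exponent: float = 9 / 4,
                    step_exponent: float = 3 / 4) -> float:
    """Relative cost when the Reynolds number grows by re_factor.

    Assumes (textbook estimate, not from the slides) that cell count scales
    as Re**grid_exponent and time-step count as Re**step_exponent.
    """
    grid_growth = re_factor ** grid_exponent   # more cells to resolve small scales
    step_growth = re_factor ** step_exponent   # more, smaller time steps
    return grid_growth * step_growth


factor = dns_cost_factor(10.0)
print(f"estimated cost factor for 10x Reynolds number: {factor:.0f}x")
```

Under this assumed scaling, a tenfold increase in Reynolds number gives a cost factor of about 10^3, consistent with the figure quoted above.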


Materials Science

• How can spider silk be as strong as steel if the “glue” of hydrogen bonds holding it together is 1000x weaker than steel’s metallic bonds?

• Depends on the specific configuration of structural proteins as well as the hydrogen bonds.

• Used SDSC’s IBM Blue Gene (6144 processors) to simulate how spider silk compounds react at the atomic level to structural stresses.

• Discovered what governs the rupture strength of H-bond assemblies, confirmed by direct large-scale full-atomistic MD simulation studies of beta-sheet structures in explicit solvent

• This could help engineers create new materials that mimic spider silk’s lightweight robustness. Could also impact research on muscle tissue and amyloid fibers found in the brain, which have similar beta-sheet structures, composed of hierarchical assemblies of H-bonds

Spider Silk’s Strength (Markus Buehler, MIT)

Ross Walker (SDSC)

• Implemented a needed parallel version of some serial restraint codes in NAMD and LAMMPS, for efficient implementation on BG.

• Advised on how to distribute calculations across the BG

• Helped with visualization.


SIDGrid- Social Science Gateway

• SIDGrid provides access to “multimodal” data: streaming data that change over time. For example, as a human subject views a video, heart rate, eye movement, and a video of the subject’s facial expressions are captured. Data are collected many times per second, sometimes at different timescales, and synchronized for analysis, resulting in large datasets.

• The Gateway provides sophisticated analysis tools to study these datasets

• SIDGrid uses TeraGrid resources for computationally intensive tasks including media transcoding (decoding and encoding between compression formats), pitch analysis of audio tracks, and functional Magnetic Resonance Imaging (fMRI) image analysis.

• A new application framework has been developed to enable users to easily deploy new social science applications in the SIDGrid portal. SIDGrid launches thousands of jobs in a week.

• Gateway cited in publications in analysis of neuroimaging data, and in computational linguistics.

• Opening possibilities to the wider community, as others watch the pioneers.

TeraGrid staff will incorporate metascheduling capabilities, improve security models for community accounts, incorporate data-sharing capabilities, and upgrade workflow tools.

Rick Stevens et al., Argonne


New communities- Machine Learning

Tuomas Sandholm, CMU: Poker

– Poker is a game with imperfect knowledge

– Developing what appears to be the best computer poker capability

– Needs large shared memory

Rob Farber & Harold Trease, PNNL: Facial Recognition

• Import, interpret, and database millions of images per second

• Essentially real-time facial recognition

• Near-linear scaling across 60,000 cores (Ranger)


New communities

Virtual Pharmacy Clean Room Environment

Steve Abel, Steve Dunlop, Purdue University

• Created a realistic, immersive 3-D virtual pharmacy clean room for training pharmacy students, pharmacists and pharmacy technicians

• Enables evaluation of clean room design and work flow by engineering researchers.

• The 3-D model can be employed in multi-walled virtual environments. Eventual incorporation of force-feedback and haptic (touch and feel) technologies

• 160 students used the room in 2008; almost unanimously, they report that the experience has given them a better understanding of, and made them more comfortable with, the clean room environment and procedures.

TG Purdue staff helped the team in using TG’s distributed rendering service TeraDRE to render elements of the virtual clean room, including a fly-through movie, in less than 48 hours (would take five months on a single computer).
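The rendering times quoted above imply a rough speedup, worked out below. The only assumption added here is a 30-day month; because the slide says "less than 48 hours", the result is best read as a lower bound.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: treat a month as 30 days

# Five months on a single computer, converted to hours.
single_machine_hours = 5 * DAYS_PER_MONTH * HOURS_PER_DAY

# Upper bound on the distributed rendering time quoted on the slide.
distributed_hours = 48

speedup = single_machine_hours / distributed_hours
print(f"single machine: {single_machine_hours} hours")
print(f"speedup: at least about {speedup:.0f}x")
```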


Data-Driven Hurricane prediction

Fuqing Zhang, Penn State, with NOAA, Texas A&M collaborators

• Tracked hurricanes Ike and Gustav in real time

• Used ensemble forecasting, and 40,000 cores of Ranger to update predictions.

• First-time use of data streamed directly from NOAA planes inside the storm.


Impacts of TeraGrid on Scientific Fields

• HPC makes some fields possible as we know them, e.g. cosmology, QCD

• HPC adds essential realism to fields like biology, fluid dynamics, materials science, earthquake and atmospheric science

• HPC is beginning to impact fields like social science and machine learning

• TeraGrid allows increased modeling complexity (larger systems, more realism); projects increasingly use resources at multiple sites, using more of the repertoire of TeraGrid services

• Beyond powerful and diverse hardware, TeraGrid support enables users to use the hardware effectively


Transforming How We Do Science

• TeraGrid coordination among sites, making the necessarily heterogeneous resources into one system, leads to much higher researcher productivity.

• Faster turnaround leads to greater researcher productivity and changes the questions we ask in all disciplines.

• Visualization aids understanding

• Gateways open the field to many more researchers
