LCSE & PSC Demonstration of Exploratory Science through Interactively Driven Supercomputer Simulations
David Porter, Mike Knox, Jim Greensky, James Hansen, Paul Woodward
Laboratory for Computational Science & Engineering, University of Minnesota
Raghu Reddy & Nathan Stone
Pittsburgh Supercomputing Center




Page 1:

LCSE & PSC
Demonstration of Exploratory Science
through Interactively Driven Supercomputer Simulations

David Porter, Mike Knox, Jim Greensky, James Hansen, Paul Woodward
Laboratory for Computational Science & Engineering, University of Minnesota

Raghu Reddy & Nathan Stone
Pittsburgh Supercomputing Center

Page 2:

The Goal is Scientific Exploration:

Scientific productivity requires the ability to rapidly answer “What if?”

Batch processing on today’s supercomputers gives answers only in weeks or months.

Exploratory runs are now done mostly on small local resources, but these also take weeks or months.

Page 3:

Enabling Exploration is not about cost:

The cost of local or remote computation scales with the compute time used, not with the interval over which that use occurs.

Exploration can be enabled by moving smaller runs from slow local resources to fast supercomputers.

Page 4:

Why has this not already happened?

1. Today’s supercomputers are not efficient on smaller problems.

2. Today’s scheduling of supercomputer runs discourages rapid turnaround of smaller runs.

3. Effective interaction with a fast, smaller run requires prompt graphics and prompt response to user decisions.

Page 5:

We have addressed causes 1 and 3:

PSC’s Cray Red Storm system has a fast, low-latency interconnect that permits smaller runs to be efficient on the full machine.

The emerging National LambdaRail network permits prompt graphical output at remote user sites and prompt supercomputer response to user commands.

Page 6:

Efficiency of small runs:

Implementing small runs on large systems demands that the tasks executed in parallel be small.

Each task:
1. Reads a data context.
2. Operates on the data context in private.
3. Writes the resulting data.
4. Selects the next task.
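
The four-step cycle above maps naturally onto a simple worker loop. Here is a minimal sketch in Python, with an in-memory store standing in for the real data movement; update_brick and the task tuples are invented for this illustration, not names from the LCSE code.

import queue

def update_brick(context, dt):
    # Stand-in for the PPM grid brick update; here just a trivial no-op step.
    return [x + 0.0 * dt for x in context]

def worker_loop(task_queue, store):
    # One worker repeating the four-step task cycle until a sentinel arrives.
    while True:
        task = task_queue.get()
        if task is None:                       # sentinel: no further task to select
            break
        brick_id, dt = task
        context = list(store[brick_id])        # 1. read a private copy of the data context
        result = update_brick(context, dt)     # 2. operate on the context in private
        store[brick_id] = result               # 3. write the resulting data back
        # 4. loop back and select the next task from the queue

# Tiny usage example: four bricks, each updated once.
store = {i: [float(i)] * 8 for i in range(4)}
tasks = queue.Queue()
for i in range(4):
    tasks.put((i, 0.01))
tasks.put(None)
worker_loop(tasks, store)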

Page 7:

Efficiency of small tasks:

The computation-to-communication ratio decreases with task size.

A smaller data context means less opportunity for data reuse.

A smaller data context means a larger surface-to-volume ratio.

A smaller data context means smaller vector lengths.

The interconnect is the key item.
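
A quick back-of-the-envelope calculation makes the surface-to-volume point concrete. The ghost-cell width of 2 used below is an assumed value for illustration only:

def halo_overhead(n, ghost=2):
    # Ratio of ghost (communicated) cells to interior (computed) cells
    # for a cubical brick of n**3 cells with the given ghost-cell width.
    interior = n ** 3
    total = (n + 2 * ghost) ** 3
    return (total - interior) / interior

for n in (64, 32, 16, 8):
    print(f"{n}^3 brick: halo/interior = {halo_overhead(n):.2f}")

With these assumptions a 64^3 brick communicates roughly a fifth of its interior volume, while an 8^3 brick communicates more than twice its interior volume.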

Page 8:

Benefits of Cray Red Storm system:

Processors are fast: 1 Gflop/s on the grid brick update, 0.867 Gflop/s with all costs included.

Interconnect is fast: several times Myrinet speed, with very low latency.

The machine scales to large configurations: 2000 CPUs at PSC.

Page 9:

Getting the task size down:

To reduce the task size while keeping the surface-to-volume ratio and vector lengths up, we might:
1. Do just a single time-step update.
2. Or, better, do just a single 1-D pass.
3. Or, better still, do just a single 1-D grid pencil update.

Option 3 requires a 2-level hierarchy of parallel task management.
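
One way to picture the 2-level hierarchy is a top level that hands out whole bricks and a bottom level that hands out the 1-D pencil updates within a brick. The sketch below only enumerates pencil tasks; the sizes match the 8×8×64 decomposition mentioned on the next slide, but the function itself is purely illustrative:

BRICK = 64          # cells per side of one grid brick
PENCIL_CROSS = 8    # cross-section of one pencil task: 8 x 8 x 64 cells

def pencil_tasks(brick_id, brick=BRICK, cross=PENCIL_CROSS):
    # Bottom level of the hierarchy: the 1-D pencil updates that make up one brick.
    for j0 in range(0, brick, cross):
        for k0 in range(0, brick, cross):
            # each tuple describes one pencil update along the x-direction
            yield (brick_id, j0, k0, cross, cross, brick)

print(len(list(pencil_tasks(brick_id=0))), "pencil tasks per brick")   # 64 tasks of 8 x 8 x 64 cells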

Page 10:

Where we are today:

Each CPU updates a grid brick of 64³ cells for a single time step.

This takes 1.1 sec. @ 867 Mflop/s per CPU.

A run on a 512³ grid needing 5,000 steps takes 1.5 hours on 512 CPUs.

Breaking the present tasks into 64 separate 8×8×64 cell updates handled by teams of 16 CPUs should bring the running time down to just 23 min.
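
The 1.5-hour figure follows directly from the per-brick cost; as a quick check of the arithmetic (my own, not from the slides):

# 512^3 grid split into 512 bricks of 64^3 cells, one per CPU, so each time step
# costs one 1.1 s brick update; 5,000 steps follow.
sec_per_step = 1.1
steps = 5000
print(f"{sec_per_step * steps / 3600.0:.1f} hours")   # ~1.5 hours, matching the slide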

Page 11:

Prompt Graphics & Interactive Control:

These 2 functions must go together.

The user must be able to generate the desired view of the desired variable on demand.

The code responds on the next time step with a stream of graphical images, not just one snapshot, so the user can see the dynamics.

For speed, several variables and views are pre-defined.

Anything desired can be viewed, with some lag.
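
To make the interaction concrete, here is a rough sketch of how a time-step loop could honor these settings; all of the names (Control, advance_one_step, dump_graphics) are invented for this illustration, and pause/continue handling is omitted:

from dataclasses import dataclass

@dataclass
class Control:                      # fields mirror the settings on the next two slides
    command: str                    # "Run To", "Stop", ...
    target_step: int
    steps_per_dump: int
    variable: str

def advance_one_step(step):
    return step + 1                 # stand-in for the PPM update of the whole grid

def dump_graphics(step, variable):
    print(f"step {step}: graphics dump of {variable}")

def run(read_control, step=0):
    # Re-read the user's control settings each step and emit a stream of dumps.
    while True:
        ctl = read_control()
        if ctl.command == "Stop" or (ctl.command == "Run To" and step >= ctl.target_step):
            break
        step = advance_one_step(step)
        if step % ctl.steps_per_dump == 0:     # a stream of images, not one snapshot
            dump_graphics(step, ctl.variable)
    return step

# Usage: run to step 20, dumping every 5 steps.
run(lambda: Control("Run To", 20, 5, "vorticity"))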

Page 12:

User Control Interface:

A GUI on a Windows PC modifies the contents of a small text file.

Commands: Start, Pause, Continue, Run To, Stop.

Specifies the rectangular solid of bricks in which graphical data is generated.

Specifies the number of time steps per dump.

Specifies which variable to view.

Specifies viewing parameters on the local system (color, opacity, view).

Page 13:

User Control Interface (continued):

Sends commands for run initialization (the parameters of this run).

Can modify the parameters of the graphical output (what, where, how often, how it is viewed locally) on the fly.

Can pause, continue, stop, and restart the run.

Can modify what output is generated for archiving, and where it is sent.
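
Because the run is driven through a small text file, the control data itself can be as simple as a handful of keyword lines. The layout and keyword names below are purely illustrative guesses, not the actual LCSE format:

# Write a control file of the sort the GUI might maintain (invented format).
control_text = """\
command        = RUN_TO 2000      # Start | Pause | Continue | Run To <step> | Stop
graphics_box   = 0 0 0  7 7 7     # rectangular solid of bricks to render
steps_per_dump = 10               # time steps between graphics dumps
variable       = vorticity        # which variable to view
"""
with open("run_control.txt", "w") as f:
    f.write(control_text)

Viewing parameters such as color, opacity, and view direction stay on the local rendering system, as slide 12 notes, so they need not travel in this file.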

Page 14:

I/O Pipeline, 30 MB/sec PSC to LCSE:

Each CPU on Red Storm writes a separate data brick over Portals.

These are concatenated and dispatched over the network in 11 streams.

The streams arrive at the user site (LCSE).

A daemon constructs a standard HVR volume-rendering data file.

The daemon broadcasts the HV files to the 10 PowerWall rendering nodes over InfiniBand.
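
To get a feel for what the 30 MB/sec pipeline supports, here is a rough estimate (my own assumptions: one byte per cell of the rendered variable and the full 512^3 volume shipped per dump):

cells = 512 ** 3
bytes_per_dump = cells * 1                     # ~134 MB per graphics dump
seconds = bytes_per_dump / 30e6                # at 30 MB/s aggregate over the 11 streams
print(f"{bytes_per_dump / 1e6:.0f} MB per dump, about {seconds:.0f} s to transfer")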

Page 15:

PowerWall Image Rendering:

10 Dell PC workstations each render their own portion of a single image.

Several images are generated per second at the full 13-Mpixel PowerWall resolution.

The InfiniBand network allows the data broadcast from a node on the network to keep up with the data flow.

4 TB of fast local disk space on each node can hold the accumulating data.

Page 16:

A Brief History of Interactive Simulation with PPM:

1978, 20 min. 1-D run of twin blast wave problem on Cray-1 with prompt line plots on TV in office.

1984, 6 hour weekend day run of 2-D jet on 4-CPU Cray-XMP with contour plots on TV in office, but color graphics only days later on Dicomed film recorder.

Page 17:

A Brief History of Interactive Simulation with PPM:

1988, 30 min. simulation of convection on a coarse 2-D grid on the Cray YMP in Eagan, with prompt graphics in the Cray exhibit booth at the Supercomputing 88 conference in Florida.
D. Porter, P. Woodward, D. Ofelt, U. Mn.; C. Kirchhof, Cray.

Page 18:

A Brief History of Interactive Simulation with PPM:

1993, 15 minute run of flow over a deformable obstacle on a 256² grid on a 36-CPU SGI Challenge XL server, with graphics on the console and user interaction at SC93 in Portland.
K. Chin-Purcell, D. Porter, P. Woodward, U. Mn.; D. Pero, SGI.

Page 19:

A Brief History of Interactive Simulation with PPM:

1994, 5 minute run of 2-D flow around a user-alterable obstacle on a 1024×512 grid on 512 CM-5 nodes @ 4 Gflop/s, with graphics on an SGI at AHPCRC.
B. K. Edgar, T. Varghese, T. Parr, D. Porter, P. Woodward, U. Mn.; K. Fickie, BRL.

Page 20:

A Brief History of Interactive Simulation with PPM:

2000, 2-D flows on grids of 256×128 on a Dell laptop @ 280 Mflop/s in minutes, with a VB6 GUI.
P. Woodward.

Page 21:

A Brief History of Interactive Simulation with PPM:

2001, 3-D advection of smoke streams in a prescribed hurricane flow field on a 32-CPU Itanium cluster @ 16 Gflop/s, with prompt 3-D graphics on a PC over fast Ethernet at SC2001 in Denver.
S. Anderson, D. Porter, P. Woodward, U. Mn.; R. Wilhelmson, NCSA.

Page 22:

A Brief History of Interactive Simulation with PPM:

2003, 2-D multifluid flow on a grid of 512×384 cells on a 15-CPU Unisys ES-7000 in 4 minutes @ 6 Gflop/s, with prompt graphics to a remote location at the Minnesota State Fair.
P. Woodward, B. Allen, S. Anderson, D. Porter, U. Mn.; J. Chase, Fond du Lac.

Page 23:

2300 fairgoers were blown away by the ES7000 at the LCSE in August, 2003.

Page 24:
Page 25:
Page 26:

A Brief History of Interactive Simulation with PPM:

2005, 3-D shear layer run on a grid of 96³ cells in 45 minutes on a Dell laptop @ 1.3 Gflop/s.
P. Woodward.

Page 27:

A Brief History of Interactive Simulation with PPM:

2005, 3-D shear layer run on a grid of 512³ cells in 2 hours on the PSC Cray Red Storm (using 512 CPUs), with prompt 3-D graphics on the LCSE PowerWall. (Well, almost, but this really works in San Diego.)
This project (see names on the title slide).

Page 28:

System now under construction in the LCSE.

Dell PC nodes can act as intelligent storage servers and also as image generation engines.

Dell 670n: dual 3.6 GHz Xeon EM64T, 8 GB DDR2 SDRAM.

Page 29:

Prototyping Effort Now:

We have 14 Dell nodes, each with:
- Dual P4 Xeon @ 3.6 GHz
- 8 GB memory
- nVidia Quadro 4400 graphics card
- 12 Seagate 400 GB SATA disks
- 3Ware 12-channel SATA controller
- InfiniBand 4X (Topspin) HCA

10 IB 4X links to a Unisys ES7000 with 32 Itanium-2 CPUs and 64 GB memory.