April 4-7, 2016 | Silicon Valley
Peter Messmer, 4/4/2016
SCIENTIFIC VISUALIZATION IN HPC
"Yes," said Deep Thought, "I can do it."
[Seven and a half million years later.... ]
"The Answer to the Great Question... Of Life, the Universe and Everything...
Is... Forty-two," said Deep Thought, with infinite majesty and calm.
— Douglas Adams, Hitchhiker’s Guide to the Galaxy
HIGH PERFORMANCE COMPUTING TODAY*
*mostly
[Figure: accuracy vs. latency chart, built up over several slides. Latency axis: Month, Week, Day, Hour, 5 min, 100 ms, 30 ms, 10 ms. Accuracy axis: Sit in it, Has Engine, Moves, Flies. Labels added in order: HPC application; Action Game; Flight Simulator; CG Movie; Parameter Space Exploration, Approximate models; and, in the final build, "Opportunity!" with Explorative Science, Real-time systems.]
BONSAI WITH IN-SITU VIZ ON PIZ DAINT
Presented at SC14, streaming from CSCS/Switzerland to New Orleans
Compute & Vis on 1024 GPU nodes, live streaming
J. Bedorf, E. Gaburov, P. Messmer, S. Portegies Zwart
VISUALIZATION ≠ RENDERING*
*but it's a part of it
- Coordinate transformations
- Feature extraction
- Thresholding
- Isosurfaces, Isovolumes
- Streamlines
- Field Operators (Gradient, Curl, ..)
- Clip, Slice
- Binning, Resample
- Surface Rendering
- Volume Rendering
- Line Rendering
- Compositing
VISUALIZATION PIPELINE
- Analysis: Data processing to extract meaningful quantities of interest
- Filtering: Conversion of simulation data into data ready for rendering
- Rendering: Conversion of shapes to pixels (Fragment processing)
- Compositing: Combination of independently generated pixels into final frame
Your typical scientific visualization system
[Diagram: Simulation -> Visualization (Analysis & Filtering -> Rendering -> Compositing -> Delivery)]
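The four pipeline stages can be sketched as plain functions. This is a minimal illustration, not VTK or ParaView code: the function names, the toy 1D scalar fields, and the character "pixels" are all assumptions made for the example.

```python
# Toy visualization pipeline: analysis -> filtering -> rendering -> compositing.
# All names and data are hypothetical; real tools (VTK, ParaView) differ.

def analyze(field):
    """Analysis: extract a quantity of interest (here, the field maximum)."""
    return max(field)

def threshold_filter(field, cutoff):
    """Filtering: keep only the sample indices worth drawing."""
    return [i for i, value in enumerate(field) if value >= cutoff]

def render(indices, width):
    """Rendering: turn shapes (point indices) into a row of pixels."""
    pixels = [" "] * width
    for i in indices:
        pixels[i] = "#"
    return pixels

def composite(rows):
    """Compositing: merge independently rendered rows into one frame."""
    frame = [" "] * len(rows[0])
    for row in rows:
        for x, pixel in enumerate(row):
            if pixel != " ":
                frame[x] = pixel
    return frame

# Two "nodes" each hold part of the simulation output.
node_a = [0.1, 0.9, 0.2, 0.0, 0.0, 0.0]
node_b = [0.0, 0.0, 0.0, 0.8, 0.3, 0.7]

peak = max(analyze(node_a), analyze(node_b))
cutoff = 0.5 * peak
rows = [render(threshold_filter(f, cutoff), 6) for f in (node_a, node_b)]
frame = "".join(composite(rows))
print(frame)
```

Each node renders only its own data, and only the compositing step needs to see pixels from every node.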
VISUALIZATION TOOLKIT (VTK)
Focus on visualization, not (only) rendering
Provides more complex operations on data (“filtering”)
Visualization pipeline
At the core of many high-level viz tools
ParaView, VisIt, ..
Developed by Kitware, open source
http://www.vtk.org
Venerable backbone of scientific visualization
Tue - 15:00 : S6193 - Visualization Toolkit: Improving Rendering and Compute on GPU's
VTK-M
Ongoing development (Sandia, Kitware, ORNL, ..)
VTK type filters and more fine-grained “worklets”
Platform portable (GPU, multicore CPU)
Thrust, TBB backend
http://m.vtk.org
Visualization algorithms on modern architectures
Wed - 15:00 : S6352 - Adapting the Visualization Toolkit for Many-Core Processors with the VTK-m Library
TYPICAL HPC ENVIRONMENT
Workstation on scientist’s desk
Remote HPC center
Compute nodes not directly accessible
Output from compute nodes:
- File transfer
- X forwarding
- Remote rendering
[Diagram: Workstation (GPU), Login Node, Compute Nodes (GPU)]
X-FORWARDING
ssh -Y loginnode.edu
ssh -Y computenode
No extra process on compute node
Rendering by workstation GPU
X server on workstation needed
Often prohibitively slow
Be prepared for latencies
REMOTE RENDERING
Know your latencies
Use compute node’s GPU for rendering
Capture renderings and ship pixel data to user
Compression is key
Requires running X server on compute node*
* Requirements will change with EGL
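To see why compression is key, it helps to estimate the raw pixel bandwidth of a stream. The sketch below assumes a 1920x1080 RGB stream at 30 frames per second (numbers chosen for illustration, not taken from the talk) and uses Python's `zlib` as a lossless stand-in codec. A production remote-rendering setup would more likely use a video codec such as H.264.

```python
import zlib

def raw_bandwidth_bytes_per_s(width, height, bytes_per_pixel, fps):
    """Uncompressed bandwidth needed to ship every rendered frame."""
    return width * height * bytes_per_pixel * fps

def compression_ratio(frame_bytes):
    """Ratio of raw size to zlib-compressed size (lossless stand-in codec)."""
    return len(frame_bytes) / len(zlib.compress(frame_bytes))

# Assumed example stream: 1920x1080, 3 bytes per pixel (RGB), 30 frames/s.
raw = raw_bandwidth_bytes_per_s(1920, 1080, 3, 30)
print(f"raw stream: {raw / 1e6:.1f} MB/s")

# Synthetic frame: a horizontal gradient repeated on every row. Real
# renderings compress differently; this only shows the mechanism.
row = bytes((x * 255) // 1919 for x in range(1920) for _ in range(3))
frame = row * 1080
ratio = compression_ratio(frame)
print(f"zlib ratio on synthetic frame: {ratio:.0f}x")
```

The ratio printed for the synthetic frame says nothing about real renderings. The point is only that the raw stream is large enough that some form of compression is needed over a wide-area link.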
Remote visualization on Blue Waters
Stellar combustion visualized on Blue Waters (26 TB dataset)
Paul Woodward, U. Minnesota: HVR w/ OpenGL on Blue Waters
Improvement in time to solution:
- 6 GPUs in local viz cluster: 48 days (data transfer + rendering)
- 128 GPUs in HPC center: 1 day
• Limited resources in the local viz cluster
• Long data transfer times
48x speedup by using the Tesla GPUs in the HPC center
Data courtesy of John Stone, UIUC
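The 48x figure is the ratio of the two times to solution (48 days versus 1 day). The sketch below reproduces that ratio and estimates how long moving a 26 TB dataset would take. The 100 MB/s sustained transfer rate is a hypothetical value chosen for illustration and does not come from the slides; 1 TB is taken as 10^12 bytes.

```python
SECONDS_PER_DAY = 86400.0

def speedup(baseline_days, accelerated_days):
    """Ratio of the two times to solution."""
    return baseline_days / accelerated_days

def transfer_days(dataset_bytes, bandwidth_bytes_per_s):
    """Days needed to move a dataset at a sustained bandwidth."""
    return dataset_bytes / bandwidth_bytes_per_s / SECONDS_PER_DAY

ratio = speedup(48, 1)
print(f"speedup: {ratio:.0f}x")

# Hypothetical sustained rate of 100 MB/s (assumption, not from the talk).
days = transfer_days(26e12, 100e6)
print(f"moving 26 TB at 100 MB/s: about {days:.1f} days")
```

Under that assumed rate the transfer alone takes about three days, and rendering in place at the HPC center avoids it entirely.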
REMOTE RENDERING: VIDEO ENCODING
Open source approaches: TurboVNC + VirtualGL
Currently not leveraging HW H264 encoder
Commercial tools (e.g. Nice DCV)
Leverages HW encoder
No application modification needed
https://www.nice-software.com/products/dcv
NvENC library to access HW H264 encoder (lossless on Maxwell)
https://developer.nvidia.com/nvidia-video-codec-sdk
Interactivity over large distances
Tue - 13:00: S6253 - VMD: Petascale Molecular Visualization and Analysis with Remote Video Streaming
OPENGL: GPU ACCELERATED RENDERING
• Primitives: points, lines, polygons
• Properties: colors, lighting, textures, ..
• View: camera position and perspective
• Shaders: Rendering to screen/framebuffer
• C-style functions, enums
See e.g. “What Every CUDA Programmer Should Know About OpenGL”
(http://www.nvidia.com/content/GTC/documents/1055_GTC09.pdf)
Mon - 10:00 : S6817 - High-Performance, Low-Overhead Rendering with OpenGL and Vulkan
Mon - 14:00 : H6139 - Hangout: Maximizing Performance of CUDA and OpenGL Applications
VIS TOOLS EMBRACE OPENGL ON EGL
Prior to EGL: X server required for GPU accelerated OpenGL
Full OpenGL on EGL announced at SC15
With EGL: OpenGL without X
Major enabler for GPU rendering in HPC, incl. Cray systems*
Quick adoption by vis tool developers
https://devblogs.nvidia.com/parallelforall/egl-eye-opengl-visualization-without-x-server/
* Driver version 358.7 or newer required
Streamlined GPU accelerated off-screen rendering
USE CASE: CEI ENSIGHT 10.1.6D
“Customers post-process large collections of simulations offline.
Even though rendering takes place offscreen, EnSight requires that an X server is running, and is open enough to allow users to access it. At some sites, it is unacceptable to configure an X server to have wide open access.
By using EGL to make our GL context and pbuffer, we can remove the need to start an X server, and solve all of these system management problems.”
Dave Bremer, CEI
EGL for batch renderer
Image by Astec Inc
PARAVIEW: DESIGNED FOR HPC
Supports all major scientific data formats
Extensive collection of “operators”: isosurface, streamlines, volume renderer, reductions, custom operators, ..
Approachable user interface
Supports remote, distributed memory vis
Open source, free
Extensible via plugins
Satisfying scientific computing needs
MODERN OPENGL FOR HPC VIZ
VTK now supports OpenGL 3.2
Enables advanced shaders (AO, VXGI, ..)
Some algorithms well suited for distributed memory rendering
GPU hardware support
Mandatory to access advanced rendering features
Data courtesy Florida Intl University & TACC
OPENGL RENDERING POWERHOUSE
OpenGL vs OpenSWR
[Chart: OpenGL vs OpenSWR rendering performance; bigger is better]
OPENGL NOT LIMITED TO RENDERING TASKS
CUDA->OpenGL typically one-way only
EGL enables lighter weight access to OpenGL
No X server needed
Potential use of OpenGL for rasterization-like problems?
Determine covered “pixels”
3D ordering/occlusion via Z-buffer
Interop goes both ways, esp with EGL
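The two rasterization-like building blocks named above, pixel coverage and Z-buffer occlusion, can be illustrated on the CPU. The sketch below is plain Python rather than OpenGL code: the function name and the fragment tuples are assumptions for the example, and it only shows the per-pixel depth test that graphics hardware performs.

```python
import math

def resolve_visibility(width, height, fragments):
    """Return a grid holding the id of the nearest fragment per pixel.

    fragments: iterable of (x, y, depth, item_id); smaller depth is closer.
    Pixels that no fragment covers stay None.
    """
    depth = [[math.inf] * width for _ in range(height)]
    owner = [[None] * width for _ in range(height)]
    for x, y, z, item_id in fragments:
        if 0 <= x < width and 0 <= y < height and z < depth[y][x]:
            depth[y][x] = z          # depth test passed: new nearest depth
            owner[y][x] = item_id
    return owner

fragments = [
    (0, 0, 5.0, "a"),
    (0, 0, 2.0, "b"),   # closer than "a", so it wins pixel (0, 0)
    (1, 0, 3.0, "a"),
    (5, 5, 1.0, "c"),   # outside the 2x1 grid, ignored
]
visible = resolve_visibility(2, 1, fragments)
print(visible)
```

Any problem that reduces to "which item is nearest at each grid cell" has this shape, which is why the rasterizer and Z-buffer are candidates for it.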
SCALABLE RENDERING AND COMPOSITING
Large-scale (volume) data visualization
Interactive visualization of TB of data
Stand-alone or coupling into simulation
HW Accelerated remote rendering
Plugin for ParaView
http://www.nvidia-arc.com/products/nvidia-index.html
NVIDIA INDEX
Mon - 10:00: S6590 - HPC Visualization Using NVIDIA IndeX™
RC coming out soon.. Email for enquiries
SCALABLE VOLUME RENDERING IN PARAVIEW
IndeX plugin addresses shortcomings in ParaView's built-in volume renderer
Beta version supports
- 3D structured, scalar grids
- 32-bit float, 16-bit/8-bit uint
- Overlay of opaque ParaView geometries (e.g. streamlines)
Free plugin, requires commercial IndeX license
Plugin enables GPU accelerated volume rendering
Tue - 16:00 : S6670 - Toward Bridging the Gap Between High Quality and High Performance for HPC Visualization
Advanced Rendering in scientific visualization
Two lights, no shadows
Two lights, hard shadows, 1 shadow ray per light
Ambient occlusion + two lights, 144 AO rays/hit
• Ray tracing offers ambient occlusion lighting, shadows, high quality transparent surfaces
Better insight with visual cues
Courtesy of John Stone, UIUC
OPTIX RAY TRACING FRAMEWORK
• GPU accelerated ray-tracing framework
• Build your own RT application
• Generic Ray-Geometry interaction
• Rays with arbitrary payloads
• Multi-GPU support
Tue - 14:00 : H6148 - Hangout: CUDA for HPC Simulation and Visualization
Tue - 14:00 : H6150 - Hangout: OptiX Ray Tracing Library: Best practices and Use-Case Consultation
Wed - 10:30 : S6320 - Opticks: Optical Photon Simulation for High Energy Physics with NVIDIA OptiX™
TELLING A BETTER STORY, VISUALLY
Advanced rendering can help convey the visual message, e.g. guiding the eye via depth of field
Particularly useful for complex visualizations
Interactive ray-tracing via NVIDIA Iray
Post-processing of ParaView files
Advanced rendering improves messaging
Mon - 15:00 : H6142A - Hangout: Iray® Rendering for Developers
PARALLEL COMPOSITING WITH ICE-T
• Each node renders fraction of image
• Sort-last compositing
• Widely used (ParaView, VisIt, ..)
• Critical element for real-time viz
• Up to 30 fps for 4k frames on 1024 nodes
• Cray XC30, Piz Daint @ CSCS
http://icet.sandia.gov
Tue - 14:30 : S6808 - Image Compositing on GPU-Accelerated Supercomputers
Modern networks remove compositing bottleneck
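The per-pixel rule behind sort-last compositing can be sketched in a few lines. This is not IceT's API or its parallel algorithm, only the depth comparison applied when partial images from different nodes are merged. The data layout and names are assumptions for the example, and the frame-size estimate assumes 3840x2160 RGBA pixels.

```python
def composite_sort_last(partials):
    """Merge per-node images by keeping, per pixel, the nearest fragment.

    partials: list of images; each image is a list of (color, depth) pixels.
    A pixel a node did not cover carries depth float("inf").
    """
    result = list(partials[0])
    for image in partials[1:]:
        for i, (color, depth) in enumerate(image):
            if depth < result[i][1]:
                result[i] = (color, depth)
    return [color for color, _ in result]

EMPTY = (None, float("inf"))
node0 = [("red", 1.0), EMPTY, ("red", 4.0)]
node1 = [("blue", 2.0), ("blue", 3.0), ("blue", 0.5)]
final = composite_sort_last([node0, node1])
print(final)

# Raw size of one 4K RGBA frame, to put "30 fps for 4k frames" in context.
frame_bytes = 3840 * 2160 * 4
print(frame_bytes * 30 / 1e9, "GB/s uncompressed at 30 fps")
```

Because every node contributes pixels to the final frame, the exchange is communication-bound, which is why network performance decides whether compositing becomes a bottleneck.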
HIGH FRAMERATE = MINIMAL IMPACT ON SIMULATION
Real-time visualization only one use case
Batch processing will not go away
Acceptable time budget for visualization/analysis
Up to the I/O time, ~ 2 %
More diagnostics in the same time
E.g. ParaView Cinema
FPS matter, even in HPC
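The time-budget argument is simple arithmetic: if visualization may consume about 2% of the run time, a faster renderer produces more images within that same budget. The sketch below uses a hypothetical 10-second simulation step and a 30 fps renderer; both numbers are assumptions for illustration.

```python
def vis_budget_seconds(step_seconds, budget_fraction=0.02):
    """Time available for visualization/analysis per simulation step."""
    return step_seconds * budget_fraction

def frames_in_budget(step_seconds, fps, budget_fraction=0.02):
    """How many frames a renderer running at `fps` fits into that budget."""
    return vis_budget_seconds(step_seconds, budget_fraction) * fps

# Hypothetical numbers: a 10 s simulation step and a 30 fps renderer.
budget = vis_budget_seconds(10.0)
frames = frames_in_budget(10.0, 30.0)
print(f"budget: {budget:.2f} s, frames per step: {frames:.1f}")
```

Doubling the frame rate doubles the number of views or diagnostics that fit in the same budget, which is what matters for batch image generation such as ParaView Cinema.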
VISUALIZATION-ENABLED SUPERCOMPUTERS
CSCS Piz Daint: Galaxy formation
http://blogs.nvidia.com/blog/2014/11/19/gpu-in-situ-milky-way/
NCSA Blue Waters: Molecular dynamics
http://devblogs.nvidia.com/parallelforall/hpc-visualization-nvidia-tesla-gpus/
ORNL Titan: Cosmology
http://www.sdav-scidac.org/29-highlights/visualization/66-accelerated-cosmology-data-anal.html
COMPUTE+VIS SUPPORTS MULTIPLE WORKFLOWS
- Legacy workflow: Separate compute & vis system. Communication via file system.
- Co-processing: Compute and visualization on same GPU. Communication via host-device transfers or memcpy.
- Partitioned system: Different nodes for different roles. Communication via high-speed network.
IN-SITU VIS: ADVANTAGES AND OPPORTUNITIES
- Minimize IO traffic
- Exploit data locality
- Less pressure on file system
- Less time wasted in I/O
- Reduce latency to first result
- Monitoring, early termination
- Enable real-time/interactive visualization
- Novel workflows, new applications
[Diagram: Workstation, File System, GPU-accelerated Supercomputer]
Tue - 9:00 : S6633 - Navigating the In-Situ Visualization Landscape
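Monitoring and early termination can be pictured as a callback inside the simulation loop. The sketch below is a toy model, not the API of any in-situ library: `run_simulation`, the `monitor` callback, and the growth factor standing in for a solver step are all hypothetical.

```python
def run_simulation(max_steps, vis_every, monitor):
    """Advance a toy simulation, handing live data to `monitor` in situ.

    `monitor(step, state)` returns False to request early termination.
    Returns the number of steps actually executed.
    """
    state = 1.0
    for step in range(1, max_steps + 1):
        state *= 1.5                      # stand-in for one solver step
        if step % vis_every == 0 and not monitor(step, state):
            return step                   # stop without writing any files
    return max_steps

snapshots = []

def monitor(step, state):
    snapshots.append((step, state))       # stand-in for rendering a frame
    return state < 100.0                  # diverging run: ask to stop

steps_run = run_simulation(max_steps=1000, vis_every=2, monitor=monitor)
print(steps_run, len(snapshots))
```

The data never leaves the process: the monitor sees the live state, and a run that has gone wrong stops after a handful of steps instead of filling the file system first.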
COSMO WITH IN-SITU VISUALIZATION
COSMO-1 model, operational weather model in Switzerland
6 nodes Cray CS-Storm system
8 K80 GPUs/node
96 GPU sockets total
~ 20s per 0.7s
NVIDIA IndeX for visualization
Tue - 13:30: S6628 - Co-Designing GPU-Based Systems and Tools for Numerical Weather Predictions
Live, Interactive Weather Simulation
IN-SITU VISUALIZATION ON TITAN
First prototype of ParaView in-situ visualization capabilities in PyFR (CFD) simulations, predicting jet engine acoustics
Both compute and visualization running on Titan GPUs and streaming to a remote location
“When running PyFR at scale, it generates very large data sets that need analyzing for acoustics. The traditional post hoc method is simply not fit for purpose – in situ visualization and processing are critical. We see a potential for 50x speed ups with in situ, which significantly accelerates our scientific discovery”
- Dr. Peter Vincent, Imperial College
Thu - 10:30 : S6329 - Petascale Computational Fluid Dynamics with Python on GPUs
Tue - 15:00 : S6193 - Visualization Toolkit: Improving Rendering and Compute on GPU's
VISUALIZATION IN HPC
Leverage graphics capabilities on heterogeneous nodes
GPUs offer features for visualization, rendering, remote viz
Modern networks enable parallel compositing at massive scale
Graphics capabilities may help for graphics-like algorithms
Supported by popular visualization tools
Fast rendering relevant even for batch processing
In-situ visualization for monitoring, steering, and other novel workflows
THANK YOU
JOIN THE NVIDIA DEVELOPER PROGRAM AT developer.nvidia.com/join