Large Scale Visualization using PC Clusters

ClusterWorld 2003

Brian Wylie
Sandia National Laboratories

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.


Outline

• Hardware Platform
• Software Platform
• Data Distribution
• Parallel Rendering
• Parallel Volume Rendering
• Other Techniques

Hardware Platform

Visualization Clusters

Wilson (Classified)
• 64 nodes
• 800 MHz P3 CPU
• GeForce3 cards
• Myrinet 2000 interconnect

Europa (Unclassified)
• 128/256 Dell workstations
• Dual 2.0 GHz P4 Xeon
• GeForce3 cards
• Myrinet 2000 interconnect
• 1.27 TFLOP on Linpack
• #32 on Top 500

[Photos: Wilson, Europa]

VIEWS Visualization Corridor

• DLP projector alignment: ‘Blue Girl’ color test image, first 48-projector alignment (March 2, 2002)
• LLNL PPM dataset: first 48-tile rendering, 470M polygons with 64 nodes (April 5, 2002)

VIEWS Visualization Corridor

Display Screens
• Seamless 12m x 3m RP
• 3 sections, each a 4x4 array
• 0.5" Stewart FilmScreen AeroGlass100

Projection Systems
• 3 16-projector arrays
• Primary projectors are DPI 1280x1024, 3500 lumen, 3-chip DLP (DMD)

Data and Visualization “Corridor” metaphor:

A wide path through which massive quantities of data can easily flow, and through which scientists and engineers can explore data and collaborate.

Software Platform

• Use the Visualization Toolkit (VTK) as a framework.
• Only replace small parts of the framework with our research codes; the rest is tried and tested by a whole community.
• VTK is criticized as being too slow… If necessary we will rewrite the slow parts.

Example: Poly Data Mappers

New classes perform significantly better:
• Standard VTK mapper: 4.5 Mtri/sec.
• New generic 3D accelerated mapper: 9.5 Mtri/sec.
• New nVidia accelerated mapper: 20 Mtri/sec.

New mappers available in ParaView now (through factory method and dynamic loading).

More accelerated modules to become part of the supercharged “GT” component project.

VTK Contributions: The Sandia Vis Gang

• 3459344 Lee Ann Fisk. Crimes: D3; Possession with Intent to Distribute
• 9345325 Brian Wylie. Crimes: Shadow rendering; Racketeering
• 9023842 Gary Templet. Crimes: vtkSNL Build System; Repository Maintenance; Aiding and Abetting
• 8934592 David Thompson. Crimes: HALO; Chromium; Laundering
• 0235098 Kenneth Moreland. Crimes: Rendering (Serial, Parallel, Volumetric); Public Indecency

VTK Info

As of the current release (VTK 4.2) there are approximately:
• 107 classes that read and write data.
• 317 classes that filter datasets (isosurface, streamlines, etc.).
• 145 classes that “map” the dataset into an image.
• 50 classes that support parallel processing.

Distributed Parallelism

Parallel Processing with VTK

Data Size

• Visible Woman CT data: 870 MBytes, 1734 slices at 512x512x2
• Bell-Boeing V-22 tiltrotor: 140 GBytes
  Flow computation: Robert Meakin
  Visualization: David Kenwright and David Lane
  Numerical Aerodynamic Simulation Division at NASA Ames Research Center

Other Large Data

Modeling turbulence (Ken Jansen, RPI)
• 8.5 million tetrahedra, 200 time steps
• 150 million tetrahedra, 2000 time steps (soon)

VTK Weaknesses

• Parallel data distribution (inefficient, poor load balancing, etc.)
• Rendering (serial and parallel) (really not good all the way around)

[Diagram: four parallel pipelines, each running Data Readers (not smart), then D3, then the Full VTK Pipeline, then an Optimized Renderer, then ICE-T / Chromium.]

D3 is the backbone of our parallel VTK architecture.

ICE-T and Chromium based parallel VTK rendering.

Software

Load Balancing
• D3 (Distributed Data Decomposition)

Scalable Rendering
• Image Compositing Engine for Tiles (ICE-T)

Unstructured Volume Rendering
• GPU Accelerated Tetrahedral Rendering (GAToR)
• Upcoming parallel work

Other Techniques
• Higher order elements
• Real time shadows

Load Balancing

D3 (Distributed Data Decomposition)

Spatial decomposition based on K-d tree

Spatial regions contain approximately equal number of mesh elements.

Fast execution for tree queries.

Axis aligned nature of boundaries can accelerate processing for some visualization algorithms.
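
As a rough illustration of the decomposition described above, the sketch below splits cell centroids into 2^depth axis-aligned regions by cutting at the median along one axis per level. It is a serial, pure-Python sketch and not the D3 implementation: the name `kd_regions`, the choice of the widest axis for each cut, and the random test points are all assumptions made for the example.

```python
import random

def kd_regions(centroids, depth):
    """Split points into 2**depth axis-aligned regions of near-equal size.

    Hypothetical helper; assumes len(centroids) >= 2**depth.
    Returns a list of index lists, one per leaf region of the k-d tree.
    """
    def split(idx, level):
        if level == depth:
            return [idx]
        # Assumption: cut along the axis with the largest extent.
        extents = [
            max(centroids[i][a] for i in idx) - min(centroids[i][a] for i in idx)
            for a in range(3)
        ]
        axis = extents.index(max(extents))
        # A full sort keeps the sketch short; a selection algorithm would avoid it.
        ordered = sorted(idx, key=lambda i: centroids[i][axis])
        half = len(ordered) // 2
        return split(ordered[:half], level + 1) + split(ordered[half:], level + 1)

    return split(list(range(len(centroids))), 0)

random.seed(0)
points = [(random.random(), random.random(), random.random()) for _ in range(1000)]
regions = kd_regions(points, depth=3)
print([len(r) for r in regions])
```

With 1000 points and depth 3 this yields eight regions of 125 points each, which is the "approximately equal number of mesh elements" property in miniature.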

Load Balancing Challenges

Sorting the data is computationally prohibitive.

Use median finding algorithm (Select).

Parallel implementation of Select is straightforward, although a scalable implementation took some work.

Parallel K-d tree build: Sub-groups of processors build sub-trees for sub-regions of the volume.

Bookkeeping (VTK data structures, attributes, ghost cells, etc).
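
To make the Select idea concrete, here is a minimal sketch of selection across distributed data, simulated in one process. Each inner list stands in for one processor's local coordinates; only a pivot and a pair of counts cross "processor" boundaries per round, which is what a reduction would carry on a real cluster. The function name `parallel_select` and the random-pivot strategy are assumptions, not the scalable scheme the authors built.

```python
import random

def parallel_select(chunks, k):
    """Return the k-th smallest value (0-based) across all chunks.

    Hypothetical sketch: no chunk is ever sorted or gathered; each round
    shares one pivot and sums per-chunk counts, as a reduction would.
    """
    active = [list(c) for c in chunks]
    while True:
        # Pick a pivot from any chunk that still holds candidates.
        pivot = random.choice(random.choice([c for c in active if c]))
        # Local counts, summed across chunks (the simulated reduction).
        less = sum(sum(1 for v in c if v < pivot) for c in active)
        equal = sum(sum(1 for v in c if v == pivot) for c in active)
        if k < less:
            active = [[v for v in c if v < pivot] for c in active]
        elif k < less + equal:
            return pivot
        else:
            active = [[v for v in c if v > pivot] for c in active]
            k -= less + equal

values = list(range(100))
random.seed(1)
random.shuffle(values)
chunks = [values[i::4] for i in range(4)]  # four simulated processors
median = parallel_select(chunks, 50)
```

Every round discards at least the pivot itself, so the loop terminates; the expected number of rounds grows only logarithmically with the data size.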

D3 Features

K-d Tree Build
• Depth of k-d tree
• Ask for shafts, slabs, or slices instead of blocks

Distribution
• Assign spatially contiguous regions. (Normal mode)
• Assign regions in a round robin fashion.
• Assign regions to minimize data movement. *
• Assign regions to processors according to a user supplied mapping.
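
The distribution modes listed above can be sketched as a mapping from region id to processor rank. This is only an illustration: `assign_regions` and its mode names are hypothetical, the contiguous mode assumes that consecutive region ids are neighbouring leaves of the k-d tree, and the "minimize data movement" mode is omitted because the slides do not describe how it works.

```python
def assign_regions(num_regions, num_procs, mode="contiguous", mapping=None):
    """Return a list giving the processor rank that owns each region.

    Hypothetical sketch of three of the distribution modes.
    """
    if mode == "contiguous":
        # Blocks of consecutive region ids go to the same processor.
        return [r * num_procs // num_regions for r in range(num_regions)]
    if mode == "round_robin":
        # Neighbouring regions land on different processors.
        return [r % num_procs for r in range(num_regions)]
    if mode == "user":
        # Caller supplies the region-to-processor mapping directly.
        return [mapping[r] for r in range(num_regions)]
    raise ValueError(f"unknown mode: {mode}")

contiguous = assign_regions(8, 4)
round_robin = assign_regions(8, 4, mode="round_robin")
```

Contiguous assignment keeps each processor's data spatially compact, while round robin spreads every part of the volume over all processors, which may even out the work when only part of the model is in view.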

D3 Features

Output Options
• vtkUnstructuredGrid (maybe others later)
• Ghost level: include ghost cells bordering spatial region.
• Omit duplicate points (reading 500 disk files).

Input Options (M x N)
• M > N: 512 files with 16 visualization nodes (typical).
• M < N: 16 files and 128 vis nodes (more tricky).
• Out of core option? Perhaps our ongoing VTK work with OGI (Claudio Silva) will provide that functionality.

D3 Pictures

[Figure] Dataset info: 241,600 cells, 40 files. Contiguous spatial regions assigned to processors.

D3 Pictures

[Figure] Round robin region assignment.

Scalable Rendering Technical Challenges

Use of Commodity Components
• Desktop PCs
• Gamers’ graphics cards

Graphics Cluster Throughput
• Load balancing
• Effective use of aggregate performance
• Networking issues

Tiled Display
• Driving virtual display from PC cluster

Scalable Rendering Metrics

Focus is on large data.

Algorithms must adhere to the following criteria:
• Excellent load balancing.
• Scale to large number of nodes.
• Insensitive to data size (fixed overhead).

Willing to trade off frame rates to enable large data rendering.

Image Composition for Tiles ?!?

Yes, it’s crazy… but is it crazy enough?

Questions:
• Why use image compositing for a 62 million pixel display?
• Wouldn’t sort-first be the obvious choice?

Constraints:
• Each graphics adapter renders a 1280x1024 image.
• Target data is huge, so:
  • Data must remain stationary (no network per frame).
  • No data replication.
  • 1/N of the data on N computers.
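
Under these constraints each node renders only its own 1/N of the geometry, and the partial images are merged afterwards by depth. The sketch below shows that merge in its simplest form; it is not ICE-T itself, and the image representation (a flat list of `(depth, color)` pairs, with infinite depth for background) and the names `composite` and `composite_all` are assumptions for illustration.

```python
FAR = float("inf")  # depth of an empty (background) pixel

def composite(image_a, image_b):
    """Merge two partial renderings, keeping the nearer fragment per pixel.

    Each image is a list of (depth, color) pairs covering the same pixels.
    """
    return [a if a[0] <= b[0] else b for a, b in zip(image_a, image_b)]

def composite_all(images):
    """Fold the images from every node into one final image."""
    result = images[0]
    for image in images[1:]:
        result = composite(result, image)
    return result

# Two simulated nodes, each having rendered its own share of the data.
node0 = [(0.3, "red"), (FAR, None), (0.9, "red")]
node1 = [(0.5, "blue"), (0.2, "blue"), (FAR, None)]
final = composite_all([node0, node1])
```

The amount of data exchanged depends on the image size rather than on the geometry size, which is why the geometry can stay where it is.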

Data Transfer vs. Image Transfer

[Plot: network usage per frame (MB, 0 to 1000) against input geometry size (GB, 0 to 12) for Sort-First, Sort-First (with 90% cache), and ICE-T. Annotations mark the cross-over point and ASCI-sized data.]

Virtual Trees Strategy

Performance Boosters

• Inherently ‘geometry’ load balanced.
• Composition responsibilities can be automatically adjusted based on the ‘load’ of the node.

Active Pixel Encoding
• Fast encoding: three operations per pixel.
• Free decoding. Faster depth compare.
• Effective compression: encoded 1/5 full image at beginning.
• Good worst case behavior: encoded image can only grow a few bytes.
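
The slides do not spell out the encoding format, so the following is only a guess at the general idea: skip runs of background (inactive) pixels and store active pixels verbatim. The run layout, the background depth of 1.0, and the names `encode_active` and `decode_active` are all assumptions.

```python
def encode_active(depths, background=1.0):
    """Encode depth values as (inactive_run_length, active_values) pairs.

    Hypothetical format: background pixels are counted, not stored.
    """
    runs = []
    i, n = 0, len(depths)
    while i < n:
        start = i
        while i < n and depths[i] == background:
            i += 1
        inactive = i - start
        start = i
        while i < n and depths[i] != background:
            i += 1
        runs.append((inactive, depths[start:i]))
    return runs

def decode_active(runs, background=1.0):
    """Rebuild the full depth buffer from the encoded runs."""
    out = []
    for inactive, active in runs:
        out.extend([background] * inactive)
        out.extend(active)
    return out

row = [1.0, 1.0, 1.0, 0.5, 0.25, 1.0, 1.0]
encoded = encode_active(row)
```

In a scheme like this, an image that is mostly background shrinks to a handful of run counts, while a fully covered image gains only the small fixed cost of its run headers.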

ICE-T Results

• Good load balancing characteristics
• Performs well on large datasets
• Excellent scalability

VTK rendering results
• Over 1 billion triangles/sec for desktop delivery (128 node cluster).
• 60 million triangles/sec on 60 Mpixel display (64 node cluster).

“Scalable Rendering on PC Clusters,” Computer Graphics and Applications, Large Data Visualization Issue, July/Aug 2001.
“Sort-Last Tiled Rendering for Viewing Extremely Large Datasets on Tiled Displays,” IEEE Parallel and Large Data Visualization and Graphics, San Diego, CA, 2001.

Leveraging GPUs

Why implement algorithms on GPUs?
• GPU performance increases have consistently outpaced Moore’s Law.
• GPUs are cheap.
• Balance computations between the CPU and GPU.
• Marketing numbers on GeForce 4 claim 1.2 TeraOp/sec.

Leveraging GPUs

Non-visualization algorithms on GPUs?
• Using the GPU as a ‘co-processor’.
• BLAS library: Dense Dense Matrix Multiply (DGEMM).
• FFT calculations

“FFT on a GPU,” SIGGRAPH/Eurographics Workshop on Graphics Hardware 2003.

GAToR: GPU Accelerated Tetrahedral Renderer

Implements the ‘Projected Tetrahedra’ algorithm (Shirley & Tuchman) within the microcode of an NVidia GPU.

Moves all of the following functions from the CPU to the GPU:
• Transform to screen space.
• Determine projection class.
• Calculate thick vertex location.
• Determine depth at thick vertex.
• Compute color and opacity for thick vertex.
• Apply exponential attenuation texture.
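
The last two steps rest on the exponential attenuation model used with projected tetrahedra, in which a ray crossing thickness d of material with extinction coefficient tau accumulates opacity 1 - exp(-tau * d). The sketch below evaluates that directly and through a precomputed table standing in for a 1D texture. The table size, the clamp range, and the function names are assumptions; the actual GAToR shader code is not shown in the slides.

```python
import math

def opacity(extinction, thickness):
    """Opacity accumulated along a ray through uniform material."""
    return 1.0 - math.exp(-extinction * thickness)

def build_attenuation_table(size=256, max_optical_depth=8.0):
    """Precompute opacity over optical depth, like a 1D lookup texture."""
    step = max_optical_depth / (size - 1)
    return [1.0 - math.exp(-i * step) for i in range(size)]

def lookup(table, optical_depth, max_optical_depth=8.0):
    """Nearest-entry lookup, clamped to the table range."""
    clamped = min(max(optical_depth, 0.0), max_optical_depth)
    index = round(clamped / max_optical_depth * (len(table) - 1))
    return table[index]

table = build_attenuation_table()
direct = opacity(1.0, 1.0)         # tau = 1, thickness at the thick vertex = 1
from_table = lookup(table, 1.0)    # same optical depth via the table
```

A table lookup replaces the exponential with a single texture fetch, which suits hardware that cannot evaluate `exp` cheaply per vertex or per pixel.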

GPU: Computational Resource

[Diagram: a model with millions of cells goes through a visibility sort on the PC (CPU). Then, for each cell in order: compute the cell’s screen projection, decompose to triangles, find the thickest cell distance, and compute each triangle’s parameters. The graphics card (software-programmable hardware) produces the final image of the model. A legend marks the CPU and GPU contribution per cell.]

Vertex Program Constraints

Each instance of a vertex shader program works independently on a single vertex in SIMD fashion.

No support for dynamic vertex creation or topology modification within the vertex program.

No branching (at the time; now supported).

No knowledge of neighboring vertices.

Cannot change execution based on past information.

Constraints seem insurmountable but we devised some very clever workarounds!

Clever constraint work-around

Programmable vertex shaders do not support dynamic vertex creation or topology modification within the vertex program.

[Figure: basis graph with vertices V0’ to V4’, isomorphic to all projection cases.]

Isomorphic Property of Basis Graph


Meticulous Code Review

Just Kidding!

This shows what the code looks like.

Results: Test Datasets

Dataset Info

Dataset       Vertices   Tetrahedra
Blunt Fin       40,960      187,395
Oxygen Post    109,744      513,375
Delta Wing     211,680    1,005,675

Timings

Dataset       GPU time (Constant)   Tets/s   GPU time (Linear)   Tets/s
Blunt Fin     0.20 sec              937 K    0.38 sec            493 K
Oxygen Post   0.55 sec              933 K    1.04 sec            493 K
Delta Wing    1.07 sec              940 K    2.03 sec            495 K
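
The Tets/s columns follow from dividing the tetrahedron count by the GPU time. The short check below redoes that arithmetic from the figures on the slide; the dictionary layout and the name `tets_per_second` are only for this example, and small differences are expected because the timings are rounded to two decimals.

```python
# (tetrahedra, constant time s, constant K tets/s, linear time s, linear K tets/s)
datasets = {
    "Blunt Fin": (187_395, 0.20, 937, 0.38, 493),
    "Oxygen Post": (513_375, 0.55, 933, 1.04, 493),
    "Delta Wing": (1_005_675, 1.07, 940, 2.03, 495),
}

def tets_per_second(tetrahedra, seconds):
    """Throughput in tetrahedra per second."""
    return tetrahedra / seconds

for name, (tets, t_const, k_const, t_lin, k_lin) in datasets.items():
    rate_const = tets_per_second(tets, t_const) / 1000
    rate_lin = tets_per_second(tets, t_lin) / 1000
    print(f"{name}: {rate_const:.0f} K vs {k_const} K, {rate_lin:.0f} K vs {k_lin} K")
```

The rate stays nearly flat as the dataset grows fivefold, so render time scales about linearly with the number of tetrahedra.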

Gratuitous Pictures

“Tetrahedral Projection using Vertex Shaders,” IEEE Volume Visualization, Boston, Massachusetts, 2002.

Parallel Volume Rendering

• Ongoing work with OGI (Oregon Graduate Institute).
• Approach: Leverage distributed data techniques for structured data and apply to unstructured data.
• Structured data can be partitioned into convex sub domains.

Higher Order Elements

[Figure: typical approximation vs. correct rendering.]

The Interpolant

Parameters (r,s,t) specify field as a weighted sum of nodal values:

$$\phi(r,s,t) = \sum_{i=1}^{n} N_i(r,s,t)\,\phi_i$$

Shape functions, N_i, are tensor products of Lagrange interpolants (for our example):

$$N_i(r,s,t) = M_1(r)\,M_2(s)\,M_3(t)$$

$$M_j(u) = \begin{cases} (u-1)(2u-1) & \text{near axis-origin node} \\ 4u(1-u) & \text{midnode} \\ u(2u-1) & \text{far node} \end{cases}$$
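
To see the interpolant at work, the sketch below evaluates the three one-dimensional quadratic Lagrange interpolants on [0, 1] and forms their tensor products. It assumes a 27-node hexahedral element with nodes at 0, 1/2, and 1 along each axis, which the slides do not state explicitly; the function names and node ordering are likewise assumptions.

```python
import itertools

def lagrange_1d(u):
    """Quadratic Lagrange interpolants for nodes at u = 0, 1/2, 1."""
    return (
        (u - 1.0) * (2.0 * u - 1.0),  # near axis-origin node
        4.0 * u * (1.0 - u),          # midnode
        u * (2.0 * u - 1.0),          # far node
    )

def shape_functions(r, s, t):
    """All 27 tensor-product shape functions N_i(r, s, t)."""
    return [
        mr * ms * mt
        for mr, ms, mt in itertools.product(
            lagrange_1d(r), lagrange_1d(s), lagrange_1d(t)
        )
    ]

def interpolate(nodal_values, r, s, t):
    """Field value as the weighted sum of nodal values."""
    return sum(n * phi for n, phi in zip(shape_functions(r, s, t), nodal_values))

# Nodes in the same order as shape_functions; sample a field equal to r.
nodes = list(itertools.product((0.0, 0.5, 1.0), repeat=3))
values = [r for r, s, t in nodes]
sample = interpolate(values, 0.3, 0.7, 0.2)
```

Two quick sanity checks follow from the definitions: the shape functions sum to one at any point, and a field that is linear in r is reproduced exactly.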

Rendering The Geometry

• OpenGL tessellators: overkill for small (screen-space) elements
• Adaptive triangulation: fast (Velho et al., Chung et al.)

Results

[Figures: unlit and lit renderings.]

“Rendering Higher Order Finite Element Surfaces in Hardware,” Proceedings of GRAPHITE, Melbourne, Australia, February 2003.

Advanced rendering for Sci Vis

Sandia Project Percept
Utilize GPUs to provide realistic perceptual cues to general scientific visualizations. Perceptual cues include shadows, reflections, refractions, depth perception, and realistic lighting models.

Phase 1: vtkShadowRenderer is an easy to use module that provides real time shadows for vtk applications.

VTK Constraints

Shadow techniques are focused on games
• Often preprocess the geometry
• No self shadowing
• Small amounts of geometry

Applying shadows to scientific vis (VTK)
• Needs to work with VTK
• No access to geometry
• Needs to be small ‘footprint’
• Must have ‘self’ shadow
• Must work on large amounts of geometry
• Shadow mapping is the best option for these constraints

Demo
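
Shadow mapping fits these constraints because it needs only rendered depth, not the geometry itself. The sketch below shows the two passes on a one-dimensional "light view"; it is a simplified illustration rather than the vtkShadowRenderer code, and the fragment representation, the bias value, and the function names are assumptions.

```python
def build_shadow_map(fragments, width):
    """Pass 1: keep the nearest depth seen from the light for each texel.

    `fragments` is a list of (texel, depth_from_light) pairs.
    """
    depth_map = [float("inf")] * width
    for texel, depth in fragments:
        depth_map[texel] = min(depth_map[texel], depth)
    return depth_map

def in_shadow(depth_map, texel, depth, bias=1e-3):
    """Pass 2: a fragment is shadowed if something nearer covers its texel."""
    return depth > depth_map[texel] + bias

# Two surfaces overlap at texel 0; the farther one ends up in shadow.
fragments = [(0, 2.0), (0, 5.0), (1, 3.0)]
shadow_map = build_shadow_map(fragments, width=3)
shadowed = [in_shadow(shadow_map, texel, depth) for texel, depth in fragments]
```

Because the test compares depths only, a surface can shadow another part of the same object, so self shadowing comes for free.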

Pictures

Results

• Works with unmodified VTK
• Really simple to add to a VTK app:

    // vtkRenderer *ren = vtkRenderer::New();
    vtkShadowRenderer *ren = vtkShadowRenderer::New();

• Next release of ParaView will include shadow option
• Performance
  • 4 datasets tested, avg 1.7x slowdown
  • How can this be less than 2x???
    • Only writing to depth buffer
    • No lighting
  • Using glCopyTexSubImage2D(GL_TEXTURE_RECTANGLE_NV, …);
  • Very fast! (cow at 30 Hz no problem)

Performance Sidebar

Overclockers on GPUs

Overclockers on CPUs

Overclock Yourself

Recent Publications (Recap)

“Scalable Rendering on PC Clusters,” Computer Graphics and Applications, Large Data Visualization Issue, July/Aug 2001.

“Sort-Last Tiled Rendering for Viewing Extremely Large Datasets on Tiled Displays,” IEEE Parallel and Large Data Visualization and Graphics, San Diego, CA, 2001.

“Tetrahedral Projection using Vertex Shaders,” IEEE Volume Visualization, Boston, Massachusetts, 2002.

“Rendering Higher Order Finite Element Surfaces in Hardware,” Proceedings of GRAPHITE, Melbourne, Australia, Feb 2003.

“Cluster to Wall with VTK,” Parallel and Large Data Volume Graphics, 2003.

“FFT on a GPU,” SIGGRAPH/Eurographics Workshop on Graphics Hardware 2003.

END