April 4-7, 2016 | Silicon Valley
www.esi-group.com
Andreas Mank
Team Leader Visualization, ESI Group
Markus Tavenrath
Senior Developer Technology Engineer, NVIDIA
04/04/2016
ESI RENDERING INNOVATIONS WITH NVIDIA DESIGNWORKS™
2
“a pioneer and world-leading provider in Virtual Prototyping”
— www.esi-group.com/
3
Virtual Engineering
Dynamic scenes
Intuitive interaction
Immersion
Reliable behavior
High quality
Anytime and everywhere
Distributed hardware
On demand
4
HELIOS BY
High Performance Rendering
Remote Rendering
Interactive Ray Tracing
Physically-Based Rendering
Abstract Material Definition
Hybrid Rendering
5
“Tools and technologies for
professional graphics and advanced
rendering applications.”
https://developer.nvidia.com/designworks
6
DESIGNWORKS™: Tools and Technologies in Virtual Engineering
MDL SDK / VMATERIALS
7
DEMO: MATERIAL CONSISTENCY WITH MDL
RAY TRACER | RASTERIZER
8
DESIGNWORKS™: Tools and Technologies in Virtual Engineering
MDL SDK / VMATERIALS
OPTIX
9
GLOBAL ILLUMINATION WITH OPTIX
10
DEMO: INTERACTIVE GI WITH VCAS
11
DESIGNWORKS™: Tools and Technologies in Virtual Engineering
MDL SDK / VMATERIALS
OPTIX
GRID SDK
12
PLATFORM AS A SERVICE WITH GRID
13
DESIGNWORKS™: Tools and Technologies in Virtual Engineering
MDL SDK / VMATERIALS
OPTIX
GRID SDK
NVPRO-PIPELINE
14
NVPRO-PIPELINE
Modern OpenGL features
Modern shader features with GLSL
Not CPU-bound with shaders
Not CPU-bound with complex scene graphs
Efficient updates for dynamic geometries
Benefits in Virtual Engineering
15
DEMO: DYNAMIC MESH ANIMATIONS WITH RIX
SceniX: 2 FPS
RiX: 20 FPS
16
PERFORMANCE
[Chart: frames per second, SceniX vs. RiX, for the categories Dynamic, Nodes, Materials, and Fixed; axis range 0-350; data labels 20, 360, 60, 120; speedup annotations x10, x6, x3, x1.1]
17
DESIGNWORKS: NV PRO Pipeline
[Charts: framerate for low-end, mid-range, high-end, and Quadro hardware; the second chart adds CPU load and GPU load]
18
A LOOK INTO THE PAST: SceniX 7 RENDERER DESIGN
SceniX 7 used a dirty bit/render list cache scheme for rendering
[Diagram: scene graph with groups G0, G1, transforms T0-T3, and shapes S0-S2]
M0 M1 material layer
T0 T2 T3 transform layer
S0 S1 S2 geometry layer
-> full rebuild
We had a few cases where a rebuild could be avoided
Incremental updates were still not fast enough
19
A LOOK INTO THE PAST: Renderer bottlenecks
Profiling revealed multiple bottlenecks in the renderer
for (auto& material : materials) {                  // HashMap -> pointer chasing
  if (cam->isVisible(material)) {                   // virtual function call -> pointer chasing
    processMaterial();                              // virtual function call
    for (auto& transform : material.transforms) {   // HashMap
      if (cam->isVisible(transform)) {              // virtual function call
        processTransform();                         // virtual function call
        for (auto& shape : transform.shapes) {      // HashMap
          if (cam->isVisible(shape)) {              // virtual function call (20% of time)
            process(shape);                         // switch(OC) -> branch misprediction
          }
        }
      }
    }
  }
}
20
NV PRO PIPELINE
SceniX 6 -> SceniX 7: up to 6x faster with each iteration when draw-call limited
Still too many bottlenecks in our SceneGraph rendering
Our partners like ESI just needed a fast renderer, not a SceneGraph
SceneGraph -> SceneGraph -> Rendering mostly worked
But it took a lot of resources and wasted CPU time due to the additional layer
A research platform was required to work out how to resolve all those bottlenecks
NV PRO Pipeline was born
Focus on CPU-efficient rendering without any compatibility restrictions
21
NV PRO PIPELINE: RiX::GL
Developers who want to write an OpenGL renderer face one problem:
OpenGL has a million ways to do the same thing; what's the best way?
Parameters
Uniforms, UBOs, SSBOs, bindless
Geometry
immediate mode, display list, VBO/IBO, VAO, VAB, bindless
Combinatorial explosion
22
NV PRO PIPELINE: RiX::GL
How to abstract all the differences in an efficient way?
[Diagram: a group of five objects (S0, T0, M0), (S1, T2, M1), (S2, T3, M1), (S2, T3, M1), (S1, T2', M1); shown on the monitor in the order S2, S1, S1, S2, S0]
render(group of objects)
render(group of objects, order)
RiX API to abstract rendering of groups of objects
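The idea of handing the renderer a whole group of objects, optionally with an explicit order, can be sketched in C++ as follows. This is a minimal illustration and not the actual RiX API: `GeometryInstance`, `RenderGroup`, `Backend`, and `RecordingBackend` are hypothetical names, and the backend only records what it would draw.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one object: indices into shape, transform,
// and material tables (e.g. S1, T2, M1 on the slide).
struct GeometryInstance {
    int shape;
    int transform;
    int material;
};

// A group is a flat list of instances. The application fills it once and
// never issues per-object draw calls itself.
class RenderGroup {
public:
    std::size_t add(const GeometryInstance& gi) {
        instances_.push_back(gi);
        return instances_.size() - 1;
    }
    const std::vector<GeometryInstance>& instances() const { return instances_; }

private:
    std::vector<GeometryInstance> instances_;
};

// The backend hides how parameters and geometry are bound (uniforms, UBOs,
// bindless, ...). Callers only see the two render entry points.
class Backend {
public:
    virtual ~Backend() = default;
    virtual void render(const RenderGroup& group) = 0;
    virtual void render(const RenderGroup& group,
                        const std::vector<std::size_t>& order) = 0;
};

// Illustration-only backend: records the shape index of every "draw".
class RecordingBackend : public Backend {
public:
    void render(const RenderGroup& group) override {
        for (const GeometryInstance& gi : group.instances()) {
            log.push_back(gi.shape);
        }
    }
    void render(const RenderGroup& group,
                const std::vector<std::size_t>& order) override {
        for (std::size_t i : order) {
            assert(i < group.instances().size());
            log.push_back(group.instances()[i].shape);
        }
    }
    std::vector<int> log;
};
```

Because the application only talks to `Backend`, each binding strategy can live in its own subclass without changes to application code.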
23
NV PRO PIPELINE: SceneGraph
SceneGraph [dp::sg]
RiX [dp::rix]
How to get from SceneGraph to group with incremental updates?
[Diagram: scene graph (G0; T0, T1; T2; S1, S2; G1; T3; S0) in which a subgraph is referenced twice, flattened into the group (S0, T0, M0), (S1, T2, M1), (S2, T3, M1), (S2, T3, M1), (S1, T2', M1)]
24
NV PRO PIPELINE: SceneGraph
SceneGraph [dp::sg]
RiX::GL [dp::rix::gl]
SceneTree [dp::sg::xbar]
Renderer [dp::sg::rdr::rix::gl]
[Diagram: the SceneGraph (G0; T0, T1; T2; S1, S2; G1; T3; S0) is unfolded into a SceneTree in which the shared subgraph gets its own copies (T2', S1', S2', G1', T3'); events from the SceneTree are translated into a flat list of (shape, transform) pairs: (S1, T2), (S2, T3), (S0, T0), (S1', T1'), (S2', T2')]
Needs to be done by your application if not using the reference SceneGraph
Events are fully incremental
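Event-driven incremental updates of this kind can be sketched as an observer that touches only the entries named by each event. This is a minimal sketch, not the dp::sg::xbar implementation: `Event`, `SceneTree`, and `FlatRenderList` as written here are hypothetical, simplified stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical event describing a single change in the scene tree.
struct Event {
    enum class Type { ObjectAdded, ObjectRemoved, TransformChanged };
    Type type;
    int objectId;
    int transformId;  // ignored for ObjectRemoved
};

// Subject: forwards every change to the registered observers.
class SceneTree {
public:
    using Observer = std::function<void(const Event&)>;
    void attach(Observer o) { observers_.push_back(std::move(o)); }
    void notify(const Event& e) const {
        for (const Observer& o : observers_) o(e);
    }

private:
    std::vector<Observer> observers_;
};

// Observer: the renderer's flat object list. Each event updates exactly one
// entry; the list is never rebuilt from scratch.
class FlatRenderList {
public:
    void onEvent(const Event& e) {
        switch (e.type) {
        case Event::Type::ObjectAdded:
        case Event::Type::TransformChanged:
            objectToTransform_[e.objectId] = e.transformId;
            break;
        case Event::Type::ObjectRemoved:
            objectToTransform_.erase(e.objectId);
            break;
        }
        ++entriesTouched_;
    }
    int transformOf(int objectId) const {
        assert(objectToTransform_.count(objectId) == 1);
        return objectToTransform_.at(objectId);
    }
    std::size_t size() const { return objectToTransform_.size(); }
    std::size_t entriesTouched() const { return entriesTouched_; }

private:
    std::unordered_map<int, int> objectToTransform_;
    std::size_t entriesTouched_ = 0;
};
```

The work per frame is proportional to the number of events rather than to the size of the scene, which is the property the slide calls fully incremental.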
25
NV PRO PIPELINE: Material System
Basic pipeline ready
SceneGraph -> group of objects -> RiX
Next step: Support for GLSL
Problem: uniforms, UBOs, SSBOs, and different GLSL versions all required a different shader header
We needed a material system independent of the SceneGraph
26
NV PRO PIPELINE: Material System
Material system [dp::fx] was born
Interface allows enumeration of
materials (shader pipelines) and their corresponding sets of parameter groups
Allows multiple backends in parallel
XML (public), MDL (on request)
Material system can generate shaders for all parameter techniques
Uniforms, UBOs, SSBOs -> write the shader only once
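How one parameter description can yield a shader header per technique is sketched below. This is an illustrative assumption, not the dp::fx code generator: `ParameterTechnique`, `Parameter`, `ParameterGroup`, and `generateHeader` are hypothetical names, and the emitted GLSL is deliberately minimal.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical enumeration of the parameter techniques named on the slide.
enum class ParameterTechnique { Uniform, UBO, SSBO };

struct Parameter {
    std::string type;  // GLSL type, e.g. "vec4"
    std::string name;  // GLSL identifier, e.g. "diffuse"
};

struct ParameterGroup {
    std::string name;  // used as the block name for UBO/SSBO
    std::vector<Parameter> params;
};

// Emits the GLSL declarations for one parameter group. The shader body can
// refer to the parameter names directly in all three variants, so it has to
// be written only once.
std::string generateHeader(const ParameterGroup& g, ParameterTechnique t) {
    assert(!g.name.empty());
    std::string out;
    if (t == ParameterTechnique::Uniform) {
        for (const Parameter& p : g.params) {
            out += "uniform " + p.type + " " + p.name + ";\n";
        }
        return out;
    }
    // UBO and SSBO differ only in layout and storage qualifier here.
    out += (t == ParameterTechnique::UBO)
               ? "layout(std140) uniform " + g.name + " {\n"
               : "layout(std430) buffer " + g.name + " {\n";
    for (const Parameter& p : g.params) {
        out += "  " + p.type + " " + p.name + ";\n";
    }
    out += "};\n";
    return out;
}
```

For a group named `Material` with a `vec4 diffuse` member, the same fragment shader body that reads `diffuse` compiles against any of the three generated headers, assuming a GLSL version that supports the chosen block type.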
27
NV PRO PIPELINE: Results
Efficient pipeline with another ~6x speedup over SceniX 7 for draw-call-limited scenes
Achieves 6-7 million draw calls/s on a 2.4 GHz system when using bindless
Started work on new features
Frustum culling
TransformTree extraction from SceneTree
SceneGraph [dp::sg]
RiX::GL [dp::rix::gl]
SceneTree [dp::sg::xbar]
Renderer [dp::sg::rdr::rix::gl]
28
NV PRO PIPELINE: Frustum Culling
Frustum culling is important to reduce the number of draw calls per frame
Don't render hidden objects
NV PRO Pipeline has an efficient frustum culling system (10k objects get culled in ~100 µs)
It works on groups and returns the delta since the last call
-> don't process unchanged data
[dp::culling] is the module
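Culling a group and reporting only the delta since the last call can be sketched as follows. This is a minimal CPU sketch using bounding spheres, not the dp::culling implementation: `Plane`, `Sphere`, `CullingDelta`, and `CullingGroup` are hypothetical, and every object is assumed to start out hidden.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Plane in the form n.p + d >= 0 for points inside the frustum.
struct Plane { float nx, ny, nz, d; };

// Bounding sphere of one object.
struct Sphere { float x, y, z, r; };

// Only the objects whose visibility changed since the previous cull() call.
struct CullingDelta {
    std::vector<std::size_t> becameVisible;
    std::vector<std::size_t> becameHidden;
};

class CullingGroup {
public:
    std::size_t add(const Sphere& s) {
        assert(s.r >= 0.0f);
        bounds_.push_back(s);
        visible_.push_back(false);  // assumption: objects start hidden
        return bounds_.size() - 1;
    }

    // Tests every object against the six frustum planes and returns the
    // changes relative to the stored visibility state.
    CullingDelta cull(const std::array<Plane, 6>& frustum) {
        CullingDelta delta;
        for (std::size_t i = 0; i < bounds_.size(); ++i) {
            const bool now = isVisible(bounds_[i], frustum);
            if (now != visible_[i]) {
                (now ? delta.becameVisible : delta.becameHidden).push_back(i);
                visible_[i] = now;
            }
        }
        return delta;
    }

private:
    static bool isVisible(const Sphere& s, const std::array<Plane, 6>& frustum) {
        for (const Plane& p : frustum) {
            const float dist = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
            if (dist < -s.r) return false;  // completely behind this plane
        }
        return true;
    }

    std::vector<Sphere> bounds_;
    std::vector<bool> visible_;
};
```

A renderer consuming the delta only has to add or remove the listed objects from its draw list; when the camera does not move, both lists are empty and nothing downstream needs to be processed.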
29
NV PRO PIPELINE: Transform Tree
TransformTree is responsible for computing the world transform of each object
Currently tightly bound to xbar, which translates from SceneGraph to Renderer
Working on TransformTree as an independent module
Currently ~15M transforms/s on CPU and up to 300M transforms/s on GPU
For more information, visit my talk:
S6131 - Nvpro-Pipeline: Handling Massive Transform Updates in a SceneGraph
Tuesday, 14:30 – 14:55
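Computing world transforms from a tree of local transforms can be sketched with a flat array in which every parent is stored before its children. This is an illustrative CPU version under that assumption, not the TransformTree from the pipeline: `Mat4`, `TransformTree`, `kNoParent`, and the helper functions are hypothetical, and every node is recomputed on each pass.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Mat4 = std::array<float, 16>;  // row-major, column-vector convention

constexpr std::size_t kNoParent = static_cast<std::size_t>(-1);

Mat4 identity() {
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    return m;
}

Mat4 translation(float x, float y, float z) {
    Mat4 m = identity();
    m[3] = x; m[7] = y; m[11] = z;
    return m;
}

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i * 4 + j] += a[i * 4 + k] * b[k * 4 + j];
    return r;
}

class TransformTree {
public:
    // Parents must be added before their children, so a single linear pass
    // always sees the parent's world matrix first.
    std::size_t add(std::size_t parent, const Mat4& local) {
        assert(parent == kNoParent || parent < local_.size());
        parent_.push_back(parent);
        local_.push_back(local);
        world_.push_back(identity());
        return local_.size() - 1;
    }
    void setLocal(std::size_t i, const Mat4& m) { local_[i] = m; }

    // world = parentWorld * local for every node, in storage order.
    void computeWorld() {
        for (std::size_t i = 0; i < local_.size(); ++i) {
            world_[i] = (parent_[i] == kNoParent)
                            ? local_[i]
                            : multiply(world_[parent_[i]], local_[i]);
        }
    }
    const Mat4& world(std::size_t i) const { return world_[i]; }

private:
    std::vector<std::size_t> parent_;
    std::vector<Mat4> local_;
    std::vector<Mat4> world_;
};
```

The flat layout avoids pointer chasing during the update pass. A production version would presumably track dirty nodes and skip unchanged subtrees, which this sketch does not attempt.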
30
NV PRO PIPELINE: Results
NV PRO PIPELINE is our open source research rendering pipeline; it's not a product
Demonstrates techniques to reduce the CPU cost of rendering
Shows that big speedups are possible when leaving traditional SceneGraph traversal behind
ESI is proof that the concepts work in real-world applications
Working on modularization so that even more modules can be used in other projects
Interested? Grab your copy here:
https://developer.nvidia.com/nvidia-pro-pipeline
THANK YOU
https://developer.nvidia.com/nvidia-pro-pipeline
32
HELIOS - CURRENTLY
[Architecture diagram: HELIOS, ICIDO, RiX::GL, Transform, Viewer, Back-Ends (RASTERIZER, RAY TRACER), VCA, OptiX, Culling, Multi-Cast]
33
HELIOS - WHAT'S NEXT?
[Architecture diagram: HELIOS, ICIDO, VRify, MDL SDK, COMPOSITER, GRID SDK, dp::fx, OPENGL, OPTIX, VULKAN]