Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
4/30/2007
1
Geometry
Real-Time Graphics Architecture
Kurt Akeley
Pat Hanrahan
http://graphics.stanford.edu/courses/cs448-07-spring/
Outline
Vertex and primitive operations
SGI implementations
Reyes architecture
Primitive generation (geometry shader)
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
4/30/2007
2
Readings
Required
High-Performance Polygon Rendering, Akeley and Jermoluk Proceedings of SIGGRAPH ’88and Jermoluk, Proceedings of SIGGRAPH 88.
RealityEngine Graphics, Akeley, Proceedings of SIGGRAPH ‘93.
InfiniteReality: A Real-Time Graphics System, Montrym, Baum, Dignam, and Migdal, Proceedings of SIGGRAPH ’97.
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Recommended
OpenGL Specification
Direct3D 10 Specification
Modern Graphics Pipeline
Application
d
Geometry
Rasterization
TextureAlready covered
Today’s lecture
Command
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Fragment
Display
Already covered
4/30/2007
3
Geometry Processing
Two types of operations
Vertex operationsd d lOperate on individual vertexes
Primitive operationsOperate on all the vertexes of a primitive
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Engineering Challenges
Correctness
Invariance
Consistency (e.g. frame to frame)
Parallelism
Dynamic load balancing due to variable amount of work (rasterization, clipping, generation, …)
Ordering constraints in the graphics pipeline
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Managing state
4/30/2007
4
Vertex Stage
Vertex Operations
Transform coordinates and normal
Model world (not rigid)
World eye (rigid)
Normalize the length of the normal
Compute vertex lighting
Transform texture coordinates
Generate if so specified
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Generate if so specified
Transform coordinates to clip coordinates (projection)
Divide coordinates by w
Apply affine viewport transform (x, y, and z)
4/30/2007
5
Primitive Stage
Primitive Operations
Primitive assembly
Clipping
Backface cull
Transform coordinates and normal
Model world
World eyeBackface cullNormalize the length of the normal
Compute vertex lighting
Transform texture coordinates
Transform to clip coordinates
Assemble vertexes into primitives
Clip primitives against frustum
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Divide by w
Apply affine viewport transform
Eliminate back-facing triangles
4/30/2007
6
Primitive Assembly
Assemble based on application commands
Independent or strip or mesh or ….
Decompose polygons/quads into triangles
Prior to clipping to maintain invariance
Algorithm properties
Fixed execution time (good)All vertex operations up to this point have this property
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
p p p p p y
Independent vertexes become independent primitives (bad)
Clipping
Two types
Point, line: eliminates geometry
Polygon: eliminates and introduces edges
Line Segments Polygon
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
4/30/2007
7
Clipping
Two types
Point, line: eliminates geometry
Polygon: eliminates and introduces edgesPolygon: eliminates and introduces edges
Invariance requirements
Pre-decomposition to triangles
Canonical order of vertexes for edge calc.
Algorithm properties
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Vertex interdependencies (bad)
Data-dependent execution (worse)Variable execution time (substantially different)
Variable code paths
Backface Cull
Facet facing toward or away from viewpoint?
No facet normal (other APIs?)
Use sign of primitive’s window coordinate “area”Use sign of primitive s window coordinate areaRemember, only triangles are planar
Use facing direction to
Select lighting result (for n or –n)
Potentially discard the primitive
Move earlier to avoid work?
x2,y2
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
(x0y1 - x1y0) + (x1y2 - x2y1) + (x2y0 - x0y2)
2Triangle area =
x0,y0 x1,y1
4/30/2007
8
SGI Generations
Engineering Challenges
Correctness
Invariance
Consistency
Parallelism
Dynamic load balancing
Ordering constraints in the graphics pipeline
CS448 Lecture 6 Kurt Akeley, Pat Hanrahan, Spring 2007
4/30/2007
9
Some Examples
Systems
Clark Geometry Engine (1983)
Silicon Graphics GTX (1988)
Silicon Graphics RealityEngine (1992)
Silicon Graphics InfiniteReality (1996)
Modern GPU (2001)
What we’ll look at
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
What we ll look at
Organization of the geometry system
Distribution of vertex and primitive operations
How clipping affects the implementation
Clark Geometry Engine (1983)
Simple, fixed-function pipeline
Twelve identical enginesCoordinate
Transform
Implemented dot4
Soft-configured at start-up
Clipping allocated ½ of total ‘power’
Performance invariant (good)
Typically idle (bad)
6-plane
Frustum
Clipping
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Typically idle (bad)
Divide by w
Viewport
4/30/2007
10
GTX Geometry Engine (1988)
Variable functionality pipeline
5 identical engines
Coord, normal
Transform
Modes alter some functionsLoad balancing is difficult
Clipping allocated 1/5 of ‘power’
Clip testing is always done
Actual clipping is seldom done
Lighting
Clip testing
Clipping state
Divide by w
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
pp gSlow pipeline execution (bad)
out of balance
Typically isn’t invoked (good)
(clipping)
Viewport
Prim. Assy.
Backface cull
RealityEngine Geometry (1992)
Geometry processor
Eight identical engines
Handles strips of length n vertexesCommand
Variable-functionality MIMD organization; Why?Handles clipping
Strips of different lengths
Different types of primitives
Command processor
Splits large strips into n vertex strips
Introduces redundant work (2 extra vertexes)
Command Processor
Round-robin Aggregation
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Introduces redundant work (2 extra vertexes)
Good ‘static’ load balancing
Shadows per-vertex attribute state
Sends all attributes with each vertex
Broadcasts other state
Aggregation
4/30/2007
11
RealityEngine Clipping
Clipping introduces data-dependent (dynamic) load
Cannot be predicted by CPCommand
Dynamic load balance accomplished by:
Round-robin assignment
Large input and output FIFOsFor each geometry processor
Sized greater than (n) long/typical
Command Processor
Round-robin Aggregation
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Large work load per processorMinimize the long/typical ratio
Unlike pipeline processing
Aggregation
InfiniteReality Geometry (1996)
Geometry processor
Variable-functionality MIMD organization
Four identical (SIMD) enginesCommand
Handles strips of length n
Command processor
Splits large strips of primitives
Least-busy work assignment
Good ‘static’ load balancing
Shadows per-vertex attribute state
Command Processor
Sequence Token Aggregation
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
p
Sends all attributes with each vertex
Broadcasts other state
Aggregation
Reorder output using sequence tokens
Aggregation
4/30/2007
12
InfiniteReality Clipping
Dynamic load balance accomplished by:
Least-busy assignmentCommand
Even larger FIFOs
Large work load per processor
Sequence token to reorder
Likelihood of clipping reduced by
Command Processor
Sequence Token Aggregation
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Likelihood of clipping reduced by
Guard-band clipping algorithm
Aggregation
Guard-Band Clipping Example
Rendered
Di d d
Viewport
Discarded
Clipped
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Guard-band
4/30/2007
13
Guard-Band Clipping
Expand clipping frustum beyond desired viewport
Near and far clipping are unchanged
Frustum clip only when necessaryFrustum clip only when necessary
Ideal triangle cases:
Discard if outside viewport, else
Render without clipping if inside frustum, elseRasterizer must scissor well for this to be efficient
Cli t i gl
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Clip trianglesThat cross both viewport and frustum boundaries
That cross the near or far clip-planes
Operation is imperfect, but conservative
Modern Geometry Engine
Homogeneous coordinate rasterization algorithm
Modify inside testl dEvaluate 3 edges equations
Evaluate 6 clipping plane equations\
Succeed if inside all
Fixed amount of work per fragment
Clipping doesn’t increase or decrease #triangles
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Triangle scan conversion using 2D homogenous coordinates, M. Olano, T. Greer, GH1997
Studying this algorithm would be a good project
4/30/2007
14
Primitive Generation Stage
REYES ArchitectureD3D 10 Geometry Shader
REYES
Renders Everything You Ever Saw (name by Carpenter)
CS448 Lecture 6 Kurt Akeley, Pat Hanrahan, Spring 2007
The Road to Point Reyes
Directed by R. Cook, LucasFilm 1983
4/30/2007
15
REYES PipelineModel
Bound
C ll?Cull?
Dice?Split
Grids
ShadeTextures
NCook,
Carpenter,
Catmull,
SIGGRAPH 87
CS448 Lecture 6 Kurt Akeley, Pat Hanrahan, Spring 2007
Sample
Visibility
Filter
Image
SIGGRAPH 87
Reyes Architecture Reasoning
Nyquist-sized micro-polygon
¼ pixel area (½ pixel size)
Micro-polygon replaces fragment as the low-level primitive
One geometry/shading/texturing calculation per micropolygon
‘Non-flat’ geometry requires tessellation
Tessellation free compared to shading
Shading before hiding
Hiding=rasterization+z-buffer+csg+transparency+motion+dof
CS448 Lecture 6 Kurt Akeley, Pat Hanrahan, Spring 2007
g g p y
Displacement maps (texture before sampling/hiding)
Split and dice along surface parameters (u, v)
Coherent geometry and texture (filter in texture-space)
4/30/2007
16
REYES Immediate-Mode Pipeline
Command
V t
Command
T fObj t Vertex
Rasterize
Fragment
Texture
Transform
Tessellate
Shade
TextureOpenGL REYES
Object
CS448 Lecture 6 Kurt Akeley, Pat Hanrahan, Spring 2007
Display
Rasterize
Display Image
Primitive Generation (GPUs)
Why add primitive generation stage?
Some important reasons:
Provide high-level primitives to applications
Shift computation from CPU to GPU
Reduce storage requirements
Perverse desire to complicate the GPU design ;-)
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Perverse desire to complicate the GPU design ; )
Highest-priority reason:
Reduce CPU to GPU data rate!
4/30/2007
17
Other Rate Reduction Techniques
Display lists
Work well for client-server on network
Works well for CPU-GPU host-graphics memory
Complexity in managing display lists
Geometric compression and decompression
Strips and meshes
Indexed triangle sets
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Indexed triangle sets
Vertex buffer objects
Geometry Compression (Deering ’95)
Geometry Primitive Generation Stage
d
Application
C d
Application
Here? Or Here?
Prim. Generate
Command
Vertex
Rasterization
T t
Vertex
Command
Prim.Generate
Rasterization
T t
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Texture
Fragment
Display
Texture
Fragment
Display
4/30/2007
18
Geometry Shader
Where to put this stage?
Requires primitives, thus after primitive assemblyassembly
Do not duplicate transforms on vertexes (for adaptive tessellation), thus after vertex stage
Assumes shading done in fragment stage
Assumes texturing can be done in vertex stage for displacement mapping
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
p pp g
Geometry Shader
Input: One triangle or line + neighborhood
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Output: 0 to n vertexes defining new primitive
1. Input type may be different than output type
2. May output multiple strips of same output type
4/30/2007
19
Challenges
Enabling parallelism while enforcing order
Introduces dependency
Difficult to load balanceVariable amount of computation
Difficult to manage stateVariable amount of output
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Solution
Limited size output buffer1024 32-bit floats per invocation
Problems with Primitive Generation
Cracks and T-vertexes
Different surfaces must abuthBut arithmetic is of finite precision
Generally requires “stitching”
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
4/30/2007
20
Stitching Patches Together
4x4 Patch 5x5 Patch
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Stitching Patches Together
4x4 Patch 5x5 Patch
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Stitching vertexes introduced
4/30/2007
21
Problems with Primitive Generation
Cracks and T-vertexes
Different surfaces must abuthBut arithmetic is of finite precision
Generally requires “stitching”
Parallel adaptive tessellation is hard (good project)
Required by REYES algorithm to match
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
q y gtessellation rate with sampling rate
Themes
Correctness
Invariance
Consistency
Parallelism
Dynamic load balancing due to variable work
Ordering constraints in the graphics pipeline
Rasterization and generation are complex
CS448 Lecture 7 Kurt Akeley, Pat Hanrahan, Spring 2007
Rasterization and generation are complex
Evolution
Where to add the next stage?