Traditional Graphics Pipeline
[Diagram: the CPU application sends 3D vertices into the pipeline. Stages shown: Display List, Polynomial Evaluator, Per-Vertex Operations & Primitive Assembly (transform, lighting, projection, clipping, etc.), Rasterization, Per-Fragment Operations, Frame Buffer, with Pixel Operations and Texture Memory alongside.]
Traditional Graphics Pipeline
A simplified graphics pipeline
– Note that pipe widths vary
– Many caches, FIFOs, and so on not shown
[Diagram: CPU (Application) feeds the GPU stages Transform, Rasterizer, and Shade, which write to Video Memory (Textures). Data between stages: Vertices (3D), then Xformed, Lit Vertices (2D), then Fragments (pre-pixels), then Final pixels (Color, Depth). Also shown: Graphics State and Render-to-texture.]
Modern Graphics Pipeline
Programmable vertex processor!
Programmable pixel processor!
[Diagram: CPU (Application) feeds the GPU stages Vertex Processor, Rasterizer, and Pixel (Fragment) Processor, which write to Video Memory (Textures). Data between stages: Vertices (3D), then Xformed, Lit Vertices (2D), then Fragments (pre-pixels), then Final pixels (Color, Depth). Also shown: Graphics State and Render-to-texture.]
Using Programmability
~2000-2002: ASM
    !!VP1.0
    #
    # c[0-3] = modelview projection (composite) matrix
    # c[4-7] = modelview inverse transpose
    # c[32] = eye-space light direction
    # c[33] = eye-space half-angle vector (infinite viewer)
    # c[35].x = diffuse light * mat.
    # c[35].y = ambient light * mat.
    # c[36] = specular color
    # c[38].x = specular power
    # outputs homogenous position and color
    #
    DP4 o[HPOS].x, c[0], v[OPOS];      # Compute position.
    DP4 o[HPOS].y, c[1], v[OPOS];
    DP4 o[HPOS].z, c[2], v[OPOS];
    DP4 o[HPOS].w, c[3], v[OPOS];
    DP3 R0.x, c[4], v[NRML];           # Compute normal.
    DP3 R0.y, c[5], v[NRML];
    DP3 R0.z, c[6], v[NRML];           # R0 = N' = transformed normal
    DP3 R1.x, c[32], R0;               # R1.x = Ldir DOT N'
    DP3 R1.y, c[33], R0;               # R1.y = H DOT N'
    MOV R1.w, c[38].x;                 # R1.w = specular power
    LIT R2, R1;                        # Compute lighting values
    MAD R3, c[35].x, R2.y, c[35].y;    # diffuse + ambient
    MAD o[COL0].xyz, c[36], R2.z, R3;  # + specular
    END
Now: C-like
    vertout main(appin IN,
                 uniform float4x4 ModelViewProj,
                 uniform float4x4 ModelViewIT,
                 uniform float3 lightVec,
                 uniform float3 halfVec,
                 uniform float3 diffuseMaterial,
                 uniform float3 ambientCol,
                 uniform float3 specularMaterial,
                 uniform float specexp)
    {
        vertout OUT; //struct w/ HPosition, Color
        OUT.HPosition = mul(ModelViewProj, IN.Position);
        float3 normalVec = normalize(mul(ModelViewIT, IN.Normal).xyz);
        float diffuse = dot(normalVec, lightVec);
        float spec = dot(normalVec, halfVec);
        float4 lighting = lit(diffuse, spec, specexp);
        OUT.Color.rgb = lighting.y * diffuseMaterial + ambientCol
                      + lighting.z * specularMaterial;
        OUT.Color.a = 1.0;
        return OUT;
    }
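To make the lighting step in the C-like shader easier to follow, here is a rough CPU-side emulation in Python. It is only a sketch: the helper names (shade_vertex, lit, normalize) are illustrative, the normal is assumed to be in eye space already (the ModelViewIT multiply is left out), and lit() follows the documented behavior of the Cg standard-library function of the same name.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def lit(n_dot_l, n_dot_h, m):
    """Emulates Cg's lit(): returns (ambient, diffuse, specular, 1)."""
    diffuse = max(n_dot_l, 0.0)
    # Specular is zero when the surface faces away from the light or half vector.
    if n_dot_l < 0.0 or n_dot_h < 0.0:
        specular = 0.0
    else:
        specular = math.pow(n_dot_h, m)
    return (1.0, diffuse, specular, 1.0)

def shade_vertex(normal, light_vec, half_vec,
                 diffuse_mat, ambient_col, specular_mat, specexp):
    # Assumption: 'normal' is already in eye space (no ModelViewIT multiply here).
    n = normalize(normal)
    lighting = lit(dot(n, light_vec), dot(n, half_vec), specexp)
    rgb = tuple(lighting[1] * d + a + lighting[2] * s
                for d, a, s in zip(diffuse_mat, ambient_col, specular_mat))
    return rgb + (1.0,)  # alpha is fixed at 1.0, as in the shader

front = shade_vertex((0.0, 0.0, 2.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0),
                     (0.5, 0.5, 0.5), (0.1, 0.1, 0.1), (0.2, 0.2, 0.2), 8.0)
back = shade_vertex((0.0, 0.0, 2.0), (0.0, 0.0, -1.0), (0.0, 0.0, -1.0),
                    (0.5, 0.5, 0.5), (0.1, 0.1, 0.1), (0.2, 0.2, 0.2), 8.0)
```

A vertex facing the light receives diffuse, ambient, and specular terms; a vertex facing away keeps only the ambient color.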
Traditional OpenGL Pipeline
[Diagram: the fixed-function OpenGL pipeline. The CPU application sends 3D vertices into the stages Display List, Polynomial Evaluator, Per-Vertex Operations & Primitive Assembly, Rasterization, Per-Fragment Operations, and Frame Buffer, with Pixel Operations and Texture Memory alongside.]
Vertex Proc. Capabilities
Lighting, material and geometry flexibility
Vertex processor replaces the following:
– Vertex transformation
– Normal transformation & normalization
– Lighting
– Color material application
– Clamping of colors
– Texture coordinate generation
– Texture coordinate transformation
The vertex shader does NOT replace:
– Perspective divide and viewport mapping
– Frustum and user clipping
– Backface culling
– Primitive assembly
– Two-sided lighting selection
– Polygon offset
– Polygon mode
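Since the perspective divide and viewport mapping stay in fixed-function hardware, a vertex program only has to output a homogeneous clip-space position. The following Python sketch shows what happens to that position afterwards. It assumes the usual OpenGL conventions (normalized device coordinates in [-1, 1], depth range [0, 1] by default); the function names are illustrative.

```python
def perspective_divide(clip):
    """Clip space (x, y, z, w) -> normalized device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)

def viewport_map(ndc, vx, vy, width, height, near=0.0, far=1.0):
    """NDC in [-1, 1] -> window coordinates, OpenGL-style."""
    xn, yn, zn = ndc
    xw = vx + (xn + 1.0) * 0.5 * width
    yw = vy + (yn + 1.0) * 0.5 * height
    zw = near + (zn + 1.0) * 0.5 * (far - near)
    return (xw, yw, zw)

# Hypothetical vertex-program output and a 640x480 viewport at the origin.
ndc = perspective_divide((2.0, -2.0, 0.0, 4.0))
window = viewport_map(ndc, 0, 0, 640, 480)
```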
Vertices: What You Get (old)
NVidia GeForce 2, 3
Vertex Attributes (16x4 registers, read-only)
– Color, position, tex coords, fog weight, etc.
Uniforms (96x4 registers, read-only)
– Same for all object vertices
– Xform matrices, light dir/pos, bone weights, etc.
Temporary Registers (12x4 registers, read/write-able)
Vertex Program (128 ASM instructions)
– Mult by persp./modelview mtx
– Compute color
– Per-vtx shade
Vertex Output
– Color, position, tex coords, etc.
Vertices: What You Don’t Get
Connectivity (neighbor face, edge, vtx)
Can’t Create/Destroy Vertices (Geom/Tess)
Large Writable Memory
Vertices: Expensive (Slow!) Ops
Branches (if, for, while)
Large R/O Memory (textures)
Vertices: Workarounds
Connectivity (neighbor face, edge, vtx)
– Encode neighbor info as attributes
Can’t Create/Destroy Vertices
– Create: start w/ more than you need & specialize
– Destroy: move outside clip volume
Large Writable Memory
– Vertex programs have none, but fragments do (the frame buffer)
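The "destroy" workaround can be sketched in Python as follows. This is an illustration only: the 'alive' flag is a hypothetical per-vertex attribute, OUTSIDE is an arbitrary sentinel position, and the clip test uses the standard OpenGL clip-volume condition (-w <= x, y, z <= w). A primitive is only discarded if all of its vertices end up outside on the same side of the clip volume.

```python
# Arbitrary sentinel: x, y, and z all exceed w, so the point fails the clip test.
OUTSIDE = (2.0, 2.0, 2.0, 1.0)

def inside_clip_volume(p):
    """Standard OpenGL clip test on a homogeneous position."""
    x, y, z, w = p
    return w > 0.0 and all(-w <= c <= w for c in (x, y, z))

def vertex_program(position, alive):
    """Pass the position through, or 'destroy' the vertex by moving it away.

    'alive' stands in for a hypothetical per-vertex attribute set by the app.
    """
    return position if alive else OUTSIDE

kept = vertex_program((0.0, 0.0, 0.0, 1.0), alive=True)
dropped = vertex_program((0.0, 0.0, 0.0, 1.0), alive=False)
```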
Vertices: Efficiency
Branches (if, for, while)
– (a<1)?b:c; unroll loops
Large R/O Memory (textures)
– Can put small tables in uniform arrays
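The conditional (a<1)?b:c can be computed without a branch by turning the comparison into a 0/1 mask and blending with a multiply-add. The Python sketch below emulates this with scalar stand-ins for the SLT and MAD instructions; select_lt is an illustrative name, not part of any API.

```python
def slt(a, b):
    """SLT: set on less than. Returns 1.0 if a < b, else 0.0."""
    return 1.0 if a < b else 0.0

def mad(a, b, c):
    """MAD: multiply-add, a * b + c."""
    return a * b + c

def select_lt(a, threshold, b, c):
    """Branch-free (a < threshold) ? b : c, computed as c + mask * (b - c)."""
    mask = slt(a, threshold)
    return mad(mask, b - c, c)

taken = select_lt(0.5, 1.0, 10.0, 20.0)      # condition true, picks b
not_taken = select_lt(2.0, 1.0, 10.0, 20.0)  # condition false, picks c
```

Both sides of the conditional are always evaluated, which is why this only pays off when b and c are cheap to compute.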
Vertex Programming
Vertex Attributes

Attribute   Conventional per-vertex   Aliased Conventional   Conventional
Register    Attribute                 Command                Mapping
0           vertex position           glVertex               x,y,z,w
1           vertex weights            glVertexWeightEXT      w,0,0,1
2           normal                    glNormal               x,y,z,1
3           primary color             glColor                r,g,b,a
4           secondary color           glSecondaryColorEXT    r,g,b,1
5           fog coordinate            glFogCoordEXT          fc,0,0,1
6           -                         -                      -
7           -                         -                      -
8           texture coord 0           glTexCoord             s,t,r,q
9           texture coord 1           glMultiTexCoord        s,t,r,q
10          texture coord 2           glMultiTexCoord        s,t,r,q
11          texture coord 3           glMultiTexCoord        s,t,r,q
12          texture coord 4           glMultiTexCoord        s,t,r,q
13          texture coord 5           glMultiTexCoord        s,t,r,q
14          texture coord 6           glMultiTexCoord        s,t,r,q
15          texture coord 7           glMultiTexCoord        s,t,r,q
Semantics defined by program NOT parameter name!
Program Parameters
• Matrices can be “tracked”
• Tracking makes a matrix automatically available in the vertex program’s parameter registers
• MODELVIEW, PROJECTION, TEXTUREi, and others can each be mapped to 4 program parameter registers
• Mapping can be IDENTITY, TRANSPOSE, INVERSE, or INVERSE_TRANSPOSE
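As a rough illustration of what tracking does, the Python sketch below stores one row of a 4x4 matrix per parameter register, starting at a base index, after applying the chosen mapping. The names track_matrix and TRANSFORMS are hypothetical; the real mechanism lives in the driver, and the row-per-register layout is an assumption based on how the DP4 instructions in the example program consume c[0] to c[3]. The inverse uses plain Gauss-Jordan elimination and assumes a non-singular matrix.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def invert(m):
    """Gauss-Jordan inverse with partial pivoting (assumes m is non-singular)."""
    n = len(m)
    a = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * pv for v, pv in zip(a[r], a[col])]
    return [row[n:] for row in a]

TRANSFORMS = {
    "IDENTITY": lambda m: [list(row) for row in m],
    "TRANSPOSE": transpose,
    "INVERSE": invert,
    "INVERSE_TRANSPOSE": lambda m: transpose(invert(m)),
}

def track_matrix(registers, base, matrix, transform="IDENTITY"):
    """Write the mapped matrix into registers[base .. base+3], one row each."""
    for i, row in enumerate(TRANSFORMS[transform](matrix)):
        registers[base + i] = tuple(row)

registers = [(0.0, 0.0, 0.0, 0.0) for _ in range(96)]  # 96x4 parameter registers
translate = [[1.0, 0.0, 0.0, 2.0],
             [0.0, 1.0, 0.0, 3.0],
             [0.0, 0.0, 1.0, 4.0],
             [0.0, 0.0, 0.0, 1.0]]
track_matrix(registers, 0, translate)                       # c[0-3]
track_matrix(registers, 4, translate, "INVERSE_TRANSPOSE")  # c[4-7]
```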
Assembly Language
Why do we care?
– Need to understand the limitations of the high-level language
– Efficiency!
– “Basic” ASM capabilities (e.g., negate) are “free”
– Understand the graphics HW
Assembly Language
• SIMD instruction set
• Four operations simultaneously
• 17 instructions
• Operate on scalar or 4-vector input
• Result in a vector or replicated scalar output
Assembly Language
Instruction Format:
Opcode dst, [-]s0 [,[-]s1 [,[-]s2]]; #comment
– Opcode: instruction name
– dst: destination register
– s0, s1, s2: source registers 0, 1, and 2
Example:
MOV R1, R2;
Copies all four components (x, y, z, w) of R2 into R1.
Assembly Language
Source registers undergo an input mapping before operation occurs…
• Negation
• Swizzling
• Smearing
Assembly Language
Source registers can be negated:
MOV R1, -R2;
Register values (x, y, z, w):
      before                   after
R1    ( 0.0, 0.0, 0.0, 0.0)    (-7.0, -3.0, -6.0, -2.0)
R2    ( 7.0, 3.0, 6.0, 2.0)    ( 7.0,  3.0,  6.0,  2.0)
Assembly Language
Source registers can be “swizzled":
MOV R1, R2.yzwx;
Register values (x, y, z, w):
      before                   after
R1    ( 0.0, 0.0, 0.0, 0.0)    ( 3.0, 6.0, 2.0, 7.0)
R2    ( 7.0, 3.0, 6.0, 2.0)    ( 7.0, 3.0, 6.0, 2.0)
Assembly Language
Source registers can be negated and “swizzled":
MOV R1, -R2.yzzx;
Register values (x, y, z, w):
      before                   after
R1    ( 0.0, 0.0, 0.0, 0.0)    (-3.0, -6.0, -6.0, -7.0)
R2    ( 7.0, 3.0, 6.0, 2.0)    ( 7.0,  3.0,  6.0,  2.0)
Assembly Language
Source registers can be swizzled by “smearing":
MOV R1, R2.w; # alternative to using R2.wwww
Register values (x, y, z, w):
      before                   after
R1    ( 0.0, 0.0, 0.0, 0.0)    ( 2.0, 2.0, 2.0, 2.0)
R2    ( 7.0, 3.0, 6.0, 2.0)    ( 7.0, 3.0, 6.0, 2.0)
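The three input mappings (negation, swizzling, smearing) can be emulated in a few lines of Python. This is a sketch of the semantics only, with registers modeled as 4-tuples; read_source is an illustrative helper name. The values reproduce the R2 register used in the examples above.

```python
COMPONENTS = "xyzw"

def read_source(reg, swizzle="xyzw", negate=False):
    """Apply the source-operand mapping: swizzle (or smear), then negate."""
    if len(swizzle) == 1:
        swizzle = swizzle * 4  # smearing: R2.w means R2.wwww
    values = tuple(reg[COMPONENTS.index(c)] for c in swizzle)
    return tuple(-v for v in values) if negate else values

R2 = (7.0, 3.0, 6.0, 2.0)
negated = read_source(R2, negate=True)           # MOV R1, -R2;
swizzled = read_source(R2, "yzwx")               # MOV R1, R2.yzwx;
both = read_source(R2, "yzzx", negate=True)      # MOV R1, -R2.yzzx;
smeared = read_source(R2, "w")                   # MOV R1, R2.w;
```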
Assembly Language
Destination register can mask which components are written to…
R1 write all components
R1.x write only x component
R1.xw write only x, w components
Assembly Language
Destination register masking:
MOV R1.xw, -R2;
Register values (x, y, z, w):
      before                   after
R1    ( 0.0, 0.0, 0.0, 0.0)    (-7.0, 0.0, 0.0, -2.0)
R2    ( 7.0, 3.0, 6.0, 2.0)    ( 7.0, 3.0, 6.0,  2.0)
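Destination masking can be emulated the same way: only the components named in the mask are overwritten, and the rest keep their old values. In this Python sketch, write_dest is an illustrative helper and registers are plain 4-tuples.

```python
def write_dest(dst, result, mask="xyzw"):
    """Return the new destination register: masked components take 'result',
    unmasked components keep their previous value from 'dst'."""
    return tuple(result[i] if c in mask else dst[i]
                 for i, c in enumerate("xyzw"))

R1 = (0.0, 0.0, 0.0, 0.0)
R2 = (7.0, 3.0, 6.0, 2.0)
neg_R2 = tuple(-v for v in R2)

# MOV R1.xw, -R2;
R1 = write_dest(R1, neg_R2, "xw")
```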
The Instruction Set
GeForce 3: 17 instructions in total …
• ARL   address register load, for indexed access [x]
• MOV
• MUL
• ADD
• MAD   a*b+c
• RCP   1/x
• RSQ   1/sqrt(x), used to normalize
• DP3
• DP4
• DST
• MIN
• MAX
• SLT   <
• SGE   ≥
• EXP   2^x
• LOG
• LIT   Phong lighting coefficients
The Instruction Set
What about more complex instructions?
Absolute Value: MAX R1, R1, -R1;
Division: RCP; MUL
Matrix Transform: DP4; DP4; DP4; DP4
Cross-Product: MUL; MAD
Others…
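The compositions listed above can be checked with a small Python emulation. This is a sketch under stated assumptions: registers are 4-tuples, each function mirrors one instruction's arithmetic only, and the swizzle patterns used for the cross product are one common choice rather than the only possible one.

```python
def swz(r, pattern):
    """Swizzle a 4-tuple register, e.g. swz(r, 'yzxw')."""
    return tuple(r["xyzw".index(c)] for c in pattern)

def neg(r):
    return tuple(-v for v in r)

def mul(a, b):
    return tuple(x * y for x, y in zip(a, b))

def mad(a, b, c):
    return tuple(x * y + z for x, y, z in zip(a, b, c))

def vmax(a, b):
    return tuple(max(x, y) for x, y in zip(a, b))

def rcp(x):
    """RCP produces a scalar result replicated to all four components."""
    return (1.0 / x,) * 4

def dp4(a, b):
    return sum(x * y for x, y in zip(a, b))

def absolute(r):             # MAX R1, R1, -R1;
    return vmax(r, neg(r))

def divide(a, scalar):       # RCP; MUL
    return mul(a, rcp(scalar))

def transform(rows, v):      # DP4; DP4; DP4; DP4 (one matrix row per register)
    return tuple(dp4(row, v) for row in rows)

def cross(a, b):             # MUL; MAD
    t = mul(swz(a, "zxyw"), swz(b, "yzxw"))
    return mad(swz(a, "yzxw"), swz(b, "zxyw"), neg(t))

abs_result = absolute((-7.0, 3.0, -6.0, 2.0))
div_result = divide((8.0, 2.0, 6.0, 1.0), 4.0)
cross_result = cross((1.0, 2.0, 3.0, 0.0), (4.0, 5.0, 6.0, 0.0))
moved = transform([(1.0, 0.0, 0.0, 2.0), (0.0, 1.0, 0.0, 3.0),
                   (0.0, 0.0, 1.0, 4.0), (0.0, 0.0, 0.0, 1.0)],
                  (1.0, 1.0, 1.0, 1.0))
```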
Limitations
~GeForce 3
– No vertex textures (tables need to fit in uniforms)
– No branches
– Few uniforms (96x4), attributes (16x4), ops (128), registers (8), etc.
Latest & Greatest (as of 10/’04)
– 1 vertex texture, but very expensive!
– Branches very slow (better to unroll)
Example Apps.
• Custom transform, lighting, and skinning
• Custom cartoon-style lighting
• Per-vertex setup for per-pixel bump mapping
• Character morphing & shadow volume projection
• Dynamic displacements of surfaces by objects