STEPHAN HODES DEVELOPER TECHNOLOGY ENGINEER, AMD INTRODUCING FIDELITYFX VARIABLE SHADING


STEPHAN HODES

DEVELOPER TECHNOLOGY ENGINEER, AMD

INTRODUCING FIDELITYFX VARIABLE SHADING


AMD Public | Introducing FidelityFX Variable Shading | November 2020

INTRODUCTION TO VARIABLE RATE SHADING

• Variable Rate Shading (VRS) is a feature of DirectX® 12 Ultimate

• The goal of VRS is to save GPU work where it does not significantly contribute to the final frame

• Games today are usually played at very high resolution

• Pixels are very small on screen

• Adjacent pixels often have similar color (if they belong to the same primitive)

• Postprocessing effects like antialiasing, depth of field, or motion blur further reduce the difference between adjacent pixels


THE CONCEPT OF VRS

• Without VRS, 4 pixel shader (PS) threads are generated for every quad in which at least one pixel is covered by a primitive

• For every pixel whose sample position is covered by the primitive, the result of the PS is written to the render target

• Example: 10 quads / 40 PS threads (27 active)

• With VRS, one or multiple pixels form a coarse pixel (2x2 in this example)

• Example: 4 quads / 16 PS threads (10 active)

• VRS only reduces shading quality within a triangle; the geometry edges are preserved

Make sure to use centroid interpolation!

[Figure: pixel grid showing a quad, a 2x2 coarse pixel, pixel centers, and sample positions]
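The thread counts in the example can be checked with a little arithmetic. The sketch below is only an illustration: `PsWork` and the helper names are made up here, and the quad and active-thread counts are simply the numbers quoted on the slide.

```cpp
#include <cassert>

// Pixel shader (PS) work caused by one primitive.
struct PsWork {
    int quads;          // quads touched by the primitive
    int activeThreads;  // threads whose sample position is covered
};

// Every touched quad launches 4 PS threads, covered or not.
int launchedThreads(const PsWork& w) { return w.quads * 4; }

// Percentage of launched threads that actually write a result.
double utilizationPercent(const PsWork& w) {
    assert(w.quads > 0);
    return 100.0 * w.activeThreads / launchedThreads(w);
}

// Percentage of launched PS threads saved by shading `coarse` instead of `fine`.
double launchedSavingsPercent(const PsWork& fine, const PsWork& coarse) {
    assert(fine.quads > 0);
    return 100.0 * (launchedThreads(fine) - launchedThreads(coarse)) / launchedThreads(fine);
}
```

With the slide's numbers (10 quads and 27 active threads without VRS, 4 quads and 10 active threads at 2x2), the launched thread count drops from 40 to 16, which is 60% less PS work for this primitive.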


VRS ON RDNA2

• VRS has multiple ways to control shading rate
  • Per draw call (VRS Tier 1)

  • Per primitive (VRS Tier 2, VS/GS output)

  • Per screen tile (VRS Tier 2, image based)
    • 8x8 pixels tile size

    • Small tile size provides fine-grained control

• Additional shading rates not supported
  • At standard resolutions, 4x can hardly be used without generating visual artifacts

  • Additional shading rates make image generation more complex

• VRS image gets copied into H-tile on bind
  • Small (but not negligible) overhead when binding the VRS image

  • No overhead during rendering!

[Figure: a VRS tile divided into pixels and coarse pixels, showing 2x2, 2x1, 1x2 and 1x1 shading]


DEMO


IMPLEMENTING VRS (INITIALIZATION)

o Query hardware details:

• Is VRS supported / which shading rates?

• Supporting 4x4 shading rate makes the image generation shader more complex

• 4x4 is likely to cause visible quality degradation at common resolutions

• Is Tier 2 (shader or image based VRS) supported?

• For image based: what is the tile size?

o Create VRS image & generation shader

void OnCreate(Device *pDevice, ResourceViewHeaps *pResourceViewHeaps, DynamicBufferRing *pConstantBufferRing, StaticBufferPool *pStaticBufferPool, DXGI_FORMAT overlayOutputFormat);
void OnDestroy();
void OnCreateWindowSizeDependentResources(uint32_t w, uint32_t h);
void OnDestroyWindowSizeDependentResources();
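The initialization questions above boil down to a small decision. In D3D12 the answers come from `ID3D12Device::CheckFeatureSupport` with `D3D12_FEATURE_D3D12_OPTIONS6`, which reports `VariableShadingRateTier`, `AdditionalShadingRatesSupported` and `ShadingRateImageTileSize`. The sketch below keeps that query out of the picture so it stays self-contained: `VrsCaps`, `VrsSetup` and `chooseVrsSetup` are hypothetical names, not part of the sample or of D3D12.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the capability values read from the API.
struct VrsCaps {
    int tier;                     // 0 = not supported, 1 = Tier 1, 2 = Tier 2
    bool additionalShadingRates;  // 2x4, 4x2 and 4x4 available
    uint32_t imageTileSize;       // tile size in pixels, 0 without image based VRS
};

// Hypothetical settings an application might derive from the capabilities.
struct VrsSetup {
    bool enableVrs;
    bool useVrsImage;
    bool useAdditionalShadingRates;
    uint32_t tileSize;
};

VrsSetup chooseVrsSetup(const VrsCaps& caps, bool allow4x4) {
    VrsSetup s{};
    s.enableVrs = caps.tier >= 1;
    // Image based VRS needs Tier 2 and a valid tile size.
    s.useVrsImage = caps.tier >= 2 && caps.imageTileSize > 0;
    // 4x4 complicates the generation shader, so it is opt-in here.
    s.useAdditionalShadingRates = s.useVrsImage && caps.additionalShadingRates && allow4x4;
    s.tileSize = s.useVrsImage ? caps.imageTileSize : 0;
    return s;
}
```

Passing `allow4x4 = false` reflects the advice above: even where the hardware offers the additional rates, an application may prefer to stay with rates up to 2x2.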


IMPLEMENTING VRS (RENDERING)

• Compute VRS image

• Bind VRS image

• Set base shading rate and combiners

• Unbind VRS image when done

• [Render VRS image as overlay for debugging]

[Figure: BaseShadingRate and VS/GS Output feed the first combiner; its result and the VRS Image feed the second combiner, which produces the Final Rate]

void ComputeVrsMap(ID3D12GraphicsCommandList*, CBV_SRV_UAV*);
void SetShadingRate(D3D12_SHADING_RATE, const D3D12_SHADING_RATE_COMBINER*, ID3D12GraphicsCommandList*);
void StartVrsRendering(ID3D12GraphicsCommandList*);
void EndVrsRendering(ID3D12GraphicsCommandList*);
void DrawOverlay(ID3D12GraphicsCommandList*);
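How the base rate, the per-primitive rate and the VRS image interact can be modeled on the CPU. The following is a simplified sketch, not the sample's code: `Rate`, `Combiner`, `combine` and `finalRate` are illustrative names, rates are stored as log2 of the coarse pixel size per axis, and the SUM combiner of D3D12 is deliberately left out.

```cpp
#include <algorithm>
#include <cassert>

// Shading rate as log2 of the coarse pixel size per axis (0 = 1x, 1 = 2x, 2 = 4x).
struct Rate {
    int x;
    int y;
};

inline bool operator==(Rate a, Rate b) { return a.x == b.x && a.y == b.y; }

// Subset of the D3D12 combiner modes; SUM is not modeled in this sketch.
enum class Combiner { Passthrough, Override, Min, Max };

// Combine the incoming rate `a` with the rate `b` supplied at this stage.
Rate combine(Combiner c, Rate a, Rate b) {
    switch (c) {
        case Combiner::Passthrough: return a;  // ignore this stage's input
        case Combiner::Override:    return b;  // take this stage's input
        case Combiner::Min:         return {std::min(a.x, b.x), std::min(a.y, b.y)};  // finer rate wins
        case Combiner::Max:         return {std::max(a.x, b.x), std::max(a.y, b.y)};  // coarser rate wins
    }
    return a;
}

// Base rate -> combiner 0 (per-primitive rate) -> combiner 1 (VRS image rate).
Rate finalRate(Rate base, Rate perPrimitive, Rate image, Combiner c0, Combiner c1) {
    return combine(c1, combine(c0, base, perPrimitive), image);
}
```

With a 1x1 base rate and both combiners set to passthrough, the final rate stays 1x1 no matter what the bound VRS image contains. That is the mechanism behind the advice in the caveats to disable VRS through the combiners rather than by unbinding the image.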


FIDELITYFX VARIABLE SHADING (CPP)

struct FFX_Variable_Shading_CB
{
    uint32_t width, height;
    uint32_t tileSize;
    float varianceCutoff;
    float motionFactor;
};

static void FFX_VariableShading_GetVrsImageResourceDesc(const uint32_t rtWidth, const uint32_t rtHeight, const uint32_t tileSize, CD3DX12_RESOURCE_DESC& VRSImageDesc);

static void FFX_VariableShading_GetDispatchInfo(const FFX_Variable_Shading_CB* cb, const bool useAditionalShadingRates, uint32_t& numThreadGroupsX, uint32_t& numThreadGroupsY);
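To get a feel for the sizes these two helpers deal with, here is a minimal sketch of the underlying arithmetic. It is not the library's implementation. `vrsImageSize` relies on the VRS image holding one texel per tile, rounded up. `threadGroupCount` assumes the configuration described in the shader overview (8x8 threads per group, one thread per 2x2 coarse pixel, no additional shading rates), so one group covers 16x16 pixels. All names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t divideRoundingUp(uint32_t a, uint32_t b) { return (a + b - 1) / b; }

struct Extent2D {
    uint32_t x;
    uint32_t y;
};

// One texel of the VRS image per tile of the render target.
Extent2D vrsImageSize(uint32_t rtWidth, uint32_t rtHeight, uint32_t tileSize) {
    assert(tileSize > 0);
    return {divideRoundingUp(rtWidth, tileSize), divideRoundingUp(rtHeight, tileSize)};
}

// Assumption: 8x8 threads per group and one thread per 2x2 coarse pixel,
// so a group covers 16x16 pixels (2x2 tiles when the tile size is 8).
Extent2D threadGroupCount(uint32_t rtWidth, uint32_t rtHeight) {
    constexpr uint32_t kPixelsPerGroup1D = 8 * 2;
    return {divideRoundingUp(rtWidth, kPixelsPerGroup1D),
            divideRoundingUp(rtHeight, kPixelsPerGroup1D)};
}
```

For a 1920x1080 render target and 8x8 tiles this gives a 240x135 VRS image and, under the stated assumption, a 120x68 dispatch.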


FIDELITYFX VARIABLE SHADING (HLSL)

// Define: FFX_VARIABLE_SHADING_TILESIZE
// Optional: FFX_VARIABLE_SHADING_ADDITIONALSHADINGRATES

// Constant Buffer
cbuffer FFX_Variable_Shading_CB0 {
    int2 g_Resolution;
    uint g_TileSize;
    float g_VarianceCutoff;
    float g_MotionFactor;
}

// Forward declaration of functions that need to be implemented
// by shader code using this technique
float FFX_VariableShading_ReadLuminance(int2 pos);
float2 FFX_VariableShading_ReadMotionVec2D(int2 pos);
void FFX_VariableShading_WriteVrsImage(int2 pos, uint value);


FIDELITYFX VARIABLE SHADING (HLSL USAGE)

// define FFX_VARIABLE_SHADING_TILESIZE on compile!
// may define FFX_VARIABLE_SHADING_ADDITIONALSHADINGRATES
RWTexture2D<uint> imgDestination : register(u0);
Texture2D texColor : register(t0);
Texture2D texVelocity : register(t1);

#define FFX_HLSL 1
#include "ffx_Variable_Shading.h"

float FFX_VariableShading_ReadLuminance(int2 pos) {
    float3 color = texColor[pos].xyz;
    return dot(color, float3(0.30, 0.59, 0.11));
}

float2 FFX_VariableShading_ReadMotionVec2D(int2 pos) {
    return texVelocity[pos].xy * float2(0.5f, -0.5f) * g_Resolution;
}
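The two callbacks are plain arithmetic and can be mirrored on the CPU, for example to sanity-check values while tuning. This sketch assumes the velocity texture stores motion in normalized device coordinates (the screen spans 2 units per axis, +y up), which is what the `float2(0.5f, -0.5f) * g_Resolution` scale suggests; the `Float2` and `Float3` types and both function names are illustrative.

```cpp
#include <cassert>
#include <cmath>

struct Float2 {
    float x;
    float y;
};

struct Float3 {
    float x;
    float y;
    float z;
};

// Weighted sum of the color channels, using the same weights as the shader above.
float luminance(Float3 c) {
    return c.x * 0.30f + c.y * 0.59f + c.z * 0.11f;
}

// Converts an NDC-space velocity (assumed convention) into a motion vector in pixels.
Float2 motionInPixels(Float2 ndcVelocity, int width, int height) {
    return {ndcVelocity.x * 0.5f * static_cast<float>(width),
            ndcVelocity.y * -0.5f * static_cast<float>(height)};
}
```

Under that convention, a velocity of (1, -1) at 1920x1080 corresponds to a movement of half the screen in each direction, that is 960 pixels right and 540 pixels down.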


HOW THE SHADER WORKS

1. One threadgroup computes between 1 and 4 tiles
  • Without additional shading rates and with an 8x8 tile size, each group of 8x8 threads computes the shading rate for 4 tiles

2. Analyze pairs of pixels within a 2x2 region
  • Using LDS

  • Each thread also takes pixels outside the 2x2 box into account. This avoids burn-in (i.e. a low VRS rate persisting because it was low in the last frame)

  • Reduce luminance delta by motion influence

3. Compute the largest luminance delta within the tile
  • Using wave intrinsics

4. Compare to threshold

5. Write out the VRS image

[Figure: a VRS tile made up of coarse pixels (1 thread each), with the area analyzed by 1 thread highlighted]
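Steps 2 to 5 can be illustrated with a sequential CPU version for a single tile. This is a simplified sketch under stated assumptions, not the FidelityFX shader: the GPU version spreads the work over threads with LDS and wave intrinsics, and the motion model used here (subtracting `motionFactor * motionLength` from the deltas) is an assumption made for illustration. The result uses the D3D12 shading rate encoding, `(log2(x) << 2) | log2(y)`, restricted to 1x1, 1x2, 2x1 and 2x2.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// `lum` holds tileSize * tileSize luminance values in row-major order.
uint32_t tileShadingRate(const std::vector<float>& lum, int tileSize,
                         float motionLength, float varianceCutoff, float motionFactor) {
    assert(static_cast<int>(lum.size()) == tileSize * tileSize);
    float maxDeltaX = 0.0f;  // largest difference between horizontal neighbours
    float maxDeltaY = 0.0f;  // largest difference between vertical neighbours
    for (int y = 0; y < tileSize; ++y) {
        for (int x = 0; x < tileSize; ++x) {
            const float c = lum[y * tileSize + x];
            if (x + 1 < tileSize)
                maxDeltaX = std::max(maxDeltaX, std::fabs(c - lum[y * tileSize + x + 1]));
            if (y + 1 < tileSize)
                maxDeltaY = std::max(maxDeltaY, std::fabs(c - lum[(y + 1) * tileSize + x]));
        }
    }
    // Assumed motion model: faster motion hides more detail, so it lowers the deltas.
    const float reduction = motionFactor * motionLength;
    maxDeltaX = std::max(0.0f, maxDeltaX - reduction);
    maxDeltaY = std::max(0.0f, maxDeltaY - reduction);

    uint32_t rate = 0;                                // 1x1
    if (maxDeltaX < varianceCutoff) rate |= 1u << 2;  // coarse pixel is 2 pixels wide
    if (maxDeltaY < varianceCutoff) rate |= 1u;       // coarse pixel is 2 pixels high
    return rate;
}
```

A uniformly colored tile ends up at 2x2, while a tile with vertical stripes keeps full horizontal resolution and is only coarsened vertically (1x2). With enough motion, the striped tile drops to 2x2 as well.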


VRS OVERLAY

• Display the VRS image as an overlay for debugging and tweaking

• Ready-to-use code in VrsOverlay.hlsl

• Easy to integrate:

1. Build VS/PS pipeline and bind it

2. Provide a constant buffer containing resolution and tile size (same as for VRS image generation)

3. Draw a single triangle (no vertex or index buffers required)
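Drawing a single triangle without vertex or index buffers usually means deriving the vertex positions from the vertex index alone. The contents of VrsOverlay.hlsl are not shown here, so the following is only a CPU sketch of that common technique, with illustrative names; the vertex shader in the sample may differ in detail.

```cpp
#include <cassert>
#include <cstdint>

struct Float2 {
    float x;
    float y;
};

// Texture coordinate for vertex 0, 1 or 2: (0,0), (2,0), (0,2).
Float2 fullscreenTriangleUv(uint32_t vertexId) {
    assert(vertexId < 3);
    return {static_cast<float>((vertexId << 1) & 2u), static_cast<float>(vertexId & 2u)};
}

// Clip-space position: (-1,1), (3,1), (-1,-3). The visible [-1,1] square
// lies entirely inside this triangle, so one draw covers the whole screen.
Float2 fullscreenTrianglePosition(uint32_t vertexId) {
    const Float2 uv = fullscreenTriangleUv(vertexId);
    return {uv.x * 2.0f - 1.0f, 1.0f - uv.y * 2.0f};
}
```

Compared with a two-triangle quad, a single oversized triangle has no diagonal edge across the screen, so no pixel quads are shaded twice along it.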


CAVEATS

Since VRS works by reducing the number of PS executions:

• No benefit in depth/stencil-only passes

• No benefit in fill-rate-bound scenarios

• No benefit in compute passes

• Very little benefit if the average triangle size is very small (think of quad utilization)

[Figure: example with 2x2 shading rate, where 10 active PS threads shade 28 pixels across the shading regions]


CAVEATS

Features that cause the shading rate to drop to 1x1:

• Depth export

• Post-depth coverage

• Raster Ordered Access Views

• 16xAA

Minimize the number of times per frame the VRS image gets bound or unbound!

o If VRS needs to be disabled for a few draw calls while the same depth buffer is in use (e.g. to render alpha-tested geometry), the best practice is to leave the VRS image bound and disable VRS by modifying the combiners.


TAKEAWAY

• Easy to integrate

• Free performance

• VRS preserves triangle edges

• Also depth/stencil information

• Experiments showed: 2x2 shading rate is ideal for commonly used resolutions


DISCLAIMER


Gameplay footage captured from DIRT® 5 footage courtesy of Codemasters®. System configuration used: AMD Ryzen 9 3900X Twelve-Core Processor, Corsair Hydro Series, H100x, 240mm Radiator, Dual 120mm PWM Fans, Liquid CPU Cooler, Gigabyte X570 Gaming X AMD AM4 X570 Chipset ATX Motherboard, Corsair 64GB Vengeance LPX DDR4 3200 MHz RAM/Memory Kit 4x16GB, Corsair Force Series Gen.4 PCIe MP600 2TB NVME M.2 SSD w/ Heatsink, Samsung 860 Evo 1TB Solid State Drive/SSD, WD Blue 3TB 3.5" Desktop Hard Drive (HDD), EVGA SuperNOVA 850 G3 Power Supply, Windows 10 OEM.

The information contained herein is for informational purposes only, and is subject to change without notice. While every precaution has been taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no obligation to update or otherwise correct this information. Advanced Micro Devices, Inc. makes no representations or warranties with respect to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other products described herein. No license, including implied or arising by estoppel, to any intellectual property rights is granted by this document. Terms and limitations applicable to the purchase or use of AMD’s products are as set forth in a signed agreement between the parties or in AMD's Standard Terms and Conditions of Sale.

© 2020 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, Epyc, Radeon, RDNA, Ryzen, and combinations thereof are trademarks of Advanced Micro Devices, Inc. DirectX and Xbox are registered trademarks of Microsoft Corporation in the US and other jurisdictions. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.


LEARN MORE AT GPUOPEN.COM

Contact Information:

[email protected]

Twitter: @GPUOpen