DESCRIPTION
Qualcomm® Snapdragon™ processors, a product of Qualcomm Technologies, Inc., boast a long list of technologies, from the CPU and GPU, to audio, video, display, networking and much more. In this session, you’ll learn how to take advantage of these features and technologies to create the best gaming experiences, including all the available tools. Watch this presentation on YouTube: https://www.youtube.com/watch?v=NhbZK_5na7U&list=PLxeazpXYyqtNm2EnCbfSzy7aKOkHjiaSi&index=31 Learn more about developing mobile apps for devices powered by Snapdragon processors: https://developer.qualcomm.com/mobile-development/maximize-hardware/mobile-gaming-graphics-adreno
1 ©2013-2014 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved.
Qualcomm® Snapdragon™ Processors: A Super Gaming Platform
Manish Sirdeshmukh, Product Manager, Staff
Todd LeMoine, Engineer, Principal/Manager
Qualcomm Technologies, Inc.
Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc.
Source: Gartner, October 2013, “Forecast Video Game Ecosystem Worldwide”
Total mobile gaming revenues (for all platforms) are projected to grow from $13 billion in 2013 to $22 billion in 2015
Gaming on mobile today
[Figure: side-by-side comparison, desktop PC vs. mobile (Snapdragon 805)]
“Epic now has brought Unreal Engine 4 to Android with the Snapdragon 800 and 805 chipsets from Qualcomm Technologies,” said Niklas Smedberg, Senior Engine Programmer, Epic Games. “Recently we worked with Qualcomm [QTI] to elevate graphics to the next level on the Snapdragon Adreno GPU hardware, which delivers some of the most power-efficient unified shader capabilities we’ve seen yet for Android smartphones and tablets.”
Image: Modern Combat 5 by Gameloft
What is involved in games?
Gameplay execution (animation): Animation for water movement and anchored boat motion
Gameplay execution (AI): Enemy helicopter controlled by AI
Gameplay execution (physics): Particle physics makes explosions look real
Console-quality graphics: Lens effect on the sunlight breaking through the clouds
Console-quality graphics: Hi-res textures provide rich details to the scene
Console-quality graphics: Bloom glare from gunfire provides an immersive experience
Fast connectivity: Play a mission in multiplayer mode
High-quality video: After completing the level, watch a cut scene transition
Responsive and accurate control: Control the character movement
Multi-screen experience: Mirror your screen to TV
Cinema-quality sound: Hear gunfire, explosions, bullets flying by, and the helicopter’s rotor blades
Snapdragon processors
How is SoC utilized by a game? Heterogeneous hardware blocks and data flow
[Diagram: SoC blocks and data flow during a game.
Inputs: graphics data (textures, shaders, geometry), video data, audio data, input signals.
Blocks: quad-core CPU (CPU #1 to CPU #4, running physics, animation, game logic, and artificial intelligence), GPU (graphics rendering), video decoder, video encoder, DSP (audio decoder), sensor engine, display engine, Wi-Fi engine, system memory.
Labeled flows: GPU reads, display reads, graphics pixel writes, video pixel writes, final frame (to display panel), encoded final frame (to Wi-Fi display panel), audio (to speakers).]
Qualcomm Gobi, Qualcomm Adreno, Qualcomm Hexagon and Krait are products of Qualcomm Technologies, Inc.
Qualcomm® Adreno™ GPU
• Adreno is Qualcomm Technologies, Inc.’s (QTI) integrated GPU
• Adreno 420 is QTI’s latest integrated GPU shipping in Snapdragon 805
• Adreno GPUs are custom designed for mobile use
[Block diagram: Snapdragon 805 processor]
• Qualcomm® Krait™ 450 quad-core CPU
• Adreno 420 GPU: OpenGL ES 2.0/3.1*, OpenCL 1.2 Full
• Snapdragon Display Engine: 4K, Miracast, picture enhancement
• Dual ISPs (imaging): up to 55MP, 1.2GPix/s bw, Camera SW
• Multimedia processing: 4K decode, HEVC decode
• Qualcomm® Hexagon™ DSP: ultra low power sensor engine
• Qualcomm® Gobi™ 9x35 modem: 4th gen CAT 6 LTE, up to 3x20MHz CA
• Fusion 4.5
• Location: GPS, GLONASS, Beidou, Galileo satellites
• Memory: 2x64-bit LPDDR3
• USB 3.0
• Snapdragon Voice Activation, Gestures, Studio Access, Security
*Product is based on a provisional Khronos Specification, and is designed to pass the Khronos Conformance Testing Process when available. Current conformance status can be found at www.khronos.org/conformance.
Adreno 420 GPU highlights
• Desktop and console quality graphics on mobile
• Complete DirectX11 FL 11_2 pipeline, supports OpenGL ES 3.1
• Support for dynamic hardware tessellation & geometry shaders
Richer, visually immersive graphics
[Figure: the same scene rendered with no tessellation and with tessellation]
Adreno 420 supports the most advanced graphics APIs
Feature/API                                       OpenGL ES 3.0   OpenGL ES 3.1   Android Extension Pack
Compute Shader                                    No              Yes             Yes
Atomics                                           No              Yes             Yes
Image Load/Store                                  No              Yes             Yes
Draw Indirect                                     No              Yes             Yes
Texture Gather                                    No              Yes             Yes
Multisample Textures                              No              Yes             Yes
Stencil Textures                                  No              Yes             Yes
Separate Shader Objects                           No              Yes             Yes
Advanced Blending Modes (Programmable Blending)   No              Yes             Yes
Geometry Shaders                                  No              No              Yes
Tessellation Shaders                              No              No              Yes
ASTC, Unified Shaders, FlexRender™ technology
FlexRender is a product of Qualcomm Technologies, Inc.
Adreno 420 GPU highlights
• Improved architecture for performance & efficiency
• Better performance
• Reduced power consumption
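[Figure: FlexRender dynamic switching between direct rendering (Adreno GPU and system memory) and tiled rendering (Adreno GPU, tile buffer, and system memory)]
[Figure: ASTC compression, original at 24bpp compared with ASTC at 8bpp, 3.56bpp, and 2bpp]
[Figure: unified shaders handling pixel, vertex, compute, tessellation, and geometry work]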
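The ASTC bit rates quoted on this slide (8bpp, 3.56bpp, 2bpp) follow from the format's fixed block size: every ASTC block occupies 128 bits, so the rate depends only on the block footprint. The sketch below reproduces those figures for 4x4, 6x6, and 8x8 footprints. The function names are hypothetical helpers for illustration, not part of any Adreno SDK API, and the size estimate ignores mipmaps.

```cpp
#include <cassert>

// ASTC stores every block in 128 bits regardless of its footprint, so the
// bit rate depends only on how many texels one block covers.
constexpr double kAstcBitsPerBlock = 128.0;

// Bits per pixel for a 2D ASTC footprint of blockWidth x blockHeight texels.
inline double astcBitsPerPixel(int blockWidth, int blockHeight) {
    assert(blockWidth > 0 && blockHeight > 0);
    return kAstcBitsPerBlock / (blockWidth * blockHeight);
}

// Compressed size in bytes of a width x height texture (single mip level),
// rounding partial blocks up to whole blocks.
inline long long astcTextureBytes(int width, int height,
                                  int blockWidth, int blockHeight) {
    assert(width > 0 && height > 0 && blockWidth > 0 && blockHeight > 0);
    long long blocksX = (width + blockWidth - 1) / blockWidth;
    long long blocksY = (height + blockHeight - 1) / blockHeight;
    return blocksX * blocksY * 16;  // 128 bits = 16 bytes per block
}
```

For example, `astcBitsPerPixel(6, 6)` evaluates 128 / 36, which rounds to the 3.56bpp shown on the slide.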
Adreno 420 architectural improvements
• DX11.2 3D pipeline
− Hardware tessellation
− Geometry shading
− Stream out from VS, DS, GS
− Programmable blending
• Upgraded compute
− Direct compute, OpenCL 1.2 Full profile
− Faster RenderScript
• Improved texturing
− Improved texture performance
− Support for higher level texture filtering (e.g., Aniso) with less performance impact
− ASTC support, better LOD & filtering quality
− Larger caches: texture cache, L2 cache
• Improved ROPs & Z
− Faster depth rejection
− Designed to achieve peak draw rate more often
[Diagram: hardware tessellation pipeline.
Stages: command processor (input assembler), vertex shader, hull shader (LOD, control patch), tessellator, domain shader (vertex calculation & displacement), geometry shader, rasterizer, pixel shader, render backend.
System memory resources: index buffers, vertex buffers, constant buffers, unordered access resources, texture resources, render targets, textures, buffers, frame buffer.
Also shown: unified shader processor, stream out.]
Adreno GPU architecture
Tiled rendering architecture
Advantages:
• Designed to minimize unnecessary data traffic to host memory
• Designed to minimize power consumption
• Use of transparency / anti-aliasing is inexpensive
Early Z (depth) reject feature
[Figure: objects in background and objects in foreground]
Advantages:
• Designed to prevent unnecessary use of GPU resources in drawing pixels for occluded objects
• Designed to increase overall graphics performance for larger scenes with opaque geometry
Adreno GPU architecture
Dynamic FlexRender technology
[Diagram: FlexRender switches dynamically between direct rendering (Adreno GPU and system memory) and tiled rendering (Adreno GPU, GMEM tile buffer, and system memory)]
Advantages:
• Better performance and power for a wider range of use cases
• More developer flexibility
Double Rate Half Precision (DRHP) design
[Diagram: 1X speed for “highp” shaders, 2X speed for “mediump” shaders]
Advantages:
• Use additional/complex shaders without compromising performance
• Better performance with power efficiency
OpenGL ES optimizations
Optimization: frame buffer objects
Worst case pattern of FBO usage
[Diagram: frame rendering timeline. Frame buffer: clear, draw, store. FBO 0: load, draw, store. Frame buffer: load, draw, store.]
Optimization: frame buffer objects
Optimized render order
[Diagram: frame rendering timeline. FBO 0: invalidate framebuffer, draw, store. Frame buffer: clear, draw, store.]
Optimal rendering order: FBO0 invalidate, FBO0 draw … FBOn invalidate, FBOn draw, FB clear, FB draw
Optimization: dynamic vertex buffer objects
• In the worst case, the complete sequence of VBO updates and draw calls may have to be repeated for each bin
• Even when using glBufferSubData, multiple copies of the entire VBO may need to be maintained by the driver
Worst case pattern of VBO usage
[Diagram: frame rendering timeline with three updates of VBO0 and three draw calls in the same frame]
Optimization: dynamic vertex buffer objects
Optimized dynamic VBO order
[Diagram: frame rendering timeline: update VBO0, draw VBO0]
Or if multiple dynamic VBOs are used
[Diagram: frame rendering timeline: update VBO0, update VBOn, draw VBO0, draw VBOn]
Optimization: sorting
Sorting can reduce both the number of state changes and the amount of overdraw, both of which have a negative impact on GPU performance
• Sort by material
− Reduces shader and texture state changes
• Sort opaque draw calls front-to-back
− Reduces time spent shading fragments which will be overwritten later
− A performance gain of more than 10 ms/frame has been observed in some fragment-bound content with this optimization alone
• Draw the skybox last
− Typically the skybox is covered by foreground geometry in half or more of the screen
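The three sorting rules above can be combined into a single sort key. The following is a minimal sketch, assuming a hypothetical `DrawCall` record with a material ID, a view-space depth, and a skybox flag (none of these names come from the presentation). Putting material ahead of depth in the key is one possible trade-off; fragment-bound content may benefit from prioritizing depth instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical per-draw record for opaque geometry.
struct DrawCall {
    std::uint32_t materialId;  // identifies the shader/texture state
    float viewDepth;           // distance from the camera, smaller is closer
    bool isSkybox;             // true for the skybox draw
};

// Orders opaque draws so that the skybox comes last, draws sharing a
// material are adjacent, and each material group runs front-to-back.
inline void sortOpaqueDrawCalls(std::vector<DrawCall>& calls) {
    for (const DrawCall& call : calls) {
        // NaN depths would break the strict weak ordering the sort needs.
        assert(!std::isnan(call.viewDepth));
    }
    std::stable_sort(calls.begin(), calls.end(),
        [](const DrawCall& a, const DrawCall& b) {
            // false < true, so non-skybox draws sort before the skybox.
            return std::tie(a.isSkybox, a.materialId, a.viewDepth)
                 < std::tie(b.isSkybox, b.materialId, b.viewDepth);
        });
}
```

Transparent draws are deliberately left out: they usually need back-to-front ordering and would be sorted in a separate pass.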
Optimization: shader performance
Precision
• Operations on 16-bit floating-point (mediump) values are 2x faster than on 32-bit (highp) values
− Recommend setting the default precision to mediump and promoting only values which require higher precision, e.g.:
    precision mediump float; // Set default precision in FS to fp16
    out vec2 vSmallTexCoord; // Uses mediump
    out highp vec2 vLargeTexCoord; // Uses highp
Scalar architecture
• Adreno 3xx and 4xx GPUs utilize a scalar architecture
• Avoid using components that aren’t needed for the final result
• Wherever possible, re-order operations to execute on as few components as possible
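The scalar-architecture advice on this slide can be illustrated by counting scalar multiplies for two orderings of the same expression. This is a CPU-side sketch in C++ rather than shader code, and the `Vec4` type and the operation counter are illustrative assumptions: on a scalar GPU a vec4 times a float costs four scalar multiplies, so `(v * a) * b` costs eight while `v * (a * b)` costs five.

```cpp
#include <cassert>

struct Vec4 { float x, y, z, w; };

// One scalar multiply; `ops` counts how many were issued.
inline float mul(float a, float b, int& ops) { ++ops; return a * b; }

// vec4 * float: four scalar multiplies on a scalar architecture.
inline Vec4 scale(const Vec4& v, float s, int& ops) {
    return { mul(v.x, s, ops), mul(v.y, s, ops),
             mul(v.z, s, ops), mul(v.w, s, ops) };
}

// (v * a) * b: two vector scales, 8 scalar multiplies.
inline Vec4 scaleNaive(const Vec4& v, float a, float b, int& ops) {
    return scale(scale(v, a, ops), b, ops);
}

// v * (a * b): one scalar multiply plus one vector scale, 5 in total.
inline Vec4 scaleReordered(const Vec4& v, float a, float b, int& ops) {
    return scale(v, mul(a, b, ops), ops);
}

// Small self-check with values that are exact in floating point.
inline void checkReordering() {
    int naiveOps = 0;
    int reorderedOps = 0;
    Vec4 v{1.0f, 2.0f, 3.0f, 4.0f};
    Vec4 n = scaleNaive(v, 2.0f, 0.5f, naiveOps);
    Vec4 r = scaleReordered(v, 2.0f, 0.5f, reorderedOps);
    assert(naiveOps == 8 && reorderedOps == 5);
    assert(n.x == r.x && n.y == r.y && n.z == r.z && n.w == r.w);
}
```

In a shader the same idea amounts to grouping scalar factors with parentheses before applying them to a vector. Reordering can change rounding slightly for values that are not exactly representable.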
Optimization: tessellation
Tessellation allows for incredible levels of detail and can substantially reduce memory bandwidth and CPU cycles by allowing other game sub-systems to operate on low resolution representations of meshes, but …
• High levels of tessellation can generate sub-pixel triangles which cause poor rasterizer utilization
− It is very important to use distance, screen-space size, or other adaptive metrics to compute tessellation factors that avoid sub-pixel triangles
[Figure: full rasterizer utilization vs. partial rasterizer utilization]
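One way to apply a screen-space metric is to estimate how many pixels a patch edge covers and divide by a target segment size. The sketch below is written as CPU-side C++ for clarity; in practice the same arithmetic would run in the tessellation control shader. The helper names, the target of several pixels per segment, and the assumption that the edge is roughly parallel to the screen are all illustrative choices, not values from the presentation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Approximate on-screen length, in pixels, of an edge of `worldLength`
// units at `distance` units from the camera, under a perspective
// projection with vertical field of view `fovYRadians`. Assumes the edge
// is roughly parallel to the screen plane.
inline float projectedLengthPixels(float worldLength, float distance,
                                   float fovYRadians, float screenHeightPixels) {
    assert(distance > 0.0f && fovYRadians > 0.0f);
    float pixelsPerUnitAtUnitDistance =
        screenHeightPixels / (2.0f * std::tan(fovYRadians * 0.5f));
    return worldLength * pixelsPerUnitAtUnitDistance / distance;
}

// Edge tessellation factor chosen so that each generated segment covers
// about `targetPixelsPerSegment` pixels, clamped to [1, maxFactor].
inline float edgeTessFactor(float edgeLengthPixels,
                            float targetPixelsPerSegment, float maxFactor) {
    assert(targetPixelsPerSegment > 0.0f && maxFactor >= 1.0f);
    float factor = edgeLengthPixels / targetPixelsPerSegment;
    return std::min(std::max(factor, 1.0f), maxFactor);
}
```

Keeping the target segment size well above one pixel is what prevents sub-pixel triangles; distant or small patches fall back to a factor of 1.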
Optimization: tessellation
Culling
• Hardware back-face culling occurs after the tessellation stage, which potentially wastes GPU resources tessellating back-facing primitives
• Back-facing primitives can be identified in the TCS and culled by setting their edge tessellation factors to 0
− A slight “fudge” factor may be needed in this calculation if displacement mapping will be used in the TES as this technique may change the visibility of primitives
General
• Whenever possible disable the TCS and TES stages if the tessellation factor for the mesh would be ~1
− Eliminates the use of unnecessary GPU stages
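The culling test described under "Culling" can be sketched as a facing check with a bias, shown here in C++ so it can be run on the CPU; a tessellation control shader would perform the same math per patch. The `Vec3` helpers, the counter-clockwise winding convention, and the use of a cosine margin as the "fudge" factor are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 sub(const Vec3& a, const Vec3& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}
inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// True when the triangle patch (counter-clockwise front faces assumed)
// faces away from the eye by more than `bias`. The bias is a cosine margin
// in [0, 1]; a larger value keeps more near-silhouette patches alive, which
// leaves room for displacement mapping to change their visibility.
inline bool isBackFacing(const Vec3& p0, const Vec3& p1, const Vec3& p2,
                         const Vec3& eye, float bias) {
    assert(bias >= 0.0f && bias <= 1.0f);
    Vec3 normal = cross(sub(p1, p0), sub(p2, p0));
    Vec3 toEye = sub(eye, p0);
    float scale = std::sqrt(dot(normal, normal) * dot(toEye, toEye));
    if (scale == 0.0f) return false;  // degenerate case: leave it to later stages
    return dot(normal, toEye) / scale < -bias;
}

// Edge tessellation factor for the patch: 0 culls it, otherwise the
// caller's factor is passed through unchanged.
inline float culledEdgeFactor(const Vec3& p0, const Vec3& p1, const Vec3& p2,
                              const Vec3& eye, float bias, float visibleFactor) {
    return isBackFacing(p0, p1, p2, eye, bias) ? 0.0f : visibleFactor;
}
```

With a bias of 0 the test behaves like ordinary back-face culling; raising it trades some wasted tessellation for safety when displacement is used.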
Adreno tools
Graphics content development & tools
[Diagram: workflow stages: asset creation, compress/optimize, code, emulate, compile, deploy, analyze/debug]
Adreno SDK and Adreno Profiler are products of Qualcomm Technologies, Inc.
Adreno tools
Adreno SDK
• Support for OpenGL ES 3.1, 3.0 & 2.0, DirectX, and OpenCL
• Supported on Windows, Mac OS X, and Linux
• Comprehensive collection of utilities
• Over 100 samples and tutorials
• Thorough documentation
Adreno Profiler
• Comprehensive profiling tool
• Supported on Windows, Mac OS X, and Linux
• Enables detailed analysis of GPU utilization
• Proven effective and easy to use
• Works with commercial devices & apps
Available on developer.qualcomm.com
Adreno Profiler: introduction
[Screenshots: Grapher mode (real-time analysis) and Scrubber mode (detailed frame analysis). Callouts: API call stack, optimization suggestions, shader stats, shader editor, texture browser, detailed frame stats, overrides, metrics, frame emulation, scrubber metrics.]
Adreno Profiler demo
Reign of Amira™: available on Google Play
Reign of Amira is a product of Qualcomm Technologies, Inc.
Adreno SDK
• Desktop OpenGL ES emulator
− Now supporting OpenGL ES 3.1
• Over 100 samples and tutorials
− Simple tutorials to advanced demos
− Covers OpenGL ES 2.0 and 3.0, DirectX, and OpenCL
• Utilities and libraries
− Texture compression
− Mesh optimization
• Adreno texture tool
• Developer documentation
− Adreno Developer Guide
Shader samples
Animal materials (fur, elephant skin, fish scales, alligators, etc.)
General lighting (ambient, diffuse, specular, Blinn-Phong, parallax, etc.)
Human materials (skin, eye, etc.)
Other effects (environment mapping, warping, glass distortion, god rays, etc.)
Other materials (cloth, wood, plastic, marble, leather, metal, etc.)
Advanced rendering (toon shading, deferred lighting, eye adaptation, etc.)
Adreno SDK demo
Reign of Amira™: available on Google Play
Special thanks
For more information on Qualcomm, visit us at: www.qualcomm.com & www.qualcomm.com/blog
©2013-2014 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved. Qualcomm, Snapdragon, Adreno, Gobi, Hexagon, FlexRender and Reign of Amira are trademarks of Qualcomm Incorporated, registered in the United States and other countries. Krait and Uplinq are trademarks of Qualcomm Incorporated. All Qualcomm Incorporated trademarks are used with permission. Other products and brand names may be trademarks or registered trademarks of their respective owners. References in this presentation to “Qualcomm” may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within the Qualcomm corporate structure, as applicable. Qualcomm Incorporated includes Qualcomm’s licensing business, QTL, and the vast majority of its patent portfolio. Qualcomm Technologies, Inc., a wholly-owned subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of Qualcomm’s engineering, research and development functions, and substantially all of its product and services businesses, including its semiconductor business, QCT.
Thank you