Upload
others
View
7
Download
0
Embed Size (px)
Citation preview
48th International Symposium on Microarchitecture
Ultra-Low Power Render-Based Collision Detection
for CPU/GPU Systems
Enrique de Lucas Pedro Marcuello Joan-Manuel Parcerisa Antonio González
Enrique de Lucas 1. Motivation 2
Mobile devices market is growing fast
2008 2009 2010 2011 2012 2013 20140
200
400
600
800
1000
1200
1400
Mill
ion
s o
f un
it s
hip
pe
d
Market
PC
Smartphone
Enrique de Lucas 1. Motivation 3
Mobile Systems
● Users demand realistic and complex graphics like in laptops and desktops
– Battery life is about 4 hours for GFXBench 3.0!1
– Heat dissipation
● “CPU, GPU and screen are the dominant energy consumers on a smartphone”2
● Graphics animation applications are quite popular
– Collision Detection is an important task
1. With an ARM Mali 400MP GPU, www.gfxbench.com
2. Mittal et al., Empowering Developers to Estimate App Energy Consumption, Aug. 2012.
Enrique de Lucas 4
Outline
1. Motivation
2. Collision Detection (CD)
3. Render-Based CD in the GPU
4. Results
5. Conclusions
Enrique de Lucas 2. Collision Detection 5
Collision Detection (CD)Frame i Frame i+2
CD identifies the contact points between objects
Bounding Volume
Contact!
Enrique de Lucas 2. Collision Detection 6
Bounding Volumes
false area = false collisions
False Collisions!
Collision
Bounding Volume Cuboid Convex Hull
Accuracy Low Medium
Computing Cost Low High
Enrique de Lucas 2. Collision Detection 7
Image-Based CD (IBCD)
● CD performed at pixel granularity● No Bounding Volumes● Higher Accuracy● Computing/Energy Cost
● CPU: huge● Our technique (GPU): tiny
Enrique de Lucas 2. Collision Detection 8
How does IBCD work?
Steps of Image-Based CD
Depth
Screen
P1
Enrique de Lucas 2. Collision Detection 9
How does IBCD work?
Depth
P1
ScreenPixels
Steps of Image-Based CD
Depth
Screen
P1
Enrique de Lucas 2. Collision Detection 10
How does IBCD work?
1) Project objects onto a plane (screen)
Depth
P1
ScreenPixels
Steps of Image-Based CD
Depth
Screen
P1
Enrique de Lucas 2. Collision Detection 11
How does IBCD work?
1) Project objects onto a plane (screen)
Depth
P1
ScreenPixels
Steps of Image-Based CD
Depth
Screen
P1
Enrique de Lucas 2. Collision Detection 12
How does IBCD work?
1) Project objects onto a plane (screen)
Depth
P1
ScreenPixels
Steps of Image-Based CD
Depth
Screen
P1
Enrique de Lucas 2. Collision Detection 13
How does IBCD work?
1) Project objects onto a plane (screen)
Depth
P1
ScreenPixels
Steps of Image-Based CD
Depth
Screen
P1
Enrique de Lucas 2. Collision Detection 14
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
Depth
P1
ScreenPixels
Steps of Image-Based CD
Depth
Screen
P1
Enrique de Lucas 2. Collision Detection 15
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
Depth
P1
ScreenPixels
Steps of Image-Based CD
Depth
Screen
P1
Enrique de Lucas 2. Collision Detection 16
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
Depth
P1
ScreenPixels
1
Steps of Image-Based CD
Depth
Screen
1
P1
Enrique de Lucas 2. Collision Detection 17
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
Depth
P1
ScreenPixels
1 7
Steps of Image-Based CD
Depth
Screen
1 7
P1
Enrique de Lucas 2. Collision Detection 18
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
Depth
P1
ScreenPixels
1 7 5
Steps of Image-Based CD
Depth
Screen
1 5 7
P1
Enrique de Lucas 2. Collision Detection 19
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
Depth
P1
ScreenPixels
1 7 5 2
Steps of Image-Based CD
Depth
Screen
1 2 5 7
P1
Enrique de Lucas 2. Collision Detection 20
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
Depth
P1
ScreenPixels
1 7 5 2 6
Steps of Image-Based CD
Depth
Screen
1 2 5 6 7
P1
Enrique de Lucas 2. Collision Detection 21
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
Depth
P1
ScreenPixels
1 7 5 2 6 8
Steps of Image-Based CD
Depth
Screen
1 2 5 6 7 8
P1
Enrique de Lucas 2. Collision Detection 22
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
Depth
P1
ScreenPixels
1 7 5 2 6 8 3
Steps of Image-Based CD
Depth
Screen
1 2 3 5 6 7 8
P1
Enrique de Lucas 2. Collision Detection 23
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
Depth
P1
ScreenPixels
1 7 5 2 6 8 3 4
Steps of Image-Based CD
Depth
Screen
1 2 3 4 5 6 7 8
P1
Enrique de Lucas 2. Collision Detection 24
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
3) Sort values by depth
Depth
P1
ScreenPixels
1 7 5 2 6 8 3 4
1 2 5 83 4Sort 76
Steps of Image-Based CD
Depth
Screen
1 2 3 4 5 6 7 8
P1
Enrique de Lucas 2. Collision Detection 25
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
3) Sort values by depth
4) Detect overlappingdepth-ranges
Depth
P1
ScreenPixels
1 7 5 2 6 8 3 4
1 2 5 83 4Sort 76
Steps of Image-Based CD
Depth-ranges
Depth
Screen
1 2 3 4 5 6 7 8
P1
Enrique de Lucas 2. Collision Detection 26
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
3) Sort values by depth
4) Detect overlappingdepth-ranges
Depth
P1
ScreenPixels
1 7 5 2 6 8 3 4
1 2 5 83 4Sort 76
Steps of Image-Based CD
Depth-ranges
Depth
Screen
1 2 3 4 5 6 7 8
P1
Enrique de Lucas 2. Collision Detection 27
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
3) Sort values by depth
4) Detect overlappingdepth-ranges
Depth
P1
ScreenPixels
1 7 5 2 6 8 3 4
1 2 5 83 4Sort 76
Steps of Image-Based CD
Depth-ranges
Depth
Screen
1 2 3 4 5 6 7 8
P1
Enrique de Lucas 2. Collision Detection 28
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
3) Sort values by depth
4) Detect overlappingdepth-ranges
Depth
P1
ScreenPixels
1 7 5 2 6 8 3 4
1 2 5 83 4Sort 76
Steps of Image-Based CD
Depth-ranges
Depth
Screen
1 2 3 4 5 6 7 8
P1
Enrique de Lucas 2. Collision Detection 29
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
3) Sort values by depth
4) Detect overlappingdepth-ranges
Depth
P1
ScreenPixels
1 7 5 2 6 8 3 4
1 2 5 83 4Sort 76
Steps of Image-Based CD
Depth-ranges
Depth
Screen
1 2 3 4 5 6 7 8
P1
Enrique de Lucas 2. Collision Detection 30
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
3) Sort values by depth
4) Detect overlappingdepth-ranges
Depth
P1
ScreenPixels
1 7 5 2 6 8
Overlap!
3 4
1 2 5 83 4Sort 76
Steps of Image-Based CDCollision!
Depth-ranges
Depth
Screen
1 2 3 4 5 6 7 8
P1
Enrique de Lucas 2. Collision Detection 31
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
3) Sort values by depth
4) Detect overlappingdepth-ranges
Depth
P1
ScreenPixels
1 7 5 2 6 8
Overlap!
3 4
1 2 5 83 4Sort 76
Steps of Image-Based CDCollision!
Depth-ranges
Depth
Screen
1 2 3 4 5 6 7 8
P1
Enrique de Lucas 2. Collision Detection 32
List for P1
How does IBCD work?
1) Project objects onto a plane (screen)
2) Rasterize surface of theobjects and store depthsin a list
3) Sort values by depth
4) Detect overlappingdepth-ranges
Depth
P1
ScreenPixels
1 7 5 2 6 8
Overlap!
3 4
1 2 5 83 4Sort 76
Steps of Image-Based CDCollision!
Already done for image rendering!
Depth-ranges
Depth
Screen
1 2 3 4 5 6 7 8
P1
Enrique de Lucas 33
Outline
1. Motivation
2. Collision Detection (CD)
3. Render-Based CD in the GPU
4. Results
5. Conclusion
Enrique de Lucas 3. Render-Based CD in the GPU 34
Vertex
processor
Memory
controller
Rasterizer
Geometry Stage Raster Stage
Programmable Fixed-Function
CD Integration into GPU pipeline
Vertices Triangles Fragments
RBCDunit
CollisionPoints
GPU Commands
New Hardware
Color
Buffer
Fragment
processorFragment
processorFragment
processorFragment
processor
CollisionableFragments
Image
Graphics Pipeline from 10,000 feet
Memory
Enrique de Lucas 3. Render-Based CD in the GPU 35
RBCD unit
ZEB buffer
Insertion
Sort
Z-Overlap
Test
1) Store fragments sorted by depth2) Detects collision points
...
List 1 / Line 1 (32 B)
List 2 / Line 2 (32 B)
List 256 / Line 256 (32 B)
...
Enrique de Lucas 3. Render-Based CD in the GPU 36
RBCD unit
ZEB buffer
Insertion
Sort
Z-Overlap
Test
1) Store fragments sorted by depth2) Detects collision points
...
ZEB buffer (on-chip array)1 list of fragments per every pixel
16x16 lists, 8 entries per list8 KB (16x16x8x32B)
List 1 / Line 1 (32 B)
List 2 / Line 2 (32 B)
List 256 / Line 256 (32 B)
...
Enrique de Lucas 3. Render-Based CD in the GPU 37
RBCD unit
ZEB buffer
Insertion
Sort
Z-Overlap
TestSurface Points
(Fragments)
1) Store fragments sorted by depth2) Detects collision points
...
ZEB buffer (on-chip array)1 list of fragments per every pixel
16x16 lists, 8 entries per list8 KB (16x16x8x32B)
List 1 / Line 1 (32 B)
List 2 / Line 2 (32 B)
List 256 / Line 256 (32 B)
...
Enrique de Lucas 3. Render-Based CD in the GPU 38
RBCD unit
ZEB buffer
Insertion
Sort
Z-Overlap
TestSurface Points
(Fragments)
1) Store fragments sorted by depth2) Detects collision points
Insertion
Sort
...
ZEB buffer (on-chip array)1 list of fragments per every pixel
16x16 lists, 8 entries per list8 KB (16x16x8x32B)
List 1 / Line 1 (32 B)
List 2 / Line 2 (32 B)
List 256 / Line 256 (32 B)
...
Enrique de Lucas 3. Render-Based CD in the GPU 39
RBCD unit
ZEB buffer
Insertion
Sort
Z-Overlap
TestSurface Points
(Fragments)
1) Store fragments sorted by depth2) Detects collision points
Z-Overlap
Test
...
ZEB buffer (on-chip array)1 list of fragments per every pixel
16x16 lists, 8 entries per list8 KB (16x16x8x32B)
List 1 / Line 1 (32 B)
List 2 / Line 2 (32 B)
List 256 / Line 256 (32 B)
...
Enrique de Lucas 3. Render-Based CD in the GPU 40
RBCD unit
ZEB buffer
Insertion
Sort
Z-Overlap
TestSurface Points
(Fragments)Collision Points
1) Store fragments sorted by depth2) Detects collision points
Z-Overlap
Test
...
ZEB buffer (on-chip array)1 list of fragments per every pixel
16x16 lists, 8 entries per list8 KB (16x16x8x32B)
List 1 / Line 1 (32 B)
List 2 / Line 2 (32 B)
List 256 / Line 256 (32 B)
...
Enrique de Lucas 3. Render-Based CD in the GPU 41
RBCD unit
ZEB buffer
Insertion
Sort
Z-Overlap
TestSurface Points
(Fragments)Collision Points
1) Store fragments sorted by depth2) Detects collision points
...
ZEB buffer (on-chip array)1 list of fragments per every pixel
16x16 lists, 8 entries per list8 KB (16x16x8x32B)
List 1 / Line 1 (32 B)
List 2 / Line 2 (32 B)
List 256 / Line 256 (32 B)
...
Enrique de Lucas 3. Render-Based CD in the GPU 42
Insertion Sort
...ZEB
2 6
...List-Register
3 New
<...<
<
<
...
...
Muxes
Comparators Read one list
CompareShift
Write new list
Enrique de Lucas 3. Render-Based CD in the GPU 43
Insertion Sort
...ZEB
2 6
...2 6List-Register
3 New
<...<
<
<
...
...
Muxes
Comparators Read one list
CompareShift
Write new list
Enrique de Lucas 3. Render-Based CD in the GPU 44
Insertion Sort
...ZEB
2 6
...2 6List-Register
3 New
<...<
<
<
...
...
Muxes
Comparators Read one list
CompareShift
632
Enrique de Lucas 3. Render-Based CD in the GPU 45
Insertion Sort
...ZEB
2 6
...2 6List-Register
3 New
<...<
<
<
...
...
Muxes
Comparators Read one list
CompareShift
Write new list
632
63
Enrique de Lucas 3. Render-Based CD in the GPU 46
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 47
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 48
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 49
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
Is FrontFace?
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 50
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Push
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 51
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 52
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
Is FrontFace?
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 53
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1
0
0
0
1Collision
PairGenerator
FragmentsAbove Front-Face?
SearchFront-face
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 54
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1
0
0
0
1
0
0
0
0Collision
PairGenerator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes NoIs any fragmentabove it?
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 55
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes NoIs any fragmentabove it?
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 56
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
Is FrontFace?
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 57
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Push
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 58
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 59
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
Is FrontFace?
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 60
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Push
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 61
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 62
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
Is FrontFace?
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 63
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1
0
0
1
0Collision
PairGenerator
FragmentsAbove Front-Face?
SearchFront-face
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 64
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1
0
0
1
0Collision
PairGenerator
FragmentsAbove Front-Face?
SearchFront-face
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes NoIs any fragmentabove it?
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 65
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1
0
0
1
0
0
1
0
0Collision
PairGenerator
FragmentsAbove Front-Face?
SearchFront-face
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes NoIs any fragmentabove it?
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 66
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1
0
0
1
0
0
1
0
0Collision
PairGenerator
FragmentsAbove Front-Face?
SearchFront-face
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 67
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1
0
0
1
0
0
1
0
0Collision
PairGenerator
FragmentsAbove Front-Face?
<red, blue>
SearchFront-face
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 68
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1
0
0
1
0
0
1
0
0Collision
PairGenerator
FragmentsAbove Front-Face?
SearchFront-face
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 69
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 70
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
Is FrontFace?
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 71
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
SearchFront-face
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 72
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1
0
1
0
0Collision
PairGenerator
FragmentsAbove Front-Face?
SearchFront-face
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 73
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1
0
1
0
0Collision
PairGenerator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes NoIs any fragmentabove it?
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 74
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1
0
1
0
0
0
0
0
0Collision
PairGenerator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes NoIs any fragmentabove it?
SearchFront-Face
Enrique de Lucas 3. Render-Based CD in the GPU 75
Z-Overlap TestZEB
List-Register
Stack
...
=
=
=
...
Current
=
CurrentElement
Is FrontFace?
Push
Yes No
SearchFront-face
Hit0
Hit3
Hit2
...
Hit1 CollisionPair
Generator
FragmentsAbove Front-Face?
Is any fragmentabove it?
Collision!!!Generate
Collision Pairs
Yes No
SearchFront-Face
Enrique de Lucas 76
Outline
1. Motivation
2. Collision Detection (CD)
3. Render-Based CD in the GPU
4. Results
5. Conclusion
Enrique de Lucas 4. Results 77
Evaluation Methodology● TEAPOT simulation infrastructure
– Android and OpenGL ES– GPU timing simulator models:
● Tile-Based Rendering architecture (ARM Mali 400MP-like)
– GPU power model based on McPAT● Extended with RBCD unit
● Marss cycle accurate full system simulator● Workloads: 4 unmodified Android commercial games
Benchmark Type Donwnloads
Captain America beat'em up 10-50 M
Crazy Snowboard snowboard arcade 5-10 M
Sleepy Jack action 100-500 K
Temple Run adventure arcade 100-500 M
Enrique de Lucas 4. Results 78
Performance and Energy of CD
RBCD vs CD on a CPU
● 3400x speedup and 2875x energy reduction on average
– RBCD reuses rendering results
capcrazy
sleepytemple
geo.mean
0
1000
2000
3000
4000
5000
6000
Sp
ee
dup
capcrazy
sleepytemple
geo.mean
0
1000
2000
3000
4000
5000
6000
Ene
rgy
Re
duc
tion
Enrique de Lucas 4. Results 79
GPU Overhead
Row 1 Row 2 Row 3 Row 40
2
4
6
8
10
12
Column 1
Column 2
Column 3
capcrazy
sleepytemple
geo.mean
0
0.2
0.4
0.6
0.8
1
No
rmal
ize
d E
nerg
y
capcrazy
sleepytemple
geo.mean
0
0.2
0.4
0.6
0.8
1
No
rmal
ize
d T
ime
● Energy: 3.5%● Time overhead: 3%
● Area overhead: < 1%
Enrique de Lucas 4. Results 80
GPU Overhead: Face Culling
cap crazy sleepy temple geo.mean
Triangles Fragments
Nor
mal
ized
Rasterize 18% more triangles, 6% more fragments
FaceCulling
Rasterization
Fragments
FragmentProcessing
Common Face Culling:
FaceCulling
RasterizationFragments
FragmentProcessing
Fragments
CD
Deferred Face Culling:
Enrique de Lucas 5. Conclusions 81
Conclusions
● Energy budget limits the quality and realism of CD in mobile graphics animation applications
● Most of the computation required by Image-Based CD is already done in image rendering
● RBCD provides
– 2875x energy reduction– 3400x speedup– Pixel level accuracy
– Small overheads: time (3%), energy (3.5%) and area (1%)
● RBCD is a low-energy yet high-fidelity CD solution
48th International Symposium on Microarchitecture
Ultra-Low Power Render-Based Collision Detection
for CPU/GPU Systems
Enrique de Lucas Pedro Marcuello Joan-Manuel Parcerisa Antonio González