82
48 th International Symposium on Microarchitecture Ultra-Low Power Render-Based Collision Detection for CPU/GPU Systems Enrique de Lucas Pedro Marcuello Joan-Manuel Parcerisa Antonio González

Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

48th International Symposium on Microarchitecture

Ultra-Low Power Render-Based Collision Detection

for CPU/GPU Systems

Enrique de Lucas Pedro Marcuello Joan-Manuel Parcerisa Antonio González

Page 2: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 1. Motivation 2

Mobile devices market is growing fast

2008 2009 2010 2011 2012 2013 20140

200

400

600

800

1000

1200

1400

Mill

ion

s o

f un

it s

hip

pe

d

Market

PC

Smartphone

Page 3: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 1. Motivation 3

Mobile Systems

● Users demand realistic and complex graphics like in laptops and desktops

– Battery life is about 4 hours for GFXBench 3.0!1

– Heat dissipation

● “CPU, GPU and screen are the dominant energy consumers on a smartphone”2

● Graphics animation applications are quite popular

– Collision Detection is an important task

1. With an ARM Mali 400MP GPU, www.gfxbench.com

2. Mittal et al., Empowering Developers to Estimate App Energy Consumption, Aug. 2012.

Page 4: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 4

Outline

1. Motivation

2. Collision Detection (CD)

3. Render-Based CD in the GPU

4. Results

5. Conclusions

Page 5: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 5

Collision Detection (CD)Frame i Frame i+2

CD identifies the contact points between objects

Bounding Volume

Contact!

Page 6: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 6

Bounding Volumes

false area = false collisions

False Collisions!

Collision

Bounding Volume Cuboid Convex Hull

Accuracy Low Medium

Computing Cost Low High

Page 7: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 7

Image-Based CD (IBCD)

● CD performed at pixel granularity● No Bounding Volumes● Higher Accuracy● Computing/Energy Cost

● CPU: huge● Our technique (GPU): tiny

Page 8: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 8

How does IBCD work?

Steps of Image-Based CD

Depth

Screen

P1

Page 9: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 9

How does IBCD work?

Depth

P1

ScreenPixels

Steps of Image-Based CD

Depth

Screen

P1

Page 10: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 10

How does IBCD work?

1) Project objects onto a plane (screen)

Depth

P1

ScreenPixels

Steps of Image-Based CD

Depth

Screen

P1

Page 11: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 11

How does IBCD work?

1) Project objects onto a plane (screen)

Depth

P1

ScreenPixels

Steps of Image-Based CD

Depth

Screen

P1

Page 12: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 12

How does IBCD work?

1) Project objects onto a plane (screen)

Depth

P1

ScreenPixels

Steps of Image-Based CD

Depth

Screen

P1

Page 13: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 13

How does IBCD work?

1) Project objects onto a plane (screen)

Depth

P1

ScreenPixels

Steps of Image-Based CD

Depth

Screen

P1

Page 14: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 14

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

Depth

P1

ScreenPixels

Steps of Image-Based CD

Depth

Screen

P1

Page 15: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 15

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

Depth

P1

ScreenPixels

Steps of Image-Based CD

Depth

Screen

P1

Page 16: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 16

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

Depth

P1

ScreenPixels

1

Steps of Image-Based CD

Depth

Screen

1

P1

Page 17: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 17

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

Depth

P1

ScreenPixels

1 7

Steps of Image-Based CD

Depth

Screen

1 7

P1

Page 18: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 18

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

Depth

P1

ScreenPixels

1 7 5

Steps of Image-Based CD

Depth

Screen

1 5 7

P1

Page 19: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 19

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

Depth

P1

ScreenPixels

1 7 5 2

Steps of Image-Based CD

Depth

Screen

1 2 5 7

P1

Page 20: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 20

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

Depth

P1

ScreenPixels

1 7 5 2 6

Steps of Image-Based CD

Depth

Screen

1 2 5 6 7

P1

Page 21: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 21

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

Depth

P1

ScreenPixels

1 7 5 2 6 8

Steps of Image-Based CD

Depth

Screen

1 2 5 6 7 8

P1

Page 22: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 22

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

Depth

P1

ScreenPixels

1 7 5 2 6 8 3

Steps of Image-Based CD

Depth

Screen

1 2 3 5 6 7 8

P1

Page 23: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 23

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

Depth

P1

ScreenPixels

1 7 5 2 6 8 3 4

Steps of Image-Based CD

Depth

Screen

1 2 3 4 5 6 7 8

P1

Page 24: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 24

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

3) Sort values by depth

Depth

P1

ScreenPixels

1 7 5 2 6 8 3 4

1 2 5 83 4Sort 76

Steps of Image-Based CD

Depth

Screen

1 2 3 4 5 6 7 8

P1

Page 25: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 25

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

3) Sort values by depth

4) Detect overlappingdepth-ranges

Depth

P1

ScreenPixels

1 7 5 2 6 8 3 4

1 2 5 83 4Sort 76

Steps of Image-Based CD

Depth-ranges

Depth

Screen

1 2 3 4 5 6 7 8

P1

Page 26: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 26

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

3) Sort values by depth

4) Detect overlappingdepth-ranges

Depth

P1

ScreenPixels

1 7 5 2 6 8 3 4

1 2 5 83 4Sort 76

Steps of Image-Based CD

Depth-ranges

Depth

Screen

1 2 3 4 5 6 7 8

P1

Page 27: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 27

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

3) Sort values by depth

4) Detect overlappingdepth-ranges

Depth

P1

ScreenPixels

1 7 5 2 6 8 3 4

1 2 5 83 4Sort 76

Steps of Image-Based CD

Depth-ranges

Depth

Screen

1 2 3 4 5 6 7 8

P1

Page 28: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 28

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

3) Sort values by depth

4) Detect overlappingdepth-ranges

Depth

P1

ScreenPixels

1 7 5 2 6 8 3 4

1 2 5 83 4Sort 76

Steps of Image-Based CD

Depth-ranges

Depth

Screen

1 2 3 4 5 6 7 8

P1

Page 29: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 29

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

3) Sort values by depth

4) Detect overlappingdepth-ranges

Depth

P1

ScreenPixels

1 7 5 2 6 8 3 4

1 2 5 83 4Sort 76

Steps of Image-Based CD

Depth-ranges

Depth

Screen

1 2 3 4 5 6 7 8

P1

Page 30: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 30

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

3) Sort values by depth

4) Detect overlappingdepth-ranges

Depth

P1

ScreenPixels

1 7 5 2 6 8

Overlap!

3 4

1 2 5 83 4Sort 76

Steps of Image-Based CDCollision!

Depth-ranges

Depth

Screen

1 2 3 4 5 6 7 8

P1

Page 31: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 31

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

3) Sort values by depth

4) Detect overlappingdepth-ranges

Depth

P1

ScreenPixels

1 7 5 2 6 8

Overlap!

3 4

1 2 5 83 4Sort 76

Steps of Image-Based CDCollision!

Depth-ranges

Depth

Screen

1 2 3 4 5 6 7 8

P1

Page 32: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 2. Collision Detection 32

List for P1

How does IBCD work?

1) Project objects onto a plane (screen)

2) Rasterize surface of theobjects and store depthsin a list

3) Sort values by depth

4) Detect overlappingdepth-ranges

Depth

P1

ScreenPixels

1 7 5 2 6 8

Overlap!

3 4

1 2 5 83 4Sort 76

Steps of Image-Based CDCollision!

Already done for image rendering!

Depth-ranges

Depth

Screen

1 2 3 4 5 6 7 8

P1

Page 33: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 33

Outline

1. Motivation

2. Collision Detection (CD)

3. Render-Based CD in the GPU

4. Results

5. Conclusion

Page 34: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 34

Vertex

processor

Memory

controller

Rasterizer

Geometry Stage Raster Stage

Programmable Fixed-Function

CD Integration into GPU pipeline

Vertices Triangles Fragments

RBCDunit

CollisionPoints

GPU Commands

New Hardware

Color

Buffer

Fragment

processorFragment

processorFragment

processorFragment

processor

CollisionableFragments

Image

Graphics Pipeline from 10,000 feet

Memory

Page 35: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 35

RBCD unit

ZEB buffer

Insertion

Sort

Z-Overlap

Test

1) Store fragments sorted by depth2) Detects collision points

...

List 1 / Line 1 (32 B)

List 2 / Line 2 (32 B)

List 256 / Line 256 (32 B)

...

Page 36: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 36

RBCD unit

ZEB buffer

Insertion

Sort

Z-Overlap

Test

1) Store fragments sorted by depth2) Detects collision points

...

ZEB buffer (on-chip array)1 list of fragments per every pixel

16x16 lists, 8 entries per list8 KB (16x16x8x32B)

List 1 / Line 1 (32 B)

List 2 / Line 2 (32 B)

List 256 / Line 256 (32 B)

...

Page 37: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 37

RBCD unit

ZEB buffer

Insertion

Sort

Z-Overlap

TestSurface Points

(Fragments)

1) Store fragments sorted by depth2) Detects collision points

...

ZEB buffer (on-chip array)1 list of fragments per every pixel

16x16 lists, 8 entries per list8 KB (16x16x8x32B)

List 1 / Line 1 (32 B)

List 2 / Line 2 (32 B)

List 256 / Line 256 (32 B)

...

Page 38: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 38

RBCD unit

ZEB buffer

Insertion

Sort

Z-Overlap

TestSurface Points

(Fragments)

1) Store fragments sorted by depth2) Detects collision points

Insertion

Sort

...

ZEB buffer (on-chip array)1 list of fragments per every pixel

16x16 lists, 8 entries per list8 KB (16x16x8x32B)

List 1 / Line 1 (32 B)

List 2 / Line 2 (32 B)

List 256 / Line 256 (32 B)

...

Page 39: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 39

RBCD unit

ZEB buffer

Insertion

Sort

Z-Overlap

TestSurface Points

(Fragments)

1) Store fragments sorted by depth2) Detects collision points

Z-Overlap

Test

...

ZEB buffer (on-chip array)1 list of fragments per every pixel

16x16 lists, 8 entries per list8 KB (16x16x8x32B)

List 1 / Line 1 (32 B)

List 2 / Line 2 (32 B)

List 256 / Line 256 (32 B)

...

Page 40: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 40

RBCD unit

ZEB buffer

Insertion

Sort

Z-Overlap

TestSurface Points

(Fragments)Collision Points

1) Store fragments sorted by depth2) Detects collision points

Z-Overlap

Test

...

ZEB buffer (on-chip array)1 list of fragments per every pixel

16x16 lists, 8 entries per list8 KB (16x16x8x32B)

List 1 / Line 1 (32 B)

List 2 / Line 2 (32 B)

List 256 / Line 256 (32 B)

...

Page 41: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 41

RBCD unit

ZEB buffer

Insertion

Sort

Z-Overlap

TestSurface Points

(Fragments)Collision Points

1) Store fragments sorted by depth2) Detects collision points

...

ZEB buffer (on-chip array)1 list of fragments per every pixel

16x16 lists, 8 entries per list8 KB (16x16x8x32B)

List 1 / Line 1 (32 B)

List 2 / Line 2 (32 B)

List 256 / Line 256 (32 B)

...

Page 42: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 42

Insertion Sort

...ZEB

2 6

...List-Register

3 New

<...<

<

<

...

...

Muxes

Comparators Read one list

CompareShift

Write new list

Page 43: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 43

Insertion Sort

...ZEB

2 6

...2 6List-Register

3 New

<...<

<

<

...

...

Muxes

Comparators Read one list

CompareShift

Write new list

Page 44: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 44

Insertion Sort

...ZEB

2 6

...2 6List-Register

3 New

<...<

<

<

...

...

Muxes

Comparators Read one list

CompareShift

632

Page 45: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 45

Insertion Sort

...ZEB

2 6

...2 6List-Register

3 New

<...<

<

<

...

...

Muxes

Comparators Read one list

CompareShift

Write new list

632

63

Page 46: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 46

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 47: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 47

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 48: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 48

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 49: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 49

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

Is FrontFace?

SearchFront-Face

Page 50: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 50

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Push

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 51: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 51

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 52: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 52

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

Is FrontFace?

SearchFront-Face

Page 53: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 53

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1

0

0

0

1Collision

PairGenerator

FragmentsAbove Front-Face?

SearchFront-face

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 54: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 54

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1

0

0

0

1

0

0

0

0Collision

PairGenerator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes NoIs any fragmentabove it?

SearchFront-Face

Page 55: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 55

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes NoIs any fragmentabove it?

SearchFront-Face

Page 56: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 56

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

Is FrontFace?

SearchFront-Face

Page 57: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 57

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Push

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 58: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 58

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 59: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 59

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

Is FrontFace?

SearchFront-Face

Page 60: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 60

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Push

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 61: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 61

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 62: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 62

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

Is FrontFace?

SearchFront-Face

Page 63: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 63

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1

0

0

1

0Collision

PairGenerator

FragmentsAbove Front-Face?

SearchFront-face

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 64: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 64

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1

0

0

1

0Collision

PairGenerator

FragmentsAbove Front-Face?

SearchFront-face

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes NoIs any fragmentabove it?

SearchFront-Face

Page 65: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 65

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1

0

0

1

0

0

1

0

0Collision

PairGenerator

FragmentsAbove Front-Face?

SearchFront-face

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes NoIs any fragmentabove it?

SearchFront-Face

Page 66: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 66

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1

0

0

1

0

0

1

0

0Collision

PairGenerator

FragmentsAbove Front-Face?

SearchFront-face

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 67: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 67

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1

0

0

1

0

0

1

0

0Collision

PairGenerator

FragmentsAbove Front-Face?

<red, blue>

SearchFront-face

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 68: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 68

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1

0

0

1

0

0

1

0

0Collision

PairGenerator

FragmentsAbove Front-Face?

SearchFront-face

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 69: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 69

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 70: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 70

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

Is FrontFace?

SearchFront-Face

Page 71: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 71

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

SearchFront-face

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 72: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 72

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1

0

1

0

0Collision

PairGenerator

FragmentsAbove Front-Face?

SearchFront-face

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 73: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 73

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1

0

1

0

0Collision

PairGenerator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes NoIs any fragmentabove it?

SearchFront-Face

Page 74: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 74

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1

0

1

0

0

0

0

0

0Collision

PairGenerator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes NoIs any fragmentabove it?

SearchFront-Face

Page 75: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 3. Render-Based CD in the GPU 75

Z-Overlap TestZEB

List-Register

Stack

...

=

=

=

...

Current

=

CurrentElement

Is FrontFace?

Push

Yes No

SearchFront-face

Hit0

Hit3

Hit2

...

Hit1 CollisionPair

Generator

FragmentsAbove Front-Face?

Is any fragmentabove it?

Collision!!!Generate

Collision Pairs

Yes No

SearchFront-Face

Page 76: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 76

Outline

1. Motivation

2. Collision Detection (CD)

3. Render-Based CD in the GPU

4. Results

5. Conclusion

Page 77: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 4. Results 77

Evaluation Methodology● TEAPOT simulation infrastructure

– Android and OpenGL ES– GPU timing simulator models:

● Tile-Based Rendering architecture (ARM Mali 400MP-like)

– GPU power model based on McPAT● Extended with RBCD unit

● Marss cycle accurate full system simulator● Workloads: 4 unmodified Android commercial games

Benchmark Type Donwnloads

Captain America beat'em up 10-50 M

Crazy Snowboard snowboard arcade 5-10 M

Sleepy Jack action 100-500 K

Temple Run adventure arcade 100-500 M

Page 78: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 4. Results 78

Performance and Energy of CD

RBCD vs CD on a CPU

● 3400x speedup and 2875x energy reduction on average

– RBCD reuses rendering results

capcrazy

sleepytemple

geo.mean

0

1000

2000

3000

4000

5000

6000

Sp

ee

dup

capcrazy

sleepytemple

geo.mean

0

1000

2000

3000

4000

5000

6000

Ene

rgy

Re

duc

tion

Page 79: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 4. Results 79

GPU Overhead

Row 1 Row 2 Row 3 Row 40

2

4

6

8

10

12

Column 1

Column 2

Column 3

capcrazy

sleepytemple

geo.mean

0

0.2

0.4

0.6

0.8

1

No

rmal

ize

d E

nerg

y

capcrazy

sleepytemple

geo.mean

0

0.2

0.4

0.6

0.8

1

No

rmal

ize

d T

ime

● Energy: 3.5%● Time overhead: 3%

● Area overhead: < 1%

Page 80: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 4. Results 80

GPU Overhead: Face Culling

cap crazy sleepy temple geo.mean

Triangles Fragments

Nor

mal

ized

Rasterize 18% more triangles, 6% more fragments

FaceCulling

Rasterization

Fragments

FragmentProcessing

Common Face Culling:

FaceCulling

RasterizationFragments

FragmentProcessing

Fragments

CD

Deferred Face Culling:

Page 81: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

Enrique de Lucas 5. Conclusions 81

Conclusions

● Energy budget limits the quality and realism of CD in mobile graphics animation applications

● Most of the computation required by Image-Based CD is already done in image rendering

● RBCD provides

– 2875x energy reduction– 3400x speedup– Pixel level accuracy

– Small overheads: time (3%), energy (3.5%) and area (1%)

● RBCD is a low-energy yet high-fidelity CD solution

Page 82: Ultra-Low Power Render-Based Collision Detection for CPU ... · “CPU, GPU and screen are the dominant energy consumers on a smartphone”2 Graphics animation applications are quite

48th International Symposium on Microarchitecture

Ultra-Low Power Render-Based Collision Detection

for CPU/GPU Systems

Enrique de Lucas Pedro Marcuello Joan-Manuel Parcerisa Antonio González