An Epsilon Range Join in a graphics processing unit Project work of Timo Proescholdt


Page 1: An Epsilon Range Join in a graphics processing unit

An Epsilon Range Join in a graphics processing unit

Project work of Timo Proescholdt

Page 2: An Epsilon Range Join in a graphics processing unit

Motivation

• Graphics processing units are becoming increasingly powerful

• Can we exploit this immense computing power to accelerate general purpose algorithms?

• Single instruction, multiple data concept

• Both Nvidia and ATI offer languages for writing shader programs

Page 3: An Epsilon Range Join in a graphics processing unit

Project definition

“Comparison of two implementations of an epsilon range join: one in plain C++, the other implemented in a shader language”

Page 4: An Epsilon Range Join in a graphics processing unit

Epsilon Range Join?

for i in 0..Dataset.size
    for j in i+1..Dataset.size
        if Distance(j,i) < Epsilon
            addResult(i,j)
        end
    end
end

[Figure: join matrix with axes i and j]
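The nested loop above can be sketched in plain C++ roughly as follows. This is a minimal illustration, not the project's actual code: the function name `epsilonRangeJoin`, the generic `Record` type and the caller-supplied distance function `dist` are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Nested-loop epsilon range self-join (illustrative sketch).
// Reports every index pair (i, j) with i < j whose distance is below epsilon.
// `dist` is any callable taking two records and returning their distance.
template <typename Record, typename DistanceFn>
std::vector<std::pair<std::size_t, std::size_t>>
epsilonRangeJoin(const std::vector<Record>& dataset, double epsilon, DistanceFn dist) {
    assert(epsilon >= 0.0);  // a negative radius can never match anything
    std::vector<std::pair<std::size_t, std::size_t>> result;
    for (std::size_t i = 0; i < dataset.size(); ++i) {
        // Start at i + 1 so each unordered pair is tested exactly once.
        for (std::size_t j = i + 1; j < dataset.size(); ++j) {
            if (dist(dataset[j], dataset[i]) < epsilon) {
                result.emplace_back(i, j);
            }
        }
    }
    return result;
}
```

Starting the inner loop at `i + 1` mirrors the pseudocode: the distance is symmetric, so only one half of the join matrix has to be visited.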

Page 5: An Epsilon Range Join in a graphics processing unit

Steps undertaken

• Plain C++ implementation

• Selection of a shader language (Brook)

– Framework rather than language

– Cg based

– Works with ATI and NVIDIA

– Almost plain C programming

• Identifying math-intensive and parallel components and moving them to GPU kernel functions

• Only computation-intensive tasks run on the GPU; control remains on the CPU

Page 6: An Epsilon Range Join in a graphics processing unit

The GPU as workhorse

• The most computationally intensive task is the calculation of the Euclidean distance

• N*N/2 - N = N(N/2 - 1) = 208,059,600 invocations (for the demo data, N = 20,400)

• Highly parallel: each distance is independent of all other results

• Implemented a kernel function which calculates the Euclidean distance between two given records
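As a rough CPU-side illustration of what such a kernel computes, the sketch below shows a Euclidean distance over records stored as flat float arrays, together with the invocation count from the slide. The names `euclideanDistance` and `distanceInvocations` and the flat-array layout are assumptions for the example; the project's Brook kernel is not reproduced here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Invocation count as given on the slide: N*N/2 - N.
constexpr std::uint64_t distanceInvocations(std::uint64_t n) {
    return n * n / 2 - n;
}
static_assert(distanceInvocations(20400) == 208059600ULL,
              "matches the figure quoted for the demo data");

// Euclidean distance between two records, each an array of `dims` floats
// (assumed layout). Every call is independent of every other call, which
// is what makes the task suitable for parallel execution.
float euclideanDistance(const float* a, const float* b, std::size_t dims) {
    assert(a != nullptr && b != nullptr);
    float sum = 0.0f;
    for (std::size_t d = 0; d < dims; ++d) {
        const float diff = a[d] - b[d];
        sum += diff * diff;
    }
    return std::sqrt(sum);
}
```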

Page 7: An Epsilon Range Join in a graphics processing unit

How to invoke the kernel function 208,059,600 times?

• Call the kernel function with all the necessary data and an iterator, stating the number of invocations

• Data is uploaded into the GPU memory

• Function is executed in parallel

• The iterator argument holds the index of the current invocation as its value

Page 8: An Epsilon Range Join in a graphics processing unit

Problem: a kernel function can only be invoked ~4 million times (and texture memory is limited to 2048x2048 textures)

Solution:

• Split the whole data space into chunks (of size 2040)

• Kernel function joins two of these chunks (2040^2 ~= 4 million)

• CPU control function invokes the kernel function for each chunk pair and assembles the total result from the partial results

[Figure: N x N join matrix with axes i and j, divided into chunks of 2040 x 2040]
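The chunking scheme can be sketched on the CPU as below. Everything here is an assumption made for illustration: `joinChunks` is a plain C++ stand-in for the GPU kernel, records are single floats to keep the example short, and the handling of a shorter final chunk goes beyond what the slides describe (the original requires input aligned to 2040).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Pair = std::pair<std::size_t, std::size_t>;

// CPU stand-in for the GPU kernel: returns the full rows x cols distance
// matrix for two chunks, regardless of how many entries will match.
std::vector<float> joinChunks(const float* chunkA, std::size_t rows,
                              const float* chunkB, std::size_t cols) {
    std::vector<float> dist(rows * cols);
    for (std::size_t it = 0; it < rows * cols; ++it) {
        dist[it] = std::fabs(chunkA[it / cols] - chunkB[it % cols]);
    }
    return dist;
}

// CPU control function: calls the kernel once per chunk pair and assembles
// the total result from the partial results.
std::vector<Pair> chunkedJoin(const std::vector<float>& data, float epsilon,
                              std::size_t chunkSize = 2040) {
    assert(chunkSize > 0);
    std::vector<Pair> result;
    const std::size_t n = data.size();
    for (std::size_t a = 0; a < n; a += chunkSize) {
        const std::size_t rows = std::min(chunkSize, n - a);
        // Begin at b = a: chunk pairs below the diagonal would only repeat work.
        for (std::size_t b = a; b < n; b += chunkSize) {
            const std::size_t cols = std::min(chunkSize, n - b);
            const std::vector<float> dist =
                joinChunks(data.data() + a, rows, data.data() + b, cols);
            for (std::size_t r = 0; r < rows; ++r) {
                for (std::size_t c = 0; c < cols; ++c) {
                    const std::size_t i = a + r;
                    const std::size_t j = b + c;
                    if (i < j && dist[r * cols + c] < epsilon) {
                        result.emplace_back(i, j);
                    }
                }
            }
        }
    }
    return result;
}
```

With a chunk size of 2040, each kernel call covers at most 2040^2 = 4,161,600 distance computations, which stays within the roughly 4 million invocation limit.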

Page 9: An Epsilon Range Join in a graphics processing unit

How to invoke the kernel function 208,059,600 times?

• Data1 and Data2 contain the chunks to be joined

• Entry point for Data1 is calculated from iterator ( iterator / Data1.size )

• Entry point for Data2 is calculated from iterator ( iterator mod Data2.size )

• Calculate distance and write it to result

kernel void workhorse(iterator, data1, data2, .. , result)
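The index arithmetic above can be illustrated with a CPU emulation of a single kernel invocation. This is a sketch under assumptions, not the Brook kernel itself: the parameter list is hypothetical (the slide elides part of the original signature), records are stored as flat float vectors with `dims` attributes each, and `runAllInvocations` is an invented helper that stands in for the GPU executing all invocations.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One emulated kernel invocation: derives both entry points from the
// iterator value and writes a single distance into `result`.
void workhorse(std::size_t iterator,
               const std::vector<float>& data1,
               const std::vector<float>& data2,
               std::size_t dims,
               std::vector<float>& result) {
    const std::size_t size2 = data2.size() / dims;  // records in data2
    assert(iterator < result.size());
    const std::size_t entry1 = iterator / size2;    // record index in data1
    const std::size_t entry2 = iterator % size2;    // record index in data2
    float sum = 0.0f;
    for (std::size_t d = 0; d < dims; ++d) {
        const float diff = data1[entry1 * dims + d] - data2[entry2 * dims + d];
        sum += diff * diff;
    }
    result[iterator] = std::sqrt(sum);
}

// Sequential stand-in for the parallel execution: one call per iterator value.
std::vector<float> runAllInvocations(const std::vector<float>& data1,
                                     const std::vector<float>& data2,
                                     std::size_t dims) {
    assert(dims > 0 && data1.size() % dims == 0 && data2.size() % dims == 0);
    const std::size_t size1 = data1.size() / dims;
    const std::size_t size2 = data2.size() / dims;
    std::vector<float> result(size1 * size2);
    for (std::size_t it = 0; it < result.size(); ++it) {
        workhorse(it, data1, data2, dims, result);
    }
    return result;
}
```

When both chunks have the same size, as with the 2040-record chunks, dividing by either chunk's size gives the same entry point; the sketch divides by the size of `data2` so that it also holds for chunks of different sizes.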

Page 10: An Epsilon Range Join in a graphics processing unit

Results

• The GPU version of the algorithm outperforms the plain C++ version by a factor of 5

• Runtime is independent of the result

• Hardware: 3.4 GHz Pentium 4, 7800 GX

Page 11: An Epsilon Range Join in a graphics processing unit

Further work

• Kernel function returns an array of size chunksize^2, regardless of the actual size of the result set

• Native Cg version of the algorithm (the Brook runtime performs poorly)

• Package the algorithm into a DLL which can be linked against

• Make the algorithm work with input data whose size is not a multiple of 2040

Page 12: An Epsilon Range Join in a graphics processing unit

Thanks to..

• Peter Kunath

• Prof. Dr. Christian Boehm

• And you, for your patience!

Page 13: An Epsilon Range Join in a graphics processing unit

Questions?