: A CUDA-Powered In-Memory OLAP Server · 2009-10-14 · Usage of different GPU memory types for storage and processing of data cubes: GPU Aggregation Algorithm 10.03 31.24 5.00 12.01

Processor 1 ...Processor 2 Processor M

Device

GlobalMemory

ConstantMemory

SharedMemory

Multiprocessor 1

Query

Dimensioncompression

Temporary data forquery processing

...

Multiprocessor 2

Multiprocessor N

+GPU: A CUDA-Powered In-Memory OLAP ServerTobias Lauer

University of FreiburgGeorges-Köhler-Allee 05179110 Freiburg, Germany

[email protected]

Amitava DattaUniversity of Western Australia

35 Stirling HighwayPerth, WA 6009, [email protected]

Zurab KhadikovJedox AG

Bismarckallee 7a79098 Freiburg, Germany

[email protected]

Online Analytical Processing (OLAP)OLAP is a core technology in Business Intelligence and Corporate Performance Management, allowing users to navigate and explore corporate data (usually extracted from a data warehouse) and to roll up or drill down along different hierarchical levels. Also, updates to the data must be supported for planning and forecasting. Due to the highly interactive nature of OLAP analysis, query performance is a key issue.

Data Storage on GPUUsage of different GPU memory types for storage and processing of data cubes:

GPU Aggregation Algorithm

10.03

31.24

5.00

12.017.90

11.23

56.78

1.85

2.04

1.11

2.44

25.00

0.9736.90

4.92

43.75

Cells 0

1

0

0

1

11

0

1

0

1

0

0

0

01

1

0

1

1

1

23

4

4

5

5

6

6

6

66

7

8

scan

10.03

31.24

5.00

12.01

7.9011.23

56.78

1.85

2.04

1.11

2.44

25.00

0.97

36.90

4.9243.75

<

<

<

<

<

<

<

<

<

<

<

<

<

<

<

<

10.03

12.01

7.90

11.23

1.851.11

4.92

43.75

reduce 92.80

consolidatemarkFlags Flags

Cells

Filteredcells

Result

Multidimensional AggregationThe conceptual model central to OLAP is the Data Cube, which is a view of the data as cells in a multidimensional table (“cube”). Aggregation along dimensional hierarchies is a basic building block involved in most OLAP operations. The main problems to solve in order to compute aggegrates efficiently is the sparsity of data in high-dimensional spaces and the dimensional “explosion”, since the number of possible aggregates grows exponentially. Our approach aims at speeding up aggregations by using the massively parallel architecture of GPUs.

Performance Evaluation

Cube 1 Cube 2 Cube N

. . .

. . .

Dimension compression

Dimensioncompression

Split into pages

ScalabilityThe proposed ap-proach scales well to multiple GPUs. All cube data are distributed among all available cards, ensuring that the workload is always divided evenly be-tween the devices. At the end, the in-dividual results are aggregated by the main thread on the host.

About PaloJedox AG, headquartered in Freiburg (Germany) with offices in Great Britain and France, is one of the leading suppliers of Open-source Business Intelligence and Corporate Performance Management solutions in Europe. Jedox’ core product, Palo BI Suite, accommodates the entire range of BI requirements including planning, reporting and analysis.

The multidimensional Palo OLAP Server at the core of the Palo BI Suite integrates simply and easily existing MS-Excel solutions and optimizes planning, reporting and analysis.

Single compute-intensive aggregation

Compute-intensive view (multiple aggregations)

time in seconds

time in seconds

Current and future work Optimization of bulk queries

Two-stage algorithm Pre-filtering

Efficient updates to database Spreading updated aggregate values to base facts Allocation of new records in GPU memory

Computation of advanced business rules Seamless integration of CUDA algorithms into Palo

Documents

: A CUDA-Powered In-Memory OLAP Server · 2009-10-14 · Usage of different GPU memory types for storage and processing of data cubes: GPU Aggregation Algorithm 10.03 31.24 5.00 12.01