27
SQL+GPU+SSD=∞ Wasserschwein@Shinagawa

SQL+GPU+SSD=∞ (English)

Embed Size (px)

Citation preview

SQL+GPU+SSD=∞

Wasserschwein@Shinagawa

Self Introduction

▌Name: Wasserschwein@Shinagawa

▌PostgreSQL: 9Years (2006~)

▌Works: Security, FDW, etc...

▌Hobby: Mixture of heterogeneous technology with PostgreSQL

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 2

Very powerful computing capability

Very functional & well-used

database PG-Strom:

What I’m making

GPGPU

What’s PG-Strom – Brief overview

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 3

▌Core ideas

① GPU native code generation on the fly

② Asynchronous massive parallel execution

▌Advantages

Transparent acceleration with 100% query compatibility

Commodity H/W and less system integration cost

Parser

Planner

Executor

Custom- Scan/Join Interface

Query: SELECT * FROM l_tbl JOIN r_tbl on l_tbl.lid = r_tbl.rid;

PG-Strom

CUDA driver

nvrtc

DMA Data Transfer

CUDA Source code Massive

Parallel Execution

Supported Workload – Scan, Join, Aggregation

▌SELECT cat, AVG(x) FROM t0 NATURAL JOIN t1 [, ...] GROUP BY cat;

t0: 100M rows, t1~t10: 100K rows for each, all the data was preloaded.

▌Environment:

PostgreSQL v9.5beta1 + PG-Strom (22-Oct), CUDA 7.0 + RHEL6.6 (x86_64)

CPU: Xeon E5-2670v3, RAM: 384GB, GPU: NVIDIA TESLA K20c (2496cores)

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 4

0

50

100

150

200

250

300

PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom

2 3 4 5 6 7 8

Qu

ery

Res

po

nse

Tim

e [s

ec]

# of tables involved

Time consumption per component (PostgreSQL v9.5β vs PG-Strom)

Scan Join Aggregate Others

Next target is I/O acceleration – from TPC/DS results

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 5

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

Time consumption per workloads (PostgreSQL v9.5beta+PG-Strom)

Scan Join Aggregate Others

So, How to accelerate I/O stuff by GPU?

NOTICE

The story I like to introduce next is...

Just my Idea at this moment

......So, I’ll pay my efforts to implement

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 6

A rough x86_64 hardware architecture

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 7

GPU SSD

CPU + RAM CPU + RAM

PCI-E

SAS

Usual I/O bottleneck

Simplified diagram for introduction

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 8

GPU SSD

CPU + RAM

PCI-E

OK, it’s storage

NVM EXPRESS SSD

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 9

PCI-E direct SSD device – low latency and higher bandwidth

Samsung SSD 950 PRO

Intel SSD 750

HGST Ultrastar SN100

Intel SSD DC P3700

Data Flow in analytic queries

① Data load from storage to CPU/RAM

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 10

GPU SSD

CPU + RAM

PCI-E

Table

Data Flow in analytic queries

① Data load from storage to CPU/RAM

② Remove invisible rows (Select)

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 11

GPU SSD

CPU + RAM

PCI-E

Table

Data Flow in analytic queries

① Data load from storage to CPU/RAM

② Remove invisible rows (Select)

③ Remove unreferenced columns (Projection)

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 12

GPU SSD

CPU + RAM

PCI-E

Table

The job of CPU

Data Flow in analytic queries

① Data load from storage to CPU/RAM

② Remove invisible rows (Select)

③ Remove unreferenced columns (Projection)

④ Join with other tables (Join)

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 13

GPU SSD

CPU + RAM

PCI-E

Table

+

SSD-to-GPU Direct

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 14

Data transfer between SSD and GPU, bypass CPU/RAM

Also available on NVMe, not only Fusion-IO

Data Flow in analytic queries (1/3) – Basic

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 15

GPU SSD

CPU + RAM

PCI-E

Table

SSD-to-GPU Direct

Data Flow in analytic queries (1/3) – Basic

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 16

GPU SSD

CPU + RAM

PCI-E

Table

GPU

code g

enera

ted

from

SQ

L o

n t

he f

ly

SSD-to-GPU Direct

Data Flow in analytic queries (1/3) – Basic

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 17

GPU SSD

CPU + RAM

PCI-E

Table

GPU

code g

enera

ted

from

SQ

L o

n t

he f

ly

SSD-to-GPU Direct

Remove invisible rows according to the scan

qualifiers

Data Flow in analytic queries (1/3) – Basic

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 18

GPU SSD

CPU + RAM

PCI-E

Table

GPU

code g

enera

ted

from

SQ

L o

n t

he f

ly

SSD-to-GPU Direct

Only visible rows are moved to CPU+RAM

Remove invisible rows according to the scan

qualifiers

Data Flow in analytic queries (2/3) – Advanced

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 19

GPU SSD

CPU + RAM

PCI-E

Table

GPU

code g

enera

ted

from

SQ

L o

n t

he f

ly

SSD-to-GPU Direct

Data Flow in analytic queries (2/3) – Advanced

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 20

GPU SSD

CPU + RAM

PCI-E

Table

GPU

code g

enera

ted

from

SQ

L o

n t

he f

ly

SSD-to-GPU Direct

Remove invisible rows according to the scan

qualifiers

Data Flow in analytic queries (2/3) – Advanced

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 21

GPU SSD

CPU + RAM

PCI-E

Table

GPU

code g

enera

ted

from

SQ

L o

n t

he f

ly

SSD-to-GPU Direct

Only visible rows and referenced columns are

moved to CPU+RAM

Remove invisible rows according to the scan

qualifiers

Remove invisible rows and unreferenced

columns according to the scan qualifiers

and projection

Data Flow in analytic queries (3/3) – Ultimate

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 22

GPU SSD

CPU + RAM

PCI-E

GPU

code g

enera

ted

from

SQ

L o

n t

he f

ly

Inner

rela

tions

(JO

IN t

arg

et)

Table

Data Flow in analytic queries (3/3) – Ultimate

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 23

GPU SSD

CPU + RAM

PCI-E

Table

GPU

code g

enera

ted

from

SQ

L o

n t

he f

ly

SSD-to-GPU Direct

Inner

rela

tions

(JO

IN t

arg

et)

Data Flow in analytic queries (3/3) – Ultimate

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 24

GPU SSD

CPU + RAM

PCI-E

Table

GPU

code g

enera

ted

from

SQ

L o

n t

he f

ly

SSD-to-GPU Direct

Inner

rela

tions

(JO

IN t

arg

et)

Data Flow in analytic queries (3/3) – Ultimate

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 25

GPU SSD

CPU + RAM

PCI-E

Table

GPU

code g

enera

ted

from

SQ

L o

n t

he f

ly

SSD-to-GPU Direct

Tuples are already joined when it read

data from the storage

Inner

rela

tions

(JO

IN t

arg

et)

+

Generate joined tuples on GPU side

Primitive Technologies

▌NVIDIA GPUDirect enhancement on NVMe device driver

Interaction between NVMe and NVIDIA drivers are needed

▌Usage statistics of shared_buffers per relations

To avoid SSDGPU direct on relations that is already preloaded

▌Add new access mode to shared_buffers

Nobody can make the buffer dirty under the SSDGPU Direct transfer

We are welcome all the developer who join to PG-Strom project

PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 26

Coming Soon?