Upload
kohei-kaigai
View
1.101
Download
1
Embed Size (px)
Citation preview
Self Introduction
▌Name: Wasserschwein@Shinagawa
▌PostgreSQL: 9Years (2006~)
▌Works: Security, FDW, etc...
▌Hobby: Mixture of heterogeneous technology with PostgreSQL
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 2
Very powerful computing capability
Very functional & well-used
database PG-Strom:
What I’m making
GPGPU
What’s PG-Strom – Brief overview
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 3
▌Core ideas
① GPU native code generation on the fly
② Asynchronous massive parallel execution
▌Advantages
Transparent acceleration with 100% query compatibility
Commodity H/W and less system integration cost
Parser
Planner
Executor
Custom- Scan/Join Interface
Query: SELECT * FROM l_tbl JOIN r_tbl on l_tbl.lid = r_tbl.rid;
PG-Strom
CUDA driver
nvrtc
DMA Data Transfer
CUDA Source code Massive
Parallel Execution
Supported Workload – Scan, Join, Aggregation
▌SELECT cat, AVG(x) FROM t0 NATURAL JOIN t1 [, ...] GROUP BY cat;
t0: 100M rows, t1~t10: 100K rows for each, all the data was preloaded.
▌Environment:
PostgreSQL v9.5beta1 + PG-Strom (22-Oct), CUDA 7.0 + RHEL6.6 (x86_64)
CPU: Xeon E5-2670v3, RAM: 384GB, GPU: NVIDIA TESLA K20c (2496cores)
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 4
0
50
100
150
200
250
300
PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom PgSQL Strom
2 3 4 5 6 7 8
Qu
ery
Res
po
nse
Tim
e [s
ec]
# of tables involved
Time consumption per component (PostgreSQL v9.5β vs PG-Strom)
Scan Join Aggregate Others
Next target is I/O acceleration – from TPC/DS results
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 5
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
Time consumption per workloads (PostgreSQL v9.5beta+PG-Strom)
Scan Join Aggregate Others
So, How to accelerate I/O stuff by GPU?
NOTICE
The story I like to introduce next is...
Just my Idea at this moment
......So, I’ll pay my efforts to implement
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 6
A rough x86_64 hardware architecture
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 7
GPU SSD
CPU + RAM CPU + RAM
PCI-E
SAS
Usual I/O bottleneck
Simplified diagram for introduction
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 8
GPU SSD
CPU + RAM
PCI-E
OK, it’s storage
NVM EXPRESS SSD
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 9
PCI-E direct SSD device – low latency and higher bandwidth
Samsung SSD 950 PRO
Intel SSD 750
HGST Ultrastar SN100
Intel SSD DC P3700
Data Flow in analytic queries
① Data load from storage to CPU/RAM
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 10
GPU SSD
CPU + RAM
PCI-E
Table
Data Flow in analytic queries
① Data load from storage to CPU/RAM
② Remove invisible rows (Select)
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 11
GPU SSD
CPU + RAM
PCI-E
Table
Data Flow in analytic queries
① Data load from storage to CPU/RAM
② Remove invisible rows (Select)
③ Remove unreferenced columns (Projection)
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 12
GPU SSD
CPU + RAM
PCI-E
Table
The job of CPU
Data Flow in analytic queries
① Data load from storage to CPU/RAM
② Remove invisible rows (Select)
③ Remove unreferenced columns (Projection)
④ Join with other tables (Join)
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 13
GPU SSD
CPU + RAM
PCI-E
Table
+
SSD-to-GPU Direct
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 14
Data transfer between SSD and GPU, bypass CPU/RAM
Also available on NVMe, not only Fusion-IO
Data Flow in analytic queries (1/3) – Basic
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 15
GPU SSD
CPU + RAM
PCI-E
Table
SSD-to-GPU Direct
Data Flow in analytic queries (1/3) – Basic
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 16
GPU SSD
CPU + RAM
PCI-E
Table
GPU
code g
enera
ted
from
SQ
L o
n t
he f
ly
SSD-to-GPU Direct
Data Flow in analytic queries (1/3) – Basic
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 17
GPU SSD
CPU + RAM
PCI-E
Table
GPU
code g
enera
ted
from
SQ
L o
n t
he f
ly
SSD-to-GPU Direct
Remove invisible rows according to the scan
qualifiers
Data Flow in analytic queries (1/3) – Basic
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 18
GPU SSD
CPU + RAM
PCI-E
Table
GPU
code g
enera
ted
from
SQ
L o
n t
he f
ly
SSD-to-GPU Direct
Only visible rows are moved to CPU+RAM
Remove invisible rows according to the scan
qualifiers
Data Flow in analytic queries (2/3) – Advanced
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 19
GPU SSD
CPU + RAM
PCI-E
Table
GPU
code g
enera
ted
from
SQ
L o
n t
he f
ly
SSD-to-GPU Direct
Data Flow in analytic queries (2/3) – Advanced
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 20
GPU SSD
CPU + RAM
PCI-E
Table
GPU
code g
enera
ted
from
SQ
L o
n t
he f
ly
SSD-to-GPU Direct
Remove invisible rows according to the scan
qualifiers
Data Flow in analytic queries (2/3) – Advanced
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 21
GPU SSD
CPU + RAM
PCI-E
Table
GPU
code g
enera
ted
from
SQ
L o
n t
he f
ly
SSD-to-GPU Direct
Only visible rows and referenced columns are
moved to CPU+RAM
Remove invisible rows according to the scan
qualifiers
Remove invisible rows and unreferenced
columns according to the scan qualifiers
and projection
Data Flow in analytic queries (3/3) – Ultimate
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 22
GPU SSD
CPU + RAM
PCI-E
GPU
code g
enera
ted
from
SQ
L o
n t
he f
ly
Inner
rela
tions
(JO
IN t
arg
et)
Table
Data Flow in analytic queries (3/3) – Ultimate
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 23
GPU SSD
CPU + RAM
PCI-E
Table
GPU
code g
enera
ted
from
SQ
L o
n t
he f
ly
SSD-to-GPU Direct
Inner
rela
tions
(JO
IN t
arg
et)
Data Flow in analytic queries (3/3) – Ultimate
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 24
GPU SSD
CPU + RAM
PCI-E
Table
GPU
code g
enera
ted
from
SQ
L o
n t
he f
ly
SSD-to-GPU Direct
Inner
rela
tions
(JO
IN t
arg
et)
Data Flow in analytic queries (3/3) – Ultimate
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 25
GPU SSD
CPU + RAM
PCI-E
Table
GPU
code g
enera
ted
from
SQ
L o
n t
he f
ly
SSD-to-GPU Direct
Tuples are already joined when it read
data from the storage
Inner
rela
tions
(JO
IN t
arg
et)
+
Generate joined tuples on GPU side
Primitive Technologies
▌NVIDIA GPUDirect enhancement on NVMe device driver
Interaction between NVMe and NVIDIA drivers are needed
▌Usage statistics of shared_buffers per relations
To avoid SSDGPU direct on relations that is already preloaded
▌Add new access mode to shared_buffers
Nobody can make the buffer dirty under the SSDGPU Direct transfer
We are welcome all the developer who join to PG-Strom project
PostgreSQL Conference Japan - LT: SQL+GPU+SSD=∞ 26