GPUKV: Towards a GPU-Driven Computing on Key-Value SSD

Min-Gyo Jung†, Chang-Gyu Lee†, Donggyu Park†, Sungyong Park†, Youngjae Kim†, Jungki Noh‡, Woosuk Chung‡, Kyoung Park‡

†Sogang University, Seoul, Republic of Korea, ‡SK hynix


Why is Key-Value Store + GPU important?

§ GPU

• Massive parallelism

• Boosts data-intensive applications

§ Key-Value Store

• Good for storing unstructured data

• Widely used for storing big data

§ Key-Value Store + GPU

• Better performance and usability for data-intensive applications, e.g. MapReduce, graph processing, data analysis, …

Data Transfer Flow from Key-Value Store to GPU

[Figure: conventional data transfer flow. Host: Application and RocksDB Engine in user space, File System and NVMe Driver in kernel space, plus Main Memory. SSD: SSD Controller and Storage. GPU: GPU kernel and GPU Memory. Arrows distinguish the Control Path from the Data Path, and numbered callouts mark the steps.]

Extra data movement

Sophisticated control path

What if we did this using PCIe P2P transmission?

Data Transfer Flow when transferring using P2P

[Figure: data transfer flow with PCIe P2P. Host: Application and RocksDB Engine in user space, File System and NVMe Driver in kernel space, plus Main Memory. SSD: SSD Controller and Storage. GPU: GPU kernel and GPU Memory. Arrows distinguish the Control Path, the Data Path, and an Additional Path, and numbered callouts mark the steps.]

Reduces data movement

More complicated control path

Data alignment for P2P
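
The slides do not state which alignment P2P transfers require. As a minimal sketch, assuming a transfer must start and end on a fixed block boundary, a value stored at an arbitrary byte range could be widened to an aligned range as follows. The 4096-byte unit and the helper name `align_range` are assumptions made for illustration, not part of GPUKV.

```python
from typing import Tuple

BLOCK = 4096  # assumed alignment unit in bytes; not stated in the slides


def align_range(offset: int, length: int, block: int = BLOCK) -> Tuple[int, int, int]:
    """Widen the byte range [offset, offset + length) to block boundaries.

    Returns (aligned_offset, aligned_length, skip), where skip is the number
    of leading bytes to discard to reach the originally requested data.
    """
    if offset < 0 or length < 0 or block <= 0:
        raise ValueError("offset and length must be >= 0 and block must be > 0")
    start = (offset // block) * block              # round the start down
    end = -(-(offset + length) // block) * block   # round the end up
    return start, end - start, offset - start


if __name__ == "__main__":
    # A 100-byte value at byte 5000 needs one whole 4096-byte block.
    print(align_range(5000, 100))
```

Under this assumption the receiver gets more bytes than it asked for and has to skip the leading padding, which is one reason alignment adds work to the P2P path.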


What is GPUKV supposed to do?

§ GPU-driven computing model

• The GPU issues I/O directly, bypassing the host

§ Reduce data movement using PCIe P2P

• Data storage ↔ Accelerator (GPU)

• Avoids wasting memory bus bandwidth

§ Simple control path

• Implementing the key-value store in the SSD removes complex control paths

Data Transfer Latency Breakdown

[Figure: bar chart breaking down data transfer latency; labels and values not recoverable]

In the ideal case, GPUKV only needs the data transfer latency


GPU-driven Computing is necessary!

GPUKV’s Data Transfer Flow

[Figure: GPUKV data transfer flow. Components shown: Host (user space, kernel space), Application, Key-Value Driver, GPUKV Driver, SSD (SSD Controller, Storage), GPU (GPU kernel, GPU Memory). Arrows distinguish the GPU Control Path, the CPU Control Path, and the Data Path.]

No redundant data copy

Simple and short control path

Data request from the GPU itself


Preliminary Results: Synthetic Workloads

§ Streaming workload (W_streaming)

• Predictable data access pattern

• The next dataset needed by the GPU kernel can be prefetched

§ Dynamic workload (W_dynamic)

• Unpredictable data access pattern

• The next dataset needed by the GPU kernel cannot be prefetched

• It can be loaded only after the current GPU kernel finishes
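
The difference between the two workloads can be illustrated with a small timing model. This is a sketch under stated assumptions, not the authors' benchmark: each dataset has a load time and a kernel time, loads run back to back, and in the streaming case loads may run ahead of the kernels (buffer space is assumed unlimited). The function names and the example numbers are hypothetical.

```python
from typing import Sequence


def dynamic_time(loads: Sequence[float], kernels: Sequence[float]) -> float:
    """No prefetch: each load starts only after the previous kernel finishes."""
    if len(loads) != len(kernels):
        raise ValueError("loads and kernels must have the same length")
    return sum(l + k for l, k in zip(loads, kernels))


def streaming_time(loads: Sequence[float], kernels: Sequence[float]) -> float:
    """Prefetch: loads proceed back to back and overlap with running kernels."""
    if len(loads) != len(kernels):
        raise ValueError("loads and kernels must have the same length")
    load_done = 0.0    # time at which the current dataset is fully loaded
    kernel_done = 0.0  # time at which the current kernel finishes
    for l, k in zip(loads, kernels):
        load_done += l
        # A kernel starts once its data is loaded and the previous kernel is done.
        kernel_done = max(kernel_done, load_done) + k
    return kernel_done


if __name__ == "__main__":
    loads, kernels = [2, 2, 2], [3, 3, 3]  # hypothetical time units
    print(dynamic_time(loads, kernels), streaming_time(loads, kernels))
```

In this model the dynamic workload pays every load latency in full, so it is the case where a shorter I/O path matters most.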

[Figure: two result panels, one for W_streaming and one for W_dynamic; axis labels and values not recoverable]


Conventional way: needs powerful host resources


Our approach (GPUKV):

• Always shows the best performance with only 1 I/O thread

• Barely requires host resources
