Compression and SSDs: Where and How? Aviad Zuck, Sivan Toledo, Dmitry Sotnikov, Danny Harnik

Compression and SSDs: Where and How?

Aviad Zuck, Sivan Toledo, Dmitry Sotnikov, Danny Harnik


outline

• Introduction

• Compression and SSDs

– Where and how

– Typical use case

• Evaluation

• Conclusion


why compress data

• Applies to many kinds of data: databases, HPC, desktop apps, virtual machines…

• More capacity

– (or) more free space, which benefits COW and log-structured systems (e.g. SSDs)

• Less data transferred to the device via the interconnect

– Reduced latency, improved throughput

• For SSDs, fewer writes mean less wear

– Extended lifetime


concerns of compression

• Disks (mostly) read and write in 4KB units

– so do file systems and operating systems

• Compression turns 4KB-aligned data into variable-sized chunks

– may need more complex data structures

• CPU works harder


• Compression improves as chunk size increases

– (figure: compress 4KB at a time vs. compress all together)

• File systems and applications often try to compress larger chunks

– potentially at the cost of increased accesses to disk (increased SSD wear)

– redundant reads and read-modify-write of the entire chunk
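The chunk-size effect can be observed with a small experiment. This is only a sketch: it uses Python's `zlib` and synthetic record data (the `make_data` helper and its record layout are invented for illustration), not the compressor or workload used in the talk.

```python
import zlib

PAGE = 4096  # 4KB unit used by disks, file systems, and operating systems


def make_data(n_pages: int) -> bytes:
    """Build deterministic, moderately repetitive sample data (hypothetical records)."""
    records = [
        f"id={i % 500:06d};name=user{i % 500};balance={i * 37 % 10000:05d}\n".encode()
        for i in range(n_pages * PAGE // 32)
    ]
    return b"".join(records)[: n_pages * PAGE]


def compressed_size(data: bytes, chunk: int) -> int:
    """Total size when every `chunk` bytes are compressed independently."""
    return sum(
        len(zlib.compress(data[off:off + chunk]))
        for off in range(0, len(data), chunk)
    )


data = make_data(64)
sizes = {chunk: compressed_size(data, chunk) for chunk in (PAGE, 4 * PAGE, 16 * PAGE)}
for chunk, size in sizes.items():
    print(f"{chunk // 1024:>3} KB chunks: ratio {len(data) / size:.2f}x")
```

The larger chunks pay for their better ratio on updates: changing 4KB inside a 64KB chunk means reading, decompressing, recompressing, and rewriting the whole chunk.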




where and how

• Where should we place compression?

– Host or device?

• How should we do it?

– What granularity?

– Which layout?


where to compress

• Compression can be enabled at different levels:

– application (database)

– file system

– SSD FTL (embedded in device)


SSD use cases

• SSDs vs. magnetic disks (HDDs)

– faster random accesses

– but more expensive

• Mostly used for

– OLTP = OnLine Transaction Processing (MySQL, Oracle)

– Virtual Machines (Amazon AWS)

– HPC metadata

– Caching solutions

• These use cases are known to generate highly compressible data (2-5x compressible)


MySQL* database compression

• Data stored in fixed-size B-tree pages (e.g. 16KB)

• Each page is compressed to a smaller fixed-size compressed page (e.g. 4KB, 8KB)

• In-page modifications log

• When the log is full, re-compress the entire page

* most popular freely available open-source database system.


fixed-size compression pages

• Padding and unutilized space reduce compression gains

• May cause read-modify-write behavior

• Compression failures

(figure: logical view vs. physical view)
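The two costs of fixed-size compressed pages, padding and compression failures, can be illustrated with a toy model. This is a sketch under stated assumptions: it uses `zlib` on whole pages and an invented `store_page` helper, and it does not reproduce MySQL's actual on-disk page format or modification log.

```python
import os
import zlib

LOGICAL_PAGE = 16 * 1024    # uncompressed B-tree page size (e.g. 16KB)
COMPRESSED_PAGE = 8 * 1024  # fixed-size compressed page (e.g. 8K; 4K is also possible)


def store_page(page: bytes):
    """Return (stored_bytes, padding), or None on a compression failure."""
    assert len(page) == LOGICAL_PAGE
    payload = zlib.compress(page)
    if len(payload) > COMPRESSED_PAGE:
        return None  # compression failure: the page does not fit the fixed slot
    padding = COMPRESSED_PAGE - len(payload)
    return payload + b"\0" * padding, padding  # unused tail is wasted space


# Hypothetical page contents: one very repetitive page, one random page.
compressible = (b"row:0001,status=ok;" * 1000)[:LOGICAL_PAGE]
incompressible = os.urandom(LOGICAL_PAGE)

stored, padding = store_page(compressible)
print(f"stored {len(stored)} bytes, of which {padding} are padding")
print("random page fits:", store_page(incompressible) is not None)
```

A highly compressible page still occupies the full fixed slot, so most of the potential gain is lost to padding, while a page that does not fit has to be handled some other way (for example by splitting it), which costs extra I/O.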


file system compression

• Btrfs – B-tree file system. In Linux kernel, still considered experimental

• ZFS – focused on protecting against data corruption (more conservative and robust)

• ext4 – most popular Linux file system

• ZFS/Btrfs are copy-on-write file systems

– Support compression (easier with COW)

– Aligned storage units


• COW good for compression, but causes fragmentation

– Splitting + recompressing storage units

• Read-modify-copy still possible

– Records/extents not always fragmented

• Unutilized space


brief summary

Read-modify-write behavior

Unutilized space

Fragmentation


"All problems in computer science can be solved by another level of indirection… except of course for the problem of too many indirections"

– David Wheeler


intra-SSD compression

FTLs utilize indirection anyway (fixed-size)

Re-use mapping table for compression

– Buffer write requests

– Compress

– Dump compressed data to flash

– Update mapping


How? four possible packing schemes
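The four steps above (buffer, compress, dump, update mapping) can be sketched as a toy FTL. Everything here is an assumption made for illustration: the `CompressingFTL` class, its field names, the use of `zlib`, and the simple sequential packing are not taken from the talk, and garbage collection and incompressible pages are ignored.

```python
import zlib

PAGE = 4096  # flash page size


class CompressingFTL:
    """Toy FTL whose mapping table also records offset and length."""

    def __init__(self, buffer_pages: int = 4):
        self.buffer_pages = buffer_pages
        self.write_buffer = {}  # lpn -> uncompressed data awaiting flush
        self.flash = []         # append-only list of flash pages (bytes)
        self.mapping = {}       # lpn -> (flash page index, offset, length)

    def write(self, lpn: int, data: bytes) -> None:
        self.write_buffer[lpn] = data  # 1. buffer write requests
        if len(self.write_buffer) >= self.buffer_pages:
            self.flush()

    def flush(self) -> None:
        current = bytearray()
        for lpn, data in self.write_buffer.items():
            blob = zlib.compress(data)  # 2. compress
            if current and len(current) + len(blob) > PAGE:
                self.flash.append(bytes(current))  # 3. dump compressed data to flash
                current = bytearray()
            # 4. update mapping: the page being filled will get index len(self.flash)
            self.mapping[lpn] = (len(self.flash), len(current), len(blob))
            current += blob
        if current:
            self.flash.append(bytes(current))
        self.write_buffer.clear()

    def read(self, lpn: int) -> bytes:
        if lpn in self.write_buffer:
            return self.write_buffer[lpn]
        page, offset, length = self.mapping[lpn]
        return zlib.decompress(self.flash[page][offset:offset + length])


ftl = CompressingFTL()
pages = {lpn: bytes([65 + lpn]) * PAGE for lpn in range(4)}  # hypothetical data
for lpn, data in pages.items():
    ftl.write(lpn, data)
print(f"{len(pages)} logical pages stored in {len(ftl.flash)} flash page(s)")
```

The point of the sketch is that the indirection already exists: the mapping entry only grows from a flash page number to a (page, offset, length) triple.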


chunk-based

• Buffer incoming pages A1, A2, A3, A4 in RAM as they arrive over time

• Compress them as a single large chunk: (A1+A2+A3+A4)’

• Dump the compressed chunk to flash
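A minimal sketch of the chunk-based scheme, assuming `zlib` as the compressor and invented helper names (`pack_chunk`, `read_page`, `update_page`). It also shows the cost noted earlier in the deck: reading or updating one page touches the entire chunk.

```python
import zlib

PAGE = 4096


def pack_chunk(pages):
    """Compress several buffered logical pages as one chunk.

    Returns the compressed blob and the number of flash pages it occupies.
    """
    blob = zlib.compress(b"".join(pages))
    flash_pages = -(-len(blob) // PAGE)  # ceiling division
    return blob, flash_pages


def read_page(blob, index):
    """Reading one logical page requires decompressing the whole chunk."""
    chunk = zlib.decompress(blob)
    return chunk[index * PAGE:(index + 1) * PAGE]


def update_page(blob, index, new_data):
    """Read-modify-write of the entire chunk to change a single page."""
    chunk = bytearray(zlib.decompress(blob))
    chunk[index * PAGE:(index + 1) * PAGE] = new_data
    return zlib.compress(bytes(chunk))


pages = [bytes([65 + i]) * PAGE for i in range(4)]  # hypothetical A1..A4
blob, flash_pages = pack_chunk(pages)
updated = update_page(blob, 1, b"z" * PAGE)
print(f"4 logical pages -> {flash_pages} flash page(s)")
```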


binpacking

• Compress each incoming page (A1 … A4 become A1’ … A4’) and buffer the results in RAM

• When a new compressed page does not fit, dump the first buffer to flash to make room
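One way to read the bin-packing scheme is first-fit packing into a small number of flash-page-sized RAM buffers. The sketch below works on compressed sizes only; the `binpack` function, the `max_buffers` parameter, and the example sizes are assumptions for illustration and are not taken from the talk.

```python
PAGE = 4096  # each RAM buffer holds one flash page worth of compressed data


def binpack(sizes, max_buffers=2):
    """First-fit packing of compressed page sizes into page-sized RAM buffers.

    Assumes every size is at most PAGE. Returns the flash pages written,
    each as a list of the compressed sizes it contains.
    """
    buffers = []  # open RAM buffers, oldest first
    flash = []    # buffers already dumped to flash
    for size in sizes:
        for buf in buffers:
            if sum(buf) + size <= PAGE:
                buf.append(size)  # fits into an existing buffer
                break
        else:
            if len(buffers) == max_buffers:
                flash.append(buffers.pop(0))  # dump first buffer to make room
            buffers.append([size])
    flash.extend(buffers)  # final flush of whatever is still buffered
    return flash


sizes = [3000, 2500, 1000, 1500]  # hypothetical sizes of A1'..A4'
print(binpack(sizes, max_buffers=2))  # two well-filled flash pages
print(binpack(sizes, max_buffers=1))  # one buffer only: three flash pages
```

With more open buffers the packer has more candidates for each compressed page, so fewer flash pages are written, at the cost of more RAM.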


re-ordering

• Compress each incoming page (A1 … A4) and buffer the results in RAM

• Sort and re-order the buffered compressed pages, then re-buffer (in the diagram: A3’ A1’ A4’ A2’)
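The benefit of sorting before packing can be sketched with first-fit-decreasing. The slide does not say what the sort key is, so sorting by compressed size (largest first) is an assumption here, as are the helper names and the example sizes.

```python
PAGE = 4096


def first_fit(sizes):
    """Pack sizes into page-sized bins in the given order (first fit)."""
    bins = []
    for size in sizes:
        for b in bins:
            if sum(b) + size <= PAGE:
                b.append(size)
                break
        else:
            bins.append([size])
    return bins


def reorder_and_pack(buffered):
    """Sort buffered compressed pages largest first, then first-fit them.

    `buffered` maps a page name to its compressed size.
    """
    order = sorted(buffered, key=buffered.get, reverse=True)
    return order, first_fit([buffered[name] for name in order])


# Hypothetical compressed sizes for the four pages in the diagram.
buffered = {"A1'": 2000, "A2'": 1000, "A3'": 3000, "A4'": 2000}
arrival_bins = first_fit(list(buffered.values()))
order, sorted_bins = reorder_and_pack(buffered)
print("arrival order uses", len(arrival_bins), "flash pages")
print("re-ordered", order, "uses", len(sorted_bins), "flash pages")
```

With these sizes, packing in arrival order needs three flash pages while the re-ordered sequence needs two.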


compaction

• Compress each incoming page (A1 … A4 become A1’ … A4’) and buffer the results in RAM




methodology

• 4GB SSD

– VSSIM SSD emulator

– 4KB pages, 64 pages/block

– 50/250us read/write latency

• TPC-C workload (OLTP)

– Modified to provide three levels of compressibility (high/medium/low)

• Repeated with compression enabled at every level

(figure labels, emulation stack: QEMU guest OS, file system, block layer, IDE interface, VSSIM SSD module, latency manager, QEMU RAM disk)
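For orientation, the emulated device geometry implied by these parameters can be worked out directly. The sketch assumes "4GB" means 4 GiB, and the `write_time_us` helper is a deliberately crude, hypothetical model (it ignores garbage collection, buffering, and parallelism).

```python
import math

CAPACITY = 4 * 1024**3   # 4GB SSD, taken here as 4 GiB (assumption)
PAGE = 4 * 1024          # 4KB pages
PAGES_PER_BLOCK = 64     # 64 pages/block
READ_US, WRITE_US = 50, 250  # read/write latency in microseconds

total_pages = CAPACITY // PAGE                 # 1,048,576 pages
total_blocks = total_pages // PAGES_PER_BLOCK  # 16,384 blocks
block_bytes = PAGE * PAGES_PER_BLOCK           # 262,144 bytes (256 KiB)


def write_time_us(logical_pages, compression_ratio):
    """Flash program time if data shrinks by `compression_ratio` (no GC modeled)."""
    physical_pages = math.ceil(logical_pages / compression_ratio)
    return physical_pages * WRITE_US


print(total_pages, total_blocks, block_bytes)
print(write_time_us(1000, 1), "us uncompressed vs", write_time_us(1000, 2), "us at 2x")
```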


file-system compression

(figure: reads/tx, writes/tx, and tx/s for ext4, zfs, zfs+comp, btrfs, and btrfs+comp at high, medium, and low compressibility)

• ext4 w/o compression yields best results

• MySQL+compression much worse in all configurations


embedded compression gains

• Total physical writes vs. FTL w/o compression

• Compact and re-ordering schemes deliver similar performance

– re-ordering requires 30% less RAM for mapping data structures

• Performance (tx/s) improved by 10-15% vs. no compression

(figure: compression gain, 0% to 60%, at HighComp, MedComp, and LowComp for the compact, chunk4, chunk8, bp32, and re-bp32 schemes)


compression hardware

(figure: compress and decompress throughput in MB/s, axis 0 to 900, for the compact, chunk4, chunk8, bp32, and re-bp32 schemes; annotation: more expensive compression HW)




conclusion

• Intra-SSD compression is superior to compression in the existing application and file systems

– improvement vs. popular database application and file systems in an OLTP workload

• Compressing OLTP workloads in larger chunks is not always better

• Enhanced re-ordering scheme delivers gains similar to the new compact scheme while requiring 30% less RAM


questions?


Aviad Zuck, Sivan Toledo, Dmitry Sotnikov, Danny Harnik

(http://www.tau.ac.il/~aviadzuc)

INFLOW 2014