109
SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan Kadekodi, Se Kwon Lee, Sanidhya Kashyap*, Taesoo Kim, Aasheesh Kolli, Vijay Chidambaram * on the job market 1

SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFS: Reducing Software Overhead in File Systems for Persistent Memory

Rohan Kadekodi, Se Kwon Lee, Sanidhya Kashyap*, Taesoo Kim, Aasheesh Kolli, Vijay Chidambaram

* on the job market�1

Page 2: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Persistent Memory (PM)

Non-volatile Fast

�2

Page 3: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

PM file systems

�3

Page 4: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

PM file system

PM file systemsApplication

File 1PM File 2 File n

�3

Page 5: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

PM file system

PM file systemsApplication

File 1PM File 2 File n

read(), write(), open(), close()POSIX API

�3

Page 6: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

PM file system

PM file systemsApplication

File 1PM File 2 File n

read(), write(), open(), close()POSIX API

ext4-DAX PMFS NOVA Strata

�3

Page 7: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

ext4-DAX

700

�4

Page 8: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

ext4-DAX

700

�4

Modification of the ext4 file system for Persistent Memory

Works with modern Linux kernels

Under active development by the ext4 community

Only PM file system that is widely used

Page 9: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Software Overhead in File Systems

700

• Append 4KB data to a file

0

2000

4000

6000

8000

10000

device Strata NOVA PMFS ext4-DAX

Tim

e (n

s)

�5

Page 10: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Software Overhead in File Systems

700

Tim

e (n

s)

�5

0

2000

4000

6000

8000

10000

device Strata NOVA PMFS ext4-DAX

700

• Append 4KB data to a file • Time taken to copy user data to PM: ~700 ns

Page 11: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Software Overhead in File Systems

700

Tim

e (n

s)

�5

• Append 4KB data to a file • Time taken to copy user data to PM: ~700 ns

0

2000

4000

6000

8000

10000

device Strata NOVA PMFS ext4-DAX

700

9002 (12x)

Page 12: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Software Overhead in File Systems

700

Tim

e (n

s)

�5

• Append 4KB data to a file • Time taken to copy user data to PM: ~700 ns

0

2000

4000

6000

8000

10000

device Strata NOVA PMFS ext4-DAX

700

9002 (12x)

software overhead

Page 13: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Software Overhead in File Systems

700

Tim

e (n

s)

�5

• Append 4KB data to a file • Time taken to copy user data to PM: ~700 ns

0

2000

4000

6000

8000

10000

device Strata NOVA PMFS ext4-DAX

4150 (5x)

3021 (3x)

700

2450 (2.5x)

9002 (12x)

Page 14: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Software Overhead in File Systems

700

Tim

e (n

s)

�5

• Append 4KB data to a file • Time taken to copy user data to PM: ~700 ns

0

2000

4000

6000

8000

10000

device Strata NOVA PMFS ext4-DAX

4150 (5x)

3021 (3x)

700

2450 (2.5x)

9002 (12x)

File systems suffer from high software overhead!

Page 15: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Software Overhead in File Systems

700

Tim

e (n

s)

�5

• Append 4KB data to a file • Time taken to copy user data to PM: ~700 ns

0

2000

4000

6000

8000

10000

device Strata NOVA PMFS ext4-DAX

4150 (5x)

3021 (3x)

700

2450 (2.5x)

9002 (12x)

File systems suffer from high software overhead!

ext4-DAX, although widely used, suffers from highest software overhead and provides weak guarantees

Page 16: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

�6

Goals

Page 17: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

- Low software overhead

�6

Goals

Page 18: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

- Low software overhead- Strong consistency guarantees

�6

Goals

Page 19: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

- Low software overhead- Strong consistency guarantees- Leverage the maturity and active

development of ext4-DAX

�6

Goals

Page 20: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFSPOSIX file system aimed at reducing software overhead for PM

�7�7

Page 21: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFSPOSIX file system aimed at reducing software overhead for PM

SplitFS serves data operations from user space and metadata operations using the ext4-DAX kernel file system

�7�7

Page 22: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFSPOSIX file system aimed at reducing software overhead for PM

SplitFS serves data operations from user space and metadata operations using the ext4-DAX kernel file system

Provides strong guarantees such as atomic and synchronous data operations

�7�7

Page 23: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFSPOSIX file system aimed at reducing software overhead for PM

SplitFS serves data operations from user space and metadata operations using the ext4-DAX kernel file system

Reduces software overhead by up to 17x compared to ext4-DAX

https://github.com/utsaslab/splitfs

Provides strong guarantees such as atomic and synchronous data operations

�7�7

Improves application throughput by up to 2x compared to NOVA

Page 24: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Outline

• Target usage scenario • High-level design • Handling data operations • Consistency guarantees • Evaluation

�8

Page 25: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Outline

• Target usage scenario • High-level design • Handling data operations • Consistency guarantees • Evaluation

�9

Page 26: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Target usage scenario

�10

Page 27: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Target usage scenario

�10

SplitFS is targeted at POSIX applications which use read() / write() system calls in order to access their data on Persistent Memory.

Page 28: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Target usage scenario

�10

SplitFS is targeted at POSIX applications which use read() / write() system calls in order to access their data on Persistent Memory.

SplitFS does not optimize for the case when multiple processes concurrently access the same file

Page 29: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Outline

• Target usage scenario • High-level design • Handling data operations • Consistency guarantees • Evaluation

�11

Page 30: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level Design

�12

Page 31: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level DesignSplitFS lies both in user space as well as in the kernel. • Data operations are served from user space • Metadata operations are served from ext4-DAX

�12

Page 32: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level DesignSplitFS lies both in user space as well as in the kernel. • Data operations are served from user space • Metadata operations are served from ext4-DAX

�12

Data in kernel Metadata in kernel

Data in user Metadata in user

Page 33: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level DesignSplitFS lies both in user space as well as in the kernel. • Data operations are served from user space • Metadata operations are served from ext4-DAX

ext4-DAX, PMFS [EuroSys 14],

NOVA [FAST 16]

�12

Data in kernel Metadata in kernel

Data in user Metadata in user

Page 34: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level DesignSplitFS lies both in user space as well as in the kernel. • Data operations are served from user space • Metadata operations are served from ext4-DAX

ext4-DAX, PMFS [EuroSys 14],

NOVA [FAST 16]Low performance

�12

Data in kernel Metadata in kernel

Data in user Metadata in user

Page 35: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level DesignSplitFS lies both in user space as well as in the kernel. • Data operations are served from user space • Metadata operations are served from ext4-DAX

ext4-DAX, PMFS [EuroSys 14],

NOVA [FAST 16]

Strata [SOSP 17]

Low performance

�12

Data in kernel Metadata in kernel

Data in user Metadata in user

Page 36: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level DesignSplitFS lies both in user space as well as in the kernel. • Data operations are served from user space • Metadata operations are served from ext4-DAX

ext4-DAX, PMFS [EuroSys 14],

NOVA [FAST 16]

Strata [SOSP 17]

Low performance High complexity

�12

Data in kernel Metadata in kernel

Data in user Metadata in user

Page 37: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level DesignSplitFS lies both in user space as well as in the kernel. • Data operations are served from user space • Metadata operations are served from ext4-DAX

ext4-DAX, PMFS [EuroSys 14],

NOVA [FAST 16]

Strata [SOSP 17]

Low performance High complexity

�12

Data in kernel Metadata in kernel

Data in user Metadata in user

Aerie [EuroSys 14]

High complexity

Data in user Metadata in user

Allocations in kernel

Low performance

Page 38: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level DesignSplitFS lies both in user space as well as in the kernel. • Data operations are served from user space • Metadata operations are served from ext4-DAX

ext4-DAX, PMFS [EuroSys 14],

NOVA [FAST 16]

Strata [SOSP 17]

Low performance High complexity

SplitFS

Low complexityHigh performance

Data in user Metadata in kernel

�12

Data in kernel Metadata in kernel

Data in user Metadata in user

Aerie [EuroSys 14]

High complexity

Data in user Metadata in user

Allocations in kernel

Low performance

Page 39: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level Design

�13

High performance

Low complexity

Page 40: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level Design

�13

Accelerate data operations from user space • Data operations are common and simple

High performance

Low complexity

Page 41: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level Design

�13

Accelerate data operations from user space • Data operations are common and simple

Use ext4-DAX for metadata operations • Metadata operations are rare and complex • POSIX has many complex corner-cases

High performance

Low complexity

Page 42: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level Design

user

kernel

Application

File 1 File 2 File 3PM

�14

U-Split

K-Split (ext4-DAX)

Page 43: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level Design

user

kernel

Application

File 1 File 2 File 3PM

read() write()

�14

U-Split

K-Split (ext4-DAX)

Page 44: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level Design

creat()delete()

user

kernel

Application

File 1 File 2 File 3PM

read() write()

�14

U-Split

K-Split (ext4-DAX)

Page 45: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level Design

creat()delete()

user

kernel

Application

File 1 File 2 File 3PM

read() write()

�14

Log

U-Split

K-Split (ext4-DAX)

Page 46: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level Design

creat()delete()

user

kernel

Application

File 1 File 2 File 3PM

read() write()

�14

Log

U-Split

SplitFS accelerates common case data operations while leveraging the maturity of ext4-DAX for

metadata operations

K-Split (ext4-DAX)

Page 47: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

High-level Design

creat()delete()

user

kernel

Application

File 1 File 2 File 3PM

read() write()

�14

Log

U-Split

SplitFS accelerates common case data operations while leveraging the maturity of ext4-DAX for

metadata operations

K-Split (ext4-DAX)

SplitFS uses logging and out of place updates for providing atomic and synchronous operations

Page 48: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Outline

• Target usage scenario • High-level design • Handling data operations

• Handling file reads and updates • Handling file appends

• Consistency guarantees • Evaluation

�15

Page 49: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

K-Split (ext4-DAX)

U-Split

Application

FilePM

UserKernel

Handling reads and updates

�16

Page 50: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

read / update

K-Split (ext4-DAX)

U-Split

Application

FilePM

UserKernel

Handling reads and updates

�16

Page 51: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

read / update

K-Split (ext4-DAX)

U-Split

Application

FilePM

mmap

perform mmap

UserKernel

Handling reads and updates

�16

Page 52: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

read / update

K-Split (ext4-DAX)

U-Split

Application

FilePM

DAX-mmaps

mmap

perform mmap

UserKernel

Handling reads and updates

�16

Page 53: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

read / update

K-Split (ext4-DAX)

U-Split

Application

FilePM

DAX-mmaps

UserKernel

Handling reads and updates

�16

Page 54: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

read / update

K-Split (ext4-DAX)

U-Split

Application

FilePM

DAX-mmaps

UserKernelIn the common case, file reads and updates do

not pass through the kernel

Handling reads and updates

�16

Page 55: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Outline

• Target usage scenario • High-level design • Handling data operations

• Handling file reads and updates • Handling file appends

• Consistency guarantees • Evaluation

�17

Page 56: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

foo

size = 10foo inode

Persistent Memory

Handling appends

�18

userkernel

Page 57: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

foo

size = 10foo inode

Persistent Memory

Handling appends

�18

userkernel

Application

Start

Page 58: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

foo

size = 10foo inode

Persistent Memory

Handling appends

�18

userkernel

Application

staging file

size = 100staging file inode

Start

Page 59: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

foo

size = 10foo inode

Persistent Memory

Handling appends

�18

userkernel

Application

staging file

size = 100staging file inode

Startstaging file mmap

Page 60: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

foo

size = 10foo inode

Persistent Memory

Handling appends

�18

userkernel

Application

staging file

size = 100staging file inode

Startstaging file mmapappend (foo,“abc”)

abc

store

Page 61: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

foo

size = 10foo inode

Persistent Memory

Handling appends

�18

userkernel

Application

staging file

size = 100staging file inode

Startstaging file mmapappend (foo,“abc”)

abc

read (foo)load

Page 62: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

foo

size = 10foo inode

Persistent Memory

Handling appends

�18

userkernel

Application

staging file

size = 100staging file inode

Startstaging file mmapappend (foo,“abc”)

abc

read (foo)

fsync (foo)

Page 63: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

foo

size = 10foo inode

Persistent Memory

Handling appends

�18

userkernel

Application

staging file

size = 100staging file inode

Startstaging file mmapappend (foo,“abc”)

abc

read (foo)

fsync (foo)relink()

Page 64: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

foo

size = 10foo inode

Persistent Memory

Handling appends

�18

userkernel

Application

staging file

size = 100staging file inode

Startstaging file mmapappend (foo,“abc”)

abc

read (foo)

fsync (foo)foo staging

ext4-journal transaction

relink()

Page 65: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

foo

size = 10foo inode

Persistent Memory

Handling appends

�18

userkernel

Application

staging file

size = 100staging file inode

Startstaging file mmapappend (foo,“abc”)

read (foo)

fsync (foo)foo staging

ext4-journal transaction

abc

relink()

Page 66: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

foo

size = 10foo inode

Persistent Memory

Handling appends

�18

userkernel

Application

staging file

size = 100staging file inode

Startstaging file mmapappend (foo,“abc”)

read (foo)

fsync (foo)foo staging

ext4-journal transaction

abc

In the common case, file appends do not pass through the kernel

relink()

Page 67: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Outline

• Target usage scenario • High-level design • Handling data operations • Consistency guarantees • Evaluation

�19

Page 68: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Consistency Guarantees

Mode Metadata Atomicity

Synchronous Operations

Data Atomicity File System

POSIX ext4-DAX, SplitFS-POSIX

Sync PMFS, SplitFS-Sync

Strict NOVA, Strata,SplitFS-Strict

�20

Page 69: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Consistency Guarantees

Mode Metadata Atomicity

Synchronous Operations

Data Atomicity File System

POSIX ext4-DAX, SplitFS-POSIX

Sync PMFS, SplitFS-Sync

Strict NOVA, Strata,SplitFS-Strict

�20

Page 70: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Consistency Guarantees

Mode Metadata Atomicity

Synchronous Operations

Data Atomicity File System

POSIX ext4-DAX, SplitFS-POSIX

Sync PMFS, SplitFS-Sync

Strict NOVA, Strata,SplitFS-Strict

�20

Page 71: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Consistency Guarantees

Mode Metadata Atomicity

Synchronous Operations

Data Atomicity File System

POSIX ext4-DAX, SplitFS-POSIX

Sync PMFS, SplitFS-Sync

Strict NOVA, Strata,SplitFS-Strict

�20

Page 72: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Consistency Guarantees

Mode Metadata Atomicity

Synchronous Operations

Data Atomicity File System

POSIX ext4-DAX, SplitFS-POSIX

Sync PMFS, SplitFS-Sync

Strict NOVA, Strata,SplitFS-Strict

Optimized logging is used in order to provide stronger guarantees in sync and strict modes

�20

Page 73: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Optimized logging

�21

Page 74: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFS employs a per-application log in sync and strict mode, which logs every logical operation

Optimized logging

�21

Page 75: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFS employs a per-application log in sync and strict mode, which logs every logical operation

In the common case • Each log entry fits in one cache line • Persisted using a single non-temporal store and sfence instruction

Optimized logging

�21

Page 76: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Flexible SplitFS

K-Split (ext4-DAX)

App 1

File 1PM

App 2 App 3

File 2 File 3 File 4

UserKernel

�22

Page 77: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Flexible SplitFS

K-Split (ext4-DAX)

App 1

File 1PM

App 2 App 3

U-Split-POSIX

U-Split-sync

U-Split-strict

File 2 File 3 File 4

UserKernel

�22

Page 78: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Flexible SplitFS

K-Split (ext4-DAX)

App 1

File 1PM

App 2 App 3

U-Split-POSIX

U-Split-sync

U-Split-strict

File 2 File 3 File 4

Data Data DataMeta Meta Meta

UserKernel

�22

Page 79: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

�23

Visibility

Page 80: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

�23

Visibility

When are updates from one application visible to another?

Page 81: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

�23

• All metadata operations are immediately visible to all other processes

Visibility

When are updates from one application visible to another?

Page 82: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

�23

• All metadata operations are immediately visible to all other processes

Visibility

• Writes are visible to all other processes on subsequent fsync()

When are updates from one application visible to another?

Page 83: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

�23

• All metadata operations are immediately visible to all other processes

Visibility

• Writes are visible to all other processes on subsequent fsync()

• Memory mapped files have the same visibility guarantees as that of ext4-DAX

When are updates from one application visible to another?

Page 84: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFS Techniques

Technique Benefit

�24

Page 85: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFS Techniques

Technique Benefit

SplitFS Architecture Low-overhead data operations, Correct metadata operations

�24

Page 86: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFS Techniques

Technique Benefit

SplitFS Architecture Low-overhead data operations, Correct metadata operations

Staging + Relink Optimized appends, No data copy

�24

Page 87: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFS Techniques

Technique Benefit

SplitFS Architecture Low-overhead data operations, Correct metadata operations

Staging + Relink Optimized appends, No data copy

Optimized Logging + out-of-place writes Stronger guarantees

�24

Page 88: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Outline

• Target usage scenario • High-level design • Handling data operations • Consistency guarantees • Evaluation

�25

Page 89: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Evaluation

�26

Page 90: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Evaluation

Setup: • 2-socket 96-core machine with 32 MB LLC • 768 GB Intel Optane DC PMM, 378 GB DRAM

�26

Page 91: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Evaluation

Setup: • 2-socket 96-core machine with 32 MB LLC • 768 GB Intel Optane DC PMM, 378 GB DRAM

File systems compared: • ext4-DAX, PMFS, NOVA, Strata

�26

Page 92: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Evaluation

Setup: • 2-socket 96-core machine with 32 MB LLC • 768 GB Intel Optane DC PMM, 378 GB DRAM

�26

File systems compared: • ext4-DAX, PMFS, NOVA, Strata

Page 93: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Does SplitFS reduce software overhead compared to other file systems?

How does SplitFS perform on metadata intensive workloads?

How does SplitFS perform on data intensive workloads?

�27

Page 94: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Does SplitFS reduce software overhead compared to other file systems?

How does SplitFS perform on metadata intensive workloads?

How does SplitFS perform on data intensive workloads?

�27

• < 15% overhead for metadata intensive workloads

Page 95: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

• Append 4KB data to a file • Time taken to copy user data to PM: ~700 ns

Software Overhead of SplitFS

0

2000

4000

6000

8000

10000

device SplitFS-strict Strata NOVA PMFS ext4-DAX

4150 (5x)

3021 (3x)

9002 (12x)

700

2450 (2.5x)

Tim

e (n

s)

�28

Page 96: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

• Append 4KB data to a file • Time taken to copy user data to PM: ~700 ns

Software Overhead of SplitFSTi

me

(ns)

�28

0

2000

4000

6000

8000

10000

device SplitFS-strict Strata NOVA PMFS ext4-DAX

1251 (0.8x)

4150 (5x)

3021 (3x)

700

2450 (2.5x)

9002 (12x)

Page 97: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Workloads

YCSB on LevelDB

Seq reads

Redis

Tar Git Rsync

Microbenchmarks

Data intensive

Metadata intensive

TPCC on SQLite

Rand reads

Seq writes

Rand writesAppends

�29

Page 98: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Workloads

YCSB on LevelDB

Seq reads

Redis

Tar Git Rsync

Microbenchmarks

Data intensive

Metadata intensive

TPCC on SQLite

Rand reads

Seq writes

Rand writesAppends

�29

Page 99: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

YCSB on LevelDB

Load A - 100% writesRun A - 50% reads, 50% writesRun B - 95% reads, 5% writesRun C - 100% reads

Run D - 95% reads (latest), 5% writesLoad E - 100% writesRun E - 95% range queries, 5% writesRun F - 50% reads, 50% read-modify-writes

Yahoo! Cloud Serving Benchmark - Industry standard macro-benchmarkInsert 5M keys. Run 5M operations. Key size = 16 bytes. Value size = 1K

0

0.5

1

1.5

2

2.5

Load A Run A Run B Run C Run D Load E Run E Run F

NOVASplitFS-Strict

Nor

mal

ized

thro

ughp

ut

�30

Page 100: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

0

0.5

1

1.5

2

2.5

Load A Run A Run B Run C Run D Load E Run E Run F

NOVASplitFS-Strict

Load A - 100% writesRun A - 50% reads, 50% writesRun B - 95% reads, 5% writesRun C - 100% reads

Run D - 95% reads (latest), 5% writesLoad E - 100% writesRun E - 95% range queries, 5% writesRun F - 50% reads, 50% read-modify-writes

13.3

9 ko

ps/s

32.2

4 ko

ps/s

139.

94 k

ops/

s

174.

85 k

ops/

s

191.

54 k

ops/

s

13.5

9 ko

ps/s

17.7

5 ko

ps/s

66.5

4 ko

ps/s

Yahoo! Cloud Serving Benchmark - Industry standard macro-benchmarkInsert 5M keys. Run 5M operations. Key size = 16 bytes. Value size = 1K

YCSB on LevelDBN

orm

aliz

ed th

roug

hput

�31

Page 101: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Load A - 100% writesRun A - 50% reads, 50% writesRun B - 95% reads, 5% writesRun C - 100% reads

Run D - 95% reads (latest), 5% writesLoad E - 100% writesRun E - 95% range queries, 5% writesRun F - 50% reads, 50% read-modify-writes

Yahoo! Cloud Serving Benchmark - Industry standard macro-benchmarkInsert 5M keys. Run 5M operations. Key size = 16 bytes. Value size = 1K

0

0.5

1

1.5

2

2.5

Load A Run A Run B Run C Run D Load E Run E Run F

NOVASplitFS-Strict

13.3

9 ko

ps/s

32.2

4 ko

ps/s

139.

94 k

ops/

s

174.

85 k

ops/

s

191.

54 k

ops/

s

13.5

9 ko

ps/s

17.7

5 ko

ps/s

66.5

4 ko

ps/s

YCSB on LevelDBN

orm

aliz

ed th

roug

hput

�32

Page 102: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Load A - 100% writesRun A - 50% reads, 50% writesRun B - 95% reads, 5% writesRun C - 100% reads

Run D - 95% reads (latest), 5% writesLoad E - 100% writesRun E - 95% range queries, 5% writesRun F - 50% reads, 50% read-modify-writes

Yahoo! Cloud Serving Benchmark - Industry standard macro-benchmarkInsert 5M keys. Run 5M operations. Key size = 16 bytes. Value size = 1K

Read-heavy workloads optimized because of converting reads to loads

0

0.5

1

1.5

2

2.5

Load A Run A Run B Run C Run D Load E Run E Run F

NOVASplitFS-Strict

13.3

9 ko

ps/s

32.2

4 ko

ps/s

139.

94 k

ops/

s

174.

85 k

ops/

s

191.

54 k

ops/

s

13.5

9 ko

ps/s

17.7

5 ko

ps/s

66.5

4 ko

ps/s

YCSB on LevelDBN

orm

aliz

ed th

roug

hput

�32

Page 103: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Load A - 100% writesRun A - 50% reads, 50% writesRun B - 95% reads, 5% writesRun C - 100% reads

Run D - 95% reads (latest), 5% writesLoad E - 100% writesRun E - 95% range queries, 5% writesRun F - 50% reads, 50% read-modify-writes

Yahoo! Cloud Serving Benchmark - Industry standard macro-benchmarkInsert 5M keys. Run 5M operations. Key size = 16 bytes. Value size = 1K

0

0.5

1

1.5

2

2.5

Load A Run A Run B Run C Run D Load E Run E Run F

NOVASplitFS-Strict

13.3

9 ko

ps/s

32.2

4 ko

ps/s

139.

94 k

ops/

s

174.

85 k

ops/

s

191.

54 k

ops/

s

13.5

9 ko

ps/s

17.7

5 ko

ps/s

66.5

4 ko

ps/s

YCSB on LevelDBN

orm

aliz

ed th

roug

hput

�33

Page 104: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Load A - 100% writesRun A - 50% reads, 50% writesRun B - 95% reads, 5% writesRun C - 100% reads

Run D - 95% reads (latest), 5% writesLoad E - 100% writesRun E - 95% range queries, 5% writesRun F - 50% reads, 50% read-modify-writes

Yahoo! Cloud Serving Benchmark - Industry standard macro-benchmarkInsert 5M keys. Run 5M operations. Key size = 16 bytes. Value size = 1K

Write-heavy workloads optimized because of staging and relink

0

0.5

1

1.5

2

2.5

Load A Run A Run B Run C Run D Load E Run E Run F

NOVASplitFS-Strict

13.3

9 ko

ps/s

32.2

4 ko

ps/s

139.

94 k

ops/

s

174.

85 k

ops/

s

191.

54 k

ops/

s

13.5

9 ko

ps/s

17.7

5 ko

ps/s

66.5

4 ko

ps/s

YCSB on LevelDBN

orm

aliz

ed th

roug

hput

�33

Page 105: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFS

�34

Page 106: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFSSplitFS introduces a new architecture for building PM file systems that… reduces software overhead, provides strong guarantees, and leverages the widely-used ext4-DAX

�34

Page 107: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

SplitFSSplitFS introduces a new architecture for building PM file systems that… reduces software overhead, provides strong guarantees, and leverages the widely-used ext4-DAX

https://github.com/utsaslab/splitfs�34

Page 108: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Backup Slides

�35

Page 109: SplitFS: Reducing Software Overhead in File Systems for ...vijay/papers/sosp19-splitfs-slides.pdf · SplitFS: Reducing Software Overhead in File Systems for Persistent Memory Rohan

Load A - 100% writesRun A - 50% reads, 50% writesRun B - 95% reads, 5% writesRun C - 100% reads

Run D - 95% reads (latest), 5% writesLoad E - 100% writesRun E - 95% range queries, 5% writesRun F - 50% reads, 50% read-modify-writes

Yahoo! Cloud Serving Benchmark - Industry standard macro-benchmarkInsert 5M keys. Run 1M operations. Key size = 16 bytes. Value size = 1K

YCSB on LevelDBN

orm

aliz

ed th

roug

hput

�36

0

0.5

1

1.5

2

2.5

Load A Run A Run B Run C Run D Load E Run E Run F

StrataSplitFS-Strict