Page 1

Fusion-io Confidential. Copyright © 2013 Fusion-io, Inc. All rights reserved.

NVM Software Interfaces New Directions

Nisha Talagala

Page 2

Evolution of Flash Adoption

February 4, 2013 SNIA NVM Summit 2

FLASH + DISK

Page 3

Evolution of Flash Adoption

FLASH + DISK

FLASH AS DISK

Page 4

Evolution of Flash Adoption

FLASH + DISK

FLASH AS DISK

FLASH AS MEMORY

Page 5

Evolution of Flash Adoption

Page 6

Conventional I/O Access

APPLICATION

Application source code

Simple Block

Network File

Simple Block

Traditional Storage

Proprietary Storage OS

Storage Media

Native Flash Translation Layer

Storage Media

Software Defined Storage

Conventional I/O access

Page 7

Direct-Access I/O through Native Interfaces

APPLICATION

Application source code

Transactional Block

Native File

Key-Value Object

Simple Block

Network File

Simple Block

Traditional Storage

Proprietary Storage OS

Storage Media

Native Flash Translation Layer

Storage Media

Software Defined Storage

Conventional I/O access

Direct access I/O

Page 8

Memory-Access through Native Interfaces

APPLICATION

Application source code

High-speed Log

Checkpointed Memory

Auto-Commit Memory™

Transactional Block

Native File

Key-Value Object

Simple Block

Network File

Simple Block

Traditional Storage

Proprietary Storage OS

Storage Media

Native Flash Translation Layer

Storage Media

Software Defined Storage

Conventional I/O access

Memory access

Direct access I/O

Page 9

New Primitives for a New Type of Media

Tape: Open, read, write, rewind, close.

Disk: Open, read, write, seek, close.

SSD: Open, read, write, seek, close.

ioMemory NVM: Open, read, write, seek, close. Plus, new primitives to exploit characteristics of non-volatile memory:

• Basic write + atomic write, conditional write.
• Basic write + TTL expiry for auto-deletion.
• Basic mmap + crash-safety, versioning.

Page 10

Atomic I/O Primitives: Sample Uses and Benefits

Databases
Transactional Atomicity: Replace various workarounds implemented in database code to provide write atomicity (double-buffered writes, etc.)

Filesystems
File Update Atomicity: Replace various workarounds implemented in filesystem code to provide file/directory update atomicity (journaling, etc.)

▸ 99% performance of raw writes: Smarter media now natively understands atomic updates, with no additional metadata overhead.
▸ 2x longer flash media life: Atomic Writes increase the life of flash media up to 2x due to reduction in write-ahead-logging and double-write buffering.
▸ 50% less code in key modules: Atomic operations dramatically reduce application logic, such as journaling, built as work-arounds.
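
For reference, one widely used application-level workaround for file update atomicity is to write a temporary file, force it to stable storage, and rename it over the original. The sketch below illustrates that general technique in Python; it is not Fusion-io code, and the function name `atomic_replace` is hypothetical. It assumes a POSIX-style file system where a rename within one directory is atomic.

```python
import os
import tempfile


def atomic_replace(path: str, data: bytes) -> None:
    """Replace the contents of `path` so readers see old or new data, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new contents to a temporary file in the same directory,
    # so the final rename stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the data to stable storage first
        os.replace(tmp_path, path)  # rename over the original in one step
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that this workaround rewrites the whole file even for a small change, which is the kind of extra write traffic and application logic the slide argues a native atomic write can remove.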

Page 11

MySQL with Atomic Writes

[Chart. Labels: Double-write disabled – Non-ACID; Double-write – ACID; Atomic writes – ACID; XFS; directFS with Atomic I/O. Callout: Twice the performance of ACID transactions.]
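
To make the double-write label concrete: with a double-write buffer each flushed page is written twice (once to the buffer, once in place), whereas an atomic write puts it on the media once. The arithmetic can be sketched as below. The 16 KiB page size and the function name `bytes_written` are illustrative assumptions, and the sketch counts only page flushes, not logs or metadata.

```python
PAGE_SIZE = 16 * 1024  # assumed database page size in bytes (illustrative)


def bytes_written(pages_flushed: int, double_write: bool) -> int:
    """Bytes sent to the device for a given number of page flushes."""
    # Double-write: one copy to the double-write buffer, one copy in place.
    # Atomic write: a single in-place copy.
    copies = 2 if double_write else 1
    return pages_flushed * PAGE_SIZE * copies


with_double_write = bytes_written(1000, double_write=True)
with_atomic_write = bytes_written(1000, double_write=False)
ratio = with_double_write / with_atomic_write
```

Under these assumptions the double-write path sends twice as many page bytes to flash, which is the mechanism behind the "up to 2x" media life claim on the previous slide.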

Page 12

Native File System Access

▸ Leverages native flash capabilities for file system acceleration
▸ File services layer
▸ Consumes and uses native primitives
▸ Exports primitives for use by applications
▸ Applications can run on either devices or filesystems

ioMemory™ with direct access I/O (stack, top to bottom): Application; User-defined I/O API Libraries; Direct-access I/O API Libraries; directFS – NVM filesystem; I/O Primitives; VSL™; Read/Write.

ioMemory™ with memory semantics (stack, top to bottom): Application; User-defined Memory API Libraries; Memory Semantics API Libraries; directFS – NVM filesystem; Memory Primitives; VSL™; Read/Write, CPU Load/Store.

Legend: Native Access

Page 13

directFS

▸ Appears as any other file system in Linux
▸ Applications can use the directFS file system unmodified, with performance benefits
▸ Focuses only on file namespace
▸ Employs the virtualized flash storage layer's logic for:
  • Large virtualized address space
  • Direct flash access
  • Crash recovery mechanisms
▸ Exports primitives through the file namespace
▸ Applications can use primitives through directFS or directly on the device

Page 14

directFS: Native File Name Space for NVM

Application

VFS (virtual file system) abstraction layer

ioMemory VSL: dynamic provisioning, block allocation, logging, etc.

ext3 btrfs xfs directFS

Kernel block layer

kernel-space

user-space

Page 15

directFS – Eliminating duplicate logic

Application

VFS (virtual file system) abstraction layer

FS: file metadata mgmt, block allocation, mapping, recycling, ACID updates, logging/journaling, crash-recovery

directFS: file metadata mgmt

ioMemory VSL: block allocation, mapping, recycling, ACID updates, logging/journaling, crash-recovery

Kernel block layer

Primitives:
• Block Availability: exists()
• ACID Updates: atomic write()
• Block Grooming: PTRIM()

kernel-space

user-space
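
The three primitives named on this slide can be pictured with a toy in-memory model. This is only a sketch of the semantics suggested by the labels (block availability, all-or-nothing update, unmapping a block); the class `ToySparseStore`, its method signatures, and the 512-byte block size are hypothetical and are not the VSL interface.

```python
from typing import Dict


class ToySparseStore:
    """In-memory stand-in for a sparse block address space (illustrative only)."""

    BLOCK_SIZE = 512  # assumed block size in bytes

    def __init__(self) -> None:
        self._blocks: Dict[int, bytes] = {}

    def exists(self, lba: int) -> bool:
        """Block availability: is this address currently mapped?"""
        return lba in self._blocks

    def atomic_write(self, updates: Dict[int, bytes]) -> None:
        """Apply every block in `updates`, or none of them."""
        for lba, data in updates.items():
            if lba < 0 or len(data) != self.BLOCK_SIZE:
                raise ValueError(f"invalid block at LBA {lba}")
        # Every block validated: publish them together.
        self._blocks.update(updates)

    def ptrim(self, lba: int) -> None:
        """Block grooming: unmap the address so its space can be reclaimed."""
        self._blocks.pop(lba, None)

    def read(self, lba: int) -> bytes:
        """Return the data stored at a mapped address."""
        return self._blocks[lba]
```

A file system built on such a layer only needs to track which addresses belong to which file; allocation, mapping, and crash consistency stay below it, which is the division of labour the slide describes.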

Page 16

Performance Parity Between Block and directFS: Micro-benchmarks

[Chart, 2 threads. X-axis: I/O size. Y-axis: bandwidth (MiB/s). Series: Seq Read, Block; Seq Read, directFS; Rand Read, Block; Rand Read, directFS; Seq Write, Block; Seq Write, directFS; Rand Write, Block; Rand Write, directFS.]

Page 17

Performance Parity Between Block and directFS: Atomic Writes

[Two charts: 1 thread and 8 threads. X-axis: I/O size. Y-axis: bandwidth (MiB/s). Series: Block; directFS.]

Page 18

MySQL on directFS with Atomic Writes

[Chart. Labels: Double-write disabled – Non-ACID; Double-write – ACID; Atomic writes – ACID; XFS; directFS with Atomic I/O. Callout: Twice the performance of ACID transactions.]

Page 19

Key-Value Store API Library: Sample Uses and Benefits

NoSQL Applications
• Reduce over-provisioning caused by lack of coordination between two layers of garbage collection (application layer and flash layer). Some top NoSQL applications recommend over-provisioning by 3x because of this.
• Reduce application I/Os through batched put and get operations.
• Increase performance by eliminating packing and unpacking of blocks, defragmentation, and duplicate metadata at the app layer.

▸ 95% performance of raw device: Smarter media now natively understands a key-value I/O interface with lock-free updates, crash recovery, and no additional metadata overhead.
▸ Up to 3x capacity increase: Dramatically reduces over-provisioning with coordinated garbage collection and automated key expiry.
▸ 3x throughput on same SSD: Early benchmarks comparing against memcached with BerkeleyDB persistence show up to 3x improvement.
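
The batched put and automated key expiry mentioned above can be illustrated with a small in-memory model. This is a sketch of the semantics only: `ToyKVStore`, `batch_put`, and the `ttl` parameter are hypothetical names, not the key-value library's API, and a real implementation would handle expiry inside the flash layer rather than lazily on read.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple


class ToyKVStore:
    """In-memory key-value store with per-key expiry (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[bytes, Tuple[bytes, Optional[float]]] = {}

    def put(self, key: bytes, value: bytes, ttl: Optional[float] = None) -> None:
        """Store a value; `ttl` is a lifetime in seconds, or None for no expiry."""
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def batch_put(self, items: Iterable[Tuple[bytes, bytes]],
                  ttl: Optional[float] = None) -> None:
        """Store several pairs in one call, so the caller issues one request."""
        for key, value in items:
            self.put(key, value, ttl)

    def get(self, key: bytes) -> Optional[bytes]:
        """Return the value, or None if the key is missing or has expired."""
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # expired: delete lazily on access
            return None
        return value
```

Passing the clock in as a parameter keeps the expiry logic testable without sleeping.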

Page 20

Key-Value Store API Library Benchmarks

Page 21

Range of Memory-Access Semantics

Extended Memory (volatile)
Transparently extends DRAM onto flash, extending application virtual memory.

Checkpointed Memory (volatile with non-volatile checkpoints)
Region of application virtual memory with the ability to preserve snapshots to the flash namespace.

Auto-Commit Memory™ (non-volatile)
Region of application memory automatically persisted to non-volatile memory and recoverable after system failure.

Page 22

OS Swap vs. Extended Memory

OS Swap Mechanism (System Memory and Non-Volatile Storage: disks, SSDs, etc.)
▸ Originally designed as a last resort to prevent OOM (out-of-memory) failures
▸ Never tuned for high-performance demand-paging
▸ Never tuned for multi-threaded apps
▸ Poor performance, e.g. < 30 MB/sec throughput

Extended Memory Mechanism (System Memory and ioMemory)
▸ No application code changes required
▸ Designed to migrate hot pages to DRAM and cold pages to ioMemory
▸ Tuned to run natively on flash (leverages native characteristics)
▸ Tuned for multi-threaded apps
▸ 10-15x throughput improvement over standard OS Swap
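
The idea of keeping hot pages in DRAM and cold pages on flash can be sketched with a two-tier toy model. The slides do not say which replacement policy is used; least-recently-used eviction is an assumption made here for illustration, and the class `ToyTieredMemory` is hypothetical.

```python
from collections import OrderedDict
from typing import Dict


class ToyTieredMemory:
    """Keeps recently used pages in a small 'DRAM' tier and the rest in a 'flash' tier."""

    def __init__(self, dram_pages: int) -> None:
        if dram_pages < 1:
            raise ValueError("dram_pages must be at least 1")
        self.dram_pages = dram_pages
        self.dram: "OrderedDict[int, bytes]" = OrderedDict()  # oldest entry first
        self.flash: Dict[int, bytes] = {}

    def write(self, page: int, data: bytes) -> None:
        """Store a page; a written page counts as hot."""
        self.flash.pop(page, None)
        self.dram[page] = data
        self.dram.move_to_end(page)
        self._evict()

    def read(self, page: int) -> bytes:
        """Return a page, promoting it from the flash tier if needed."""
        if page not in self.dram:
            # Analogous to a page fault: bring the page back into DRAM.
            self.dram[page] = self.flash.pop(page)
        self.dram.move_to_end(page)
        self._evict()
        return self.dram[page]

    def _evict(self) -> None:
        """Demote the least recently used pages once the DRAM tier is over capacity."""
        while len(self.dram) > self.dram_pages:
            cold_page, data = self.dram.popitem(last=False)
            self.flash[cold_page] = data
```

The application only calls `read` and `write`; which tier a page lives in is invisible to it, matching the "no application code changes" point above.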

Page 23

Comparing I/O and Memory Access Semantics

I/O

I/O semantics examples:
• Open file descriptor – open(), read(), write(), seek(), close()
• (New) Write multiple data blocks atomically, nvm_vectored_write()
• (New) Open key-value store – nvm_kv_open(), kv_put(), kv_get(), kv_batch_*()

Memory Access (Volatile)

Volatile memory semantics example:
• Allocate virtual memory, e.g. malloc()
• memcpy/pointer dereference writes (or reads) to memory address
• (Improved) Page-faulting transparently loads data from NVM into memory

Memory Access (Non-Volatile)

Non-volatile memory semantics example:
• (New) Allocate and map Auto-Commit Memory™ (ACM) virtual memory pages
• memcpy/pointer dereference writes (or reads) to memory address
• (New) Call checkpoint() to create application-consistent ACM page snapshots
• (New) After system failure, remap ACM snapshot pages to recover memory state
• (New) De-stage completed ACM pages to NVM namespace
• (New) Remap and access ACM pages from NVM namespace at any time
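
The map, store, checkpoint, and remap sequence above can be approximated with an ordinary file-backed memory mapping. The sketch below uses Python's standard `mmap` module, with `flush()` standing in for the checkpoint step. It is an analogy only: it is not the ACM interface, the helper names are hypothetical, and a plain file mapping does not give the automatic persistence that Auto-Commit Memory is described as providing.

```python
import mmap
import os
import tempfile

REGION_SIZE = mmap.PAGESIZE  # one page, for illustration


def create_region(path: str) -> None:
    """Create a zero-filled backing file for the mapped region."""
    with open(path, "wb") as f:
        f.truncate(REGION_SIZE)


def write_and_checkpoint(path: str, offset: int, payload: bytes) -> None:
    """Map the region, store into it like memory, then flush it to the file."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), REGION_SIZE) as region:
        region[offset:offset + len(payload)] = payload  # store through the mapping
        region.flush()  # push dirty pages to the backing file


def recover(path: str, offset: int, length: int) -> bytes:
    """Remap the region, as after a restart, and read the saved state."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), REGION_SIZE, access=mmap.ACCESS_READ) as region:
        return region[offset:offset + length]


# Example: save a small piece of state, then read it back through a fresh mapping.
region_path = os.path.join(tempfile.mkdtemp(), "region.bin")
create_region(region_path)
write_and_checkpoint(region_path, 16, b"state-v1")
recovered = recover(region_path, 16, 8)
```

The application touches the region with ordinary slice assignment, the analogue of memcpy or a pointer store, and persistence is a separate, explicit step.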

Page 24

Application Use of Memory-Access Semantics

[Diagram comparing five host software stacks: Legacy SSDs; ioMemory as ioDrive; ioMemory as Transparent Cache; ioMemory with direct access I/O; ioMemory with memory semantics. Labelled components: Application; Open Source Extensions; OS Block I/O; direct I/O Primitives; direct Key-Value Store API; directCache API; Extended Memory; Checkpointed Memory; Auto-Commit Memory; Host; File System; directFS (native file system service); Block Layer; SAS/SATA; Network; Remote; RAID Controller; directCache; VSL (Virtual Storage Layer); Flash Layer. Device access: Read/Write and Load/Store.]

1. Application source uses memory programming semantics, such as malloc(), free(), pointer ops, etc.

2. Stack can exhibit different properties ranging from purely volatile (DRAM extension), to intermediate points of persistence (checkpoints), to fine grained persistence (ACM)

3. Underlying technology can be block oriented or support direct CPU load/store operations

4. Integrates with existing storage namespaces

Page 25

Open Interfaces and Open Source

• Primitives: Open Interface

• directFS: Open Source

• API Libraries: Open Source, Open Interface

• Application modifications: Open Source

• INCITS SCSI (T10) active standards proposals:
  ▸ SBC-4 SPC-5 Atomic-Write: http://www.t10.org/cgi-bin/ac.pl?t=d&f=11-229r6.pdf
  ▸ SBC-4 SPC-5 Scattered writes, optionally atomic: http://www.t10.org/cgi-bin/ac.pl?t=d&f=12-086r3.pdf
  ▸ SBC-4 SPC-5 Gathered reads, optionally atomic: http://www.t10.org/cgi-bin/ac.pl?t=d&f=12-087r3.pdf

• SNIA NVM-Programming TWG active member

Page 26

fusionio.com | REDEFINE WHAT’S POSSIBLE

QUESTIONS?

Page 27

fusionio.com | REDEFINE WHAT’S POSSIBLE

THANK YOU