2017 Storage Developer Conference. © CNEX Labs. All Rights Reserved.
Open-Channel SSDs Offer the Flexibility
Required by Hyperscale Infrastructure
Matias Bjørling
CNEX Labs
Public and Private Cloud Providers
Workloads and Applications
[Figure: multiple instances running on shared hardware]
Databases
Multi-Tenancy
Read or Write-Heavy
NAND Capacity Grows Bigger
Performance – Endurance – DRAM overheads
Read Latency with 0% Writes
[Figure: Random Read 4K – read latency vs. percentiles]
Read Latency with 20% Writes
[Figure: Random Read 4K + Random Write 4K – read latency vs. percentiles]
Significant outliers! Worst-case 30X: 4ms!
Requirements
1. Flexible enough for software to evolve faster than hardware
2. Strong QoS Guarantees
3. Continuous access – No maintenance windows
4. Support a broad set of applications on shared hardware
5. Rapid enablement of new NAND generations / media agnostic
6. Vendor neutrality & supply chain diversity
Open-Channel SSDs
I/O Isolation: Enable I/O isolation between tenants by allocating your SSD into separate parallel units.
Predictable Latency: No more guessing when an I/O completes. You know which parallel unit is accessed on disk.
Data Placement & I/O Scheduling: Manage your non-volatile memory as a block device, through a file-system, or inside your application.
Rebalance the Storage Interface
Expose internal lanes to host
Each lane is a parallel unit
Logical or Physical
Performance characteristics
Building Block
Similar to HDD SMR interface
Extension to NVMe
Can implement I/O Determinism, streams, and other new data management schemes
Open specification
Rebalance the Storage Interface
Expose geometry
Logical/Physical geometry
Performance characteristics
Controller functionalities
Similar to HDD SMR interface
Exposes parallel units to the host
Enables efficient access to the parallel units
Encode parallel units into the address space
# Channels, # Parallel Units, # Chunks, Chunk Size, Min. Write size, Optimal Write size, …
[Figure: LBA0 … LBAX address space divided across PU0 PU1 PU2 PU3]
Logical or Physical: up to the SSD vendor
Internals of an SSD
Host Interface: Read/Write
Flash Translation Layer: Transforms R/W/E to R/W; manages media constraints
Media Error Handling: ECC, RAID, Retention
Media Retention Management
Media Controller: Read/Write/Erase over Channel X, Channel Y
Parallel Units: Tens of parallel units!
Media timings: Read (50-100us), Write (1-10ms), Erase (3-15ms)
Move Data Placement & I/O Scheduling to the Host
[Figure: the same SSD internals – host interface, flash translation layer, media error handling, media retention management, media controller, channels, and LUNs – with the data placement and I/O scheduling responsibilities lifted out of the drive and into the host. Media timings: Read (50-100us), Write (1-10ms), Erase (3-15ms)]
Supervised Parking Lot
[Figure: parking-lot analogy – spots occupied for 1h or 24h]
Open-Channel SSD 2.0 Specification
Expose a description of parallelism within SSD
Expose #Channels, #LUNs (PUs), #Chunks, Write sizes, …
Enhanced specification from customer requests and SSD vendors
Simplified
Log-structured. Write sequentially within a chunk
Minimal write size, optimal write size.
Media-agnostic
Works on NAND as well as PCM and other non-volatile memories
Exposed through similar constructs as Zones for SMR hard-drives
Drive Model - Chunks
A chunk defines the area where writes must be sequential
Reads
Sector size granularity
Writes
Minimum write size granularity
Synchronous – May fail – An error only marks the write bad, not the full SSD
Reset Chunk (Erase)
Synchronous – May fail – An error only marks the chunk bad, not the full SSD
[Figure: a chunk spans LBAs 0 … LBA-1; the drive exposes chunks 0 … Chunk-1]
Drive Model - Organization
Parallelism exposed over
Channels (Shared bus)
LUNs – Parallel Units
Logical or Physical representation
[Figure: hierarchy – the host talks NVMe to an SSD containing channels 0 … Channel-1; each channel contains LUNs 0 … LUN-1; each LUN contains chunks 0 … Chunk-1; each chunk contains LBAs 0 … LBA-1]
LightNVM Architecture
NVMe Device Driver
Detection of OCSSD
Implements OCSSD interface
LightNVM Subsystem
Core functionality – “Partition manager”
Target management (e.g., pblk)
Sysfs integration
High-level I/O Interface
Block device using pblk
Application integration with liblightnvm
[Figure: stack – user-space application(s) and file system on top; kernel-space pblk, LightNVM Subsystem, and NVMe Device Driver below; Open-Channel SSD hardware at the bottom. Interfaces: (1) PPA addressing, (2) geometry, (3) vectored R/W/E, plus optional scalar read/write]
Host-side Flash Translation Layer - pblk
Multi-target support - I/O isolation
Fully associative L2P table (4KB mapping)
Host-side write buffer to guarantee reads and writes
Cost-based garbage collector, using valid sector count as metric
Capacity-based rate limiter, designed as a function of user and GC I/O present in the write buffer
Scan-based L2P recovery: scan metadata in closed lines and OOB on open lines
sysfs interface for statistics and FTL tuning
[Figure: pblk data path – the file system issues make_rq; the write path adds entries to a host-side write buffer and write context, drained by a write thread with error handling; the read path looks up the L2P table and serves cache hits from the buffer; a GC/rate-limiting thread runs alongside]
Raw Performance
[Figure: Read/Write Throughput – Write (MB/s) and Read (MB/s) vs. channels in use (1–16), y-axis 0–5000 MB/s]
*Not final performance numbers
Predictable Latency
4K reads during 64K concurrent writes
Consistent low latency at 99.99, 99.999, 99.9999
Demo using Open-Channel SSD
[Figure: side-by-side demo – 4K reads alongside 256K writes on an Open-Channel SSD (instance-based) vs. on a conventional solid-state drive]
Scalability
[Figure: latency (microseconds, 0–350) vs. percentiles (0–100) for mixed workloads: 3 writes + 1 read, 7 writes + 1 read, 15 writes + 1 read, 60 writes + 1 read, 120 writes + 1 read]
liblightnvm
User-space library for vectored I/O with Open-Channel SSDs
Easy-to-use interface for geometry layout and I/O accesses
CLI Interface
struct nvm_dev
struct nvm_geo
struct nvm_addr
struct nvm_vblk
…
Terrific set of CLI tools
Great tutorial available
http://lightnvm.io/liblightnvm/tutorial/
[Figure: user-space I/O apps link against liblightnvm; kernel-space LightNVM Subsystem and NVMe Device Driver sit between it and the Open-Channel SSD]
Status
Active community
Multiple drives in development by commercial SSD vendors
Multiple papers on Open-Channel SSDs
Growing software stack
LightNVM subsystem since Linux kernel 4.4.
User-space library (liblightnvm) support from Linux kernel 4.11.
pblk available from Linux kernel 4.12.
Conclusion
New storage interface that solves the software-defined storage challenges for hyperscalers
I/O Predictability
I/O Isolation
Application-level control of data placement and I/O scheduling
Draft specification is open and available for implementers
Formal standards initiative underway; target spec release 1H 2018
More information available at: http://lightnvm.io