51
Fusion-io Confidential—Copyright © 2013 Fusion-io, Inc. All rights reserved. Fusion-io Confidential—Copyright © 2013 Fusion-io, Inc. All rights reserved. Running NoSQL natively on flash Thomas Rochner [email protected]

NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

  • Upload
    others

  • View
    17

  • Download
    0

Embed Size (px)

Citation preview

Page 1: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Fusion-io Confidential—Copyright © 2013 Fusion-io, Inc. All rights reserved. Fusion-io Confidential—Copyright © 2013 Fusion-io, Inc. All rights reserved.

Running NoSQL natively on flash Thomas Rochner

[email protected]

Page 2: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Topics – NoSQL Munich 2013

1.  What are we building ?

2.  Why are we building it?

3.  ioMemory SDK

4.  KV-API

5.  Direct FS

6.  Memory Access Semantics

7.  Where are we headed?

April 17, 2013 2

Page 3: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

⌃ 2006

⌃ 2007

⌃ 2008

⌃ 2009

⌃ 2010

⌃ 2011

⌃ 2012

⌃ 2013

Mission to consolidate memory and storage

ioMemory technology unveiled

First products launched

1 million IOPS IBM Quicksilver

Dell strategic investment

HP OEMs products

IBM OEMs products

Samsung strategic investment

Dell OEMs products

VSL introduced

IPO on NYSE ioTurbine acquired

ioDrive2 announced

Supermicro OEMs products

1 Billion IOPS 2,500 customers

>120 channel and alliance partners ioFX announced

ioMemory SDK introduced Cisco OEMs products

ioScale announced at Open Compute Summit

Fusion-io First Mover Milestones

April 17, 2013 3

Page 4: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Fusion-io Accelerates

April 17, 2013 4

Analytics Search

ORACLE Text

Messaging

MQ

Databases

INFORMIX

Virtualization

KVM

HPC

GPFS

Big Data

Security/Logging

Collaboration

Lotus

Development Web

LAMP

Caching Workstation

Page 5: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

What is Fusion-io?

▸  A New Memory tier called ioMemory •  Leverages the best advantages of DRAM and rotating drives

▸  High Speed near like DRAM ▸  Persistence and Large capacity of Spinning Hard Drives

▸  PCIe based NAND Flash storage ▸  Micro-second level Disk Access Latency - 15µs ▸  Very high data throughput - 1,5GB/s ▸  Very high IOPS – 535.000 – 800.000 random write/s ▸  Scalable – stay ahead of data / performance demands ▸  Advanced wear-leveling algorithm ▸  N+1 Chip level redundancy (think RAID protection on card)100% data integrity

protection in case of power loss ▸  Endurance is PBW – TB’s written daily for more than 8 years!

Page 6: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Up to 3 TB of capacity per x4 PCI Express slot

Up to 2.4TB of capacity per x8 PCI Express slot

Up to 10.24TB to maximize performance for large data sets

1650 GB of workstation acceleration

Up to 1.2TB for maximum performance density

Up to 3.2 of Hyperscale Acceleration

Direct Acceleration

April 17, 2013 6

MEZZANINE

Page 7: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Topics – NoSQL Munich 2013

1.  What are we building ?

2.  Why are we building it?

3.  ioMemory SDK

4.  KV-API

5.  Direct FS

6.  Memory Access Semantics

7.  Where are we headed?

April 17, 2013 7

Page 8: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

April 17, 2013 8

L1-L3 Cache 10 ns

DRAM 100 ns

Fusion-io 15 µs

SAN 4 ms

Blink of an eye 1/10 second

Get Coffee 2,5 minutes

Fly to Australia 11,1 hours

Heartbeat 1 second

Very Small - Kb Very expensive Volatile Non volatile

Multiplier is 10m - 10.000.000

The landscape of sub second Timings. How fast do you get data to the factory?

Page 9: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

37% of servers are massively underutilized¹…

SLOW STORAGE LEADS TO IDLE CAPACITY

According to Moore’s Law, processing performance doubles every 18 months

1 Source: IDC's Server Workloads 2010, July 2010 2 Source: Taming the Power Hungry Data Center, Fusion-io White Paper

Server is idle 80% of the time

CPUs

Storage Rel

ativ

e Pe

rfor

man

ce

…because the performance gap continues to grow

2000 2005 1985 1990 1995 2010

Growing performance

gap²

April 17, 2013 © 2012 Fusion-io, Inc. All rights reserved. 9

Page 10: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

SPINNING MEDIA OVER 150 YEARS OLD

Page 11: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

SSD treats memory like disk

© Fusion-io

Page 12: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

PC

Ie

DRAM

Host CPU

App OS

FLASH AS DISK FLASH AS MEMORY

Flash Architectures

PC

Ie

SA

S

DRAM

Data path Controller

NAND

Host CPU

RAID Controller

App OS

SC

12 April 17, 2013

Page 13: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Cut-through Architecture and VSL

April 17, 2013 13

▸ Sophisticated architecture •  maximum performance

▸  Intelligent software •  advanced features

Kernel

File System

Virtual Storage Layer (VSL)

ioMemory

Applications/Databases PCIe

DRAM / Memory / Operating System and Application Memory io

Mem

ory

Virt

ualiz

atio

n Ta

bles

Channels Wide

Ban

ks

ioDrive ioMemory Data-Path Controller

Commands

Host

Virtual Storage Layer (VSL)

DA

TA

TR

AN

SF

ER

S

CPU and cores

Page 14: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Balanced Performance Affects Throughput

SSD

ioScale

Queuing behind slow writes causes SSD latency spikes

ioMemory balances read/write performance for consistent throughput

14 April 17, 2013

Page 15: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Topics – NoSQL Munich 2013

1.  What are we building ?

2.  Why are we building it?

3.  ioMemory SDK

4.  KV-API

5.  Direct FS

6.  Memory Access Semantics

7.  Where are we headed?

April 17, 2013 15

Page 16: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

NoSQL Software challenges

•  Keeping NoSQL software simplicity with data persistence

•  Transforming in-memory structures to block I/O

•  Tiering data between DRAM and persistent storage

•  Keeping latency low with data persistence

•  Scaling up

April 17, 2013 16

Page 17: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

ioMemory Software Development Kit

•  Native programming interfaces

•  Access flash as a memory

•  Eliminate legacy software layers

•  Simplify application authoring

•  Accelerate time-to-market April 17, 2013 © 2012 Fusion-io, Inc. All rights reserved. 17

Page 18: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

NVM Software interfaces ▸  Industry-first, direct API access to non-volatile memory’s unique

characteristics.

▸  The ioMemory SDK was introduced to help developers:

•  Write less code to create high-performing apps

•  Tap into performance not available with conventional I/O access to SSDs

•  Reduce operating costs by decreasing RAM while increasing NVM

April 17, 2013 18

Page 19: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Direct-Access to non-volatile memory is now emerging

▸ Developers are beginning to manipulate data

directly in Non-Volatile Memory (NVM)

without converting to basic block I/O.

April 17, 2013 19

Page 20: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Conventional I/O access APPLICATION

Application source code translates native data structures into block I/O

Block I/O

Network File

Block I/O

April 17, 2013 20

Traditional Storage

Proprietary Storage OS

Storage Media

Open Interface Layer

Storage Media

Software Defined Storage

Conventional I/O access

Page 21: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

native nvm access: I/O APPLICATION

Application source code

Application source code does I/O with native data structures

Atomic I/O Transaction

User-Defined Object

Key-Value Object

Block I/O

Network File

Block I/O

April 17, 2013 21

Traditional Storage

Proprietary Storage OS

Storage Media

Open Interface Layer

Storage Media

Software Defined Storage

Conventional I/O access

Direct access I/O

Page 22: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

native nvm access: persistent memory APPLICATION

Application source code Application source code

Application source code manipulates data structures natively in NV Memory

High-speed Log

Checkpointed Memory

Memory Transaction Atomic I/O

Transaction User-

Defined Object

Key-Value Object

Block I/O

Network File

Block I/O

April 17, 2013 22

Traditional Storage

Proprietary Storage OS

Storage Media

Open Interface Layer

Storage Media

Software Defined Storage

Conventional I/O access

NV Memory access

Direct access I/O

Page 23: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Tradi&onal*SSDs*

ioMemory™*with*Conven&onal*I/O*

ioMemory™*as*Transparent*Cache*

ioMemory™*with**direct*access*I/O*

ioMemory™*with**memory*seman&cs**

Applica&on*

Applica&on*

Applica&on*

Applica&on* Applica&on* Applica&on* Applica&on*

User?defined*I/O*API*Libraries*

User?defined*Memory*API*Libraries*

OS*Block*I/O* OS*Block*I/O* OS*Block*I/O* Direct?access*I/O*API*Libraries*

Memory*Seman&cs**API*Libraries*

Host*

Host*

File*System* File*System* File*System*directFS*–**

NVM*filesystem*directFS*–*

NVM*filesystem*

***Block*Layer* Block*Layer* Block*Layer* I/O*Primi&ves* Memory*Primi&ves*

SAS/SATA*VSL™**

expanded*flash*transla&on*layer*

directCache™*

VSL™* VSL™*Network*

VSL™*

Remote*

RAID*Controller*

Read/Write* Read/Write* Read/Write* Read/Write* CPU*Load/Store*Flash*

Transla&on*Layer*

Read/Write*

Na&ve*Access*|*

23 April 17, 2013

Flash memory evolution

Page 24: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Topics – NoSQL Munich 2013

1.  What are we building ?

2.  Why are we building it?

3.  ioMemory SDK

4.  KV-API

5.  Direct FS

6.  Memory Access Semantics

7.  Where are we headed?

April 17, 2013 24

Page 25: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Example: Key-Value Store API Library

April 17, 2013 25

key value 1-128B 64b-1MB

kv_put() kv_get() or kv_batch_get()

key expiration timer marks KV pair for VSL garbage collection

key hashed into sparse address

space to simplify collision

management

value returned through single I/O operation, regardless of

value size

key value key value key value

pool A pool B

pool C

kv_get_current(), kv_next()

Iterate through each KV pair in

a pool of related keys

Atomic transaction envelope

Application issues call to Key-Value Store API

key

value

Virtual Storage Layer

Page 26: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Key-Value Store API Library Benchmarks (native KV Get/Put vs. raw reads/writes)

0"

20000"

40000"

60000"

80000"

100000"

120000"

140000"

0" 20" 40" 60" 80" 100" 120" 140"

GETs/s&

Threads&

Sample&Performance&5&GET&

512B"

4KB"

16KB"

64KB"

0"

20000"

40000"

60000"

80000"

100000"

120000"

0" 20" 40" 60" 80" 100" 120" 140"

PUTs/

s&

Threads&

Sample&Performance&4&PUT&

512B"

4KB"

16KB"

64KB"

0"

20000"

40000"

60000"

80000"

100000"

120000"

140000"

0" 20" 40" 60" 80"

OPS/s

&

Threads&

Performane&rela2ve&to&ioDrive&

512B"Key"GET"

1KB"FIO"READ"

0"

20000"

40000"

60000"

80000"

100000"

120000"

0" 10" 20" 30" 40" 50" 60" 70"

OPS/s

&

Threads&

Performance&rela3ve&to&ioDrive&

512B"Key"PUT"

1K2FIO"WRITE"

Sample Performance - GET

Performance relative to ioDrive

Sample Performance - PUT

Performance relative to ioDrive

1U HP blade server with 16 GB RAM, 8 CPU cores - Intel(R) Xeon(R) CPU X5472 @ 3.00GHz with single 1.2 TB ioDrive2 mono

SIGNIF ICANTLY MORE FUNCTIONALITY WITH NEGLIGIBLE PERFORMANCE COST

April 17, 2013 26

Page 27: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Key-Value Store API library Benchmarks: (vs memcachedb)

April 17, 2013 27

Page 28: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Membase ioDrive vs. HDD

April 17, 2013 28

Page 29: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

MongoDB cache warmup

April 17, 2013 29

Page 30: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Key-value store API Library: Sample Uses and Benefits

April 17, 2013 30

▸ NoSQL Applications Increase performance by eliminating packing and unpacking blocks, defragmentation, and duplicate metadata at app layer. Reduce application I/O through batched put and get operations. Reduce overprovisioning due to lack of coordination between two-layers of garbage collection (application-layer and flash-layer). Some top NoSQL applications recommend over-provisioning by 3x due to this.

•  95% performance of raw device Smarter media now natively understands a key-value I/O interface with lock-free updates, crash recovery, and no additional metadata overhead.

•  Up to 3x capacity increase Dramatically reduces over-provisioning with coordinated garbage collection and automated key expiry.

•  3x throughput on same SSD Early benchmarks comparing against memcached with BerkeleyDB persistence show up to 3x improvement.

Page 31: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Topics – NoSQL Munich 2013

1.  What are we building ?

2.  Why are we building it?

3.  ioMemory SDK

4.  KV-API

5.  Direct FS

6.  Memory Access Semantics

7.  Where are we headed?

April 17, 2013 31

Page 32: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

directFS – Eliminating duplicate logic

April 17, 2013 32

Linux VFS

Kernel Block Layer

Device Driver

Sector Mapping

Met

a da

ta

ext / btrfs / xfs DirectFS

Driver Primitives

Meta data

Page 33: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

DIRECTFS – Benefits in Eliminating Duplicate logic

April 17, 2013 33

File System Lines of Code

directFS 6879

ReiserFS 19996

ext4 25837

btrfs 51925

XFS 63230

Page 34: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

directFS: Native Flash Filesystem

▸  File system convenience

▸  Raw block performance

▸  No compromise necessary

Ban

dwid

th (M

iB/s

) I/O Size

Raw Block directFS

8 Threads

34 April 17, 2013

Page 35: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

directFS: Consistent Low Latency

Sample

Late

ncy

(µs)

Fusion-io DFS vs EXT4 write latency 1000 512 Byte Sequential Write with O_DIRECT

DFS EXT4

35 April 17, 2013

Page 36: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

MySQL: directFS and Atomic Writes

Double-write disabled – Non-ACID

Double-write – ACID

XFS directFS with Atomic I/O

50% More ACID transactions

Atomic writes – ACID

36 April 17, 2013

Page 37: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Case Study: Percona SERVER (MySQL) ▸ Percona has added atomics support to Percona

Server 5.5 •  Removes the need of the MySQL double write buffer

•  Ensures data integrity in case of system crashes

•  Writes 50% less, great for flash

•  Removes complexity from the software stack

•  Improves both transaction bandwidth and latency

•  Works though the DirectFS filesystem or on RAW devices

April 17, 2013 37

Page 38: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Topics – NoSQL Munich 2013

1.  What are we building ?

2.  Why are we building it?

3.  ioMemory SDK

4.  KV-API

5.  Direct FS

6.  Memory Access Semantics

7.  Where are we headed?

April 17, 2013 38

Page 39: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Range of memory-Access Semantics

April 17, 2013 39

Extended Memory Volatile

Transparently extends DRAM onto flash, extending application virtual memory

Checkpointed Memory

Volatile with non-volatile checkpoints

Region of application virtual memory with ability to preserve snapshots to flash namespace

Auto-Commit Memory™ Non-volatile

Region of application memory automatically persisted to non-volatile memory and recoverable post-system failure

Page 40: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

OS Swap vs. Extended Memory

•  Originally designed as a last resort to prevent OOM (out-of-memory) failures •  Never tuned for high-performance demand-paging •  Never tuned for multi-threaded apps •  Poor performance, ex. < 30 MB/sec throughput

•  No application code changes required •  Designed to migrate hot pages to DRAM and cold pages to ioMemory •  Tuned to run natively on flash (leverages native characteristics) •  Tuned for multi-threaded apps •  10-15x throughput improvement over standard OS Swap

April 17, 2013 40

Non-Volatile Storage (Disks, SSDs, etc.)

System Memory

OS SWAP Mechanism

NV Memory (volatile usage)

System Memory

Extended Memory Mechanism

Page 41: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Virtual Storage Layer

Application issues calls to transactional memory interface

d d d d

Example: Memory transaction primitives

April 17, 2013 41

1. Allocate pages app virtual address space

CPU loads/stores data to ACMTM

pages

Process allocate ACMTM pages and maps to virtual address space

2. memcpy() Process writes and reads to pages through standard memory-access semantics

3. Checkpoint pages

d d d d

d d d d

System

failure event

5. Restore state app virtual address space

Re-started process maps restored snapshot pages into virtual address space

Application-consistent snapshot pages are created or updated

working pages

snapshot pages

d d d d

d d d d

d d d d

System failure automatically preserves snapshot pages

d d d d 4. Destage pages

d d d d Completed

pages

d d d d Pages persisted

to flash namespace

Page 42: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Comparing i/o and memory access semantics

April 17, 2013 42

I/O

I/O semantics examples: •  Open file descriptor – open(), read(), write(), seek(), close() •  (New) Write multiple data blocks atomically, nvm_vectored_write() •  (New) Open key-value store – nvm_kv_open(), kv_put(), kv_get(),

kv_batch_*()

Memory Access

(Volatile)

Volatile memory semantics example: •  Allocate virtual memory, e.g. malloc() •  memcpy/pointer dereference writes (or reads) to memory address •  (Improved) Page-faulting transparently loads data from NVM into memory

Memory Access

(Non-

Volatile)

Non-volatile memory semantics example: •  (New) Allocate and map Auto-Commit Memory™ (ACM) virtual memory pages •  memcpy/pointer dereference writes (or reads) to memory address •  (New) Call checkpoint() to create application-consistent ACM page snapshots •  (New) After system failure, remap ACM snapshot pages to recover memory

state •  (New) De-stage completed ACM pages to NVM namespace •  (New) Remap and access ACM pages from NVM namespace at any time

Page 43: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Native NVM Access |

43 April 17, 2013

Flash memory evolution Traditional SSDs ioMemory™ with

Conventional I/O ioMemory™ as

Transparent Cache ioMemory™ with direct access I/O

ioMemory™ with memory semantics

AP

PL

ICA

TIO

N

Application

AP

PL

ICA

TIO

N

Application Application

Application Application

User-defined I/O API

Libraries User-defined Memory

API Libraries

OS Block I/O OS Block I/O OS Block I/O Direct-access I/O API Libraries

Memory Semantics API Libraries

HO

ST

File System

HO

ST

File System File System directFS – NVM filesystem

directFS – NVM filesystem

Block Layer Block Layer Block Layer I/O Primitives Memory Primitives

SAS/SATA VSL™ expanded flash translation layer

directCache™ VSL™ VSL™

Network VSL™

RE

MO

TE

RAID Controller

Read/Write Read/Write Read/Write Read/Write CPU Load/Store Flash Translation

Layer

Read/Write

Page 44: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Open NVM development Interfaces

April 17, 2013 44

I/O Semantics (explicit persistence) Memory Semantics (implicit persistence)

API Libraries: Open Source

Open Interface

directKV-s (key-value store)

directKV-c (key-value cache)

User-Defined Libraries

memLOG (high-speed logging)

memSnap (Checkpointed

Memory)

POSIX Filesystem: Open Source

directFS (POSIX filesystem namespace implemented with NVM Primitives)

NVM Primitives: Open Interface

Basic Block I/O Primitives

Atomic I/O Transaction Primitives

Sparse-address

Primitives

Clone/ Merge

Primitives

Cache Eviction

Primitives

Memory Transaction Primitives

Virtual Partition Primitives (partition creation, striping, and mirroring)

NVM or Flash Translation Layer

Page 45: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Topics – NoSQL Munich 2013

1.  What are we building ?

2.  Why are we building it?

3.  ioMemory SDK

4.  KV-API

5.  Direct FS

6.  Memory Access Semantics

7.  Where are we headed?

April 17, 2013 45

Page 46: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

API Specs posted at developer.fusionio.com

▸  Early-access to ioMemory SDK API specs and technical documentation (limited enrollment during early-access phase) http://developer.fusionio.com

▸ 

•  Write less code to create high-performing apps

•  Tap into performance not available with conventional I/O access to SSDs

•  Reduce operating costs by decreasing RAM while increasing NVM

April 17, 2013 46

Direct-access to NVM is for developers whose software retrieves and stores data.

Page 47: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Open Interfaces and Open Source •  NVM Primitives: Open Interface

•  directFS: Open Source, POSIX Interface

•  NVM API Libraries: Open Source, Open Interface

•  INCITS SCSI (T10) active standards proposals:

▸  SBC-4 SPC-5 Atomic-Write http://www.t10.org/cgi-bin/ac.pl?t=d&f=11-229r6.pdf

▸  SBC-4 SPC-5 Scattered writes, optionally atomic http://www.t10.org/cgi-bin/ac.pl?t=d&f=12-086r3.pdf

▸  SBC-4 SPC-5 Gathered reads, optionally atomic http://www.t10.org/cgi-bin/ac.pl?t=d&f=12-087r3.pdf

•  SNIA NVM-Programming TWG active member

47 April 17, 2013

Page 48: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Catalyst for top industry players to Accelerate pursuit of NVM programming

April 17, 2013 48

Page 49: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

…And Resonating through the Industry

April 17, 2013 49

Page 50: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

Comprehensive Customer Success

April 17, 2013 50 30+ case studies at http://fusionio.com/casestudies

F I N A N C I A L S M A N U F A C T U R I N G / G O V E R N M E N T

W E B T E C H N O L O G Y R E T A I L

FASTER DATA WAREHOUSE QUERIES 40x QUERY

PROCESSING THROUGHPUT 15x

The world’s leading Q&A site

®

FASTER DATABASE REPLICATION 30x FASTER DATA

ANALYSIS 5x FASTER QUERIES 15x

Page 51: NoSQL Munich 2013nosqlroadshow.com/dl/NoSQL-Munich-2013/fusion-io munich... · 2013-04-17 · Topics – NoSQL Munich 2013 1. What are we building ? 2. Why are we building it? 3

f u s i o n i o . c o m | R E D E F I N E W H A T ’ S P O S S I B L E

T H A N K Y O U !