
Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems


DESCRIPTION

On Tuesday, July 2, 2013, Jens Axboe is presenting “Linux Block I/O: Introducing Multi-Queue SSD access on Multi-Core Systems.” Jens, Fusion-io Fellow and maintainer of the Linux kernel block layer, will discuss achieving maximum and scalable I/O throughput for high-performance flash devices in the Linux I/O stack. The paper presents a new queuing model for storage devices that is scalable and provides the feature set that any modern flash-based device and driver needs. The new architecture achieves near-linear scaling even on complex machine topologies. Work is already in progress to convert existing Linux block drivers and SCSI low-level drivers to this model, and it is expected to land in the mainline Linux kernel in version 3.11 or 3.12. Authors of the paper are Matias Bjørling (IT University of Copenhagen), Jens Axboe and David Nellans (Fusion-io), and Philippe Bonnet (IT University of Copenhagen).


Page 1: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Matias Bjørling¹,², Jens Axboe², David Nellans², Philippe Bonnet¹

2013/03/05

¹ IT University of Copenhagen   ² Fusion-io

Page 2: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Non-Volatile Memory Devices

• Higher IOPS and lower latency than magnetic disks
• Hardware continues to improve
  – Parallel architecture
  – Future NVM technologies
• Software design decisions assume disk characteristics
  – Fast sequential access
  – Slow random access

[Figure: device timeline, 2010 to 2013 and beyond]

Page 3: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

The Block Layer in the Linux Kernel

• Provides
  – Single point of access for block devices
  – IO scheduling
  – Merging & reordering
  – Accounting

• Optimized for traditional hard drives
  – Trades CPU cycles for sequential disk accesses
  – Synchronization is a performance bottleneck


Page 4: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

In-Kernel IO Submission / Completion

[Figure: the in-kernel IO path. A process (app/FS/etc.) submits a bio into the block IO layer's request queue, where scheduling, accounting, reordering, and merging take place, before the request reaches the low-level block IO device driver and the NVM device.]

Page 5: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Block Layer IO for Disks

• Single request queue
  – Single-lock data structure
  – Cache coherence expensive on multi-CPU systems
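The scalability problem is easy to reproduce in isolation: when every core must take the same lock for each request, throughput is bounded by cache-line transfers of that lock. A minimal user-space sketch of the effect (thread and iteration counts are arbitrary assumptions, not from the slides):

```c
/* Sketch: all threads serialize on one lock, mimicking the single
 * request-queue lock. Thread/iteration counts are illustrative. */
#include <pthread.h>
#include <stdio.h>

#define THREADS 8
#define OPS_PER_THREAD 1000000

static pthread_spinlock_t queue_lock;
static long queue_depth;          /* stand-in for the shared request queue */

static void *submit_io(void *arg)
{
    (void)arg;
    for (long i = 0; i < OPS_PER_THREAD; i++) {
        pthread_spin_lock(&queue_lock);   /* every core bounces this cache line */
        queue_depth++;                    /* "insert" a request */
        queue_depth--;                    /* "dispatch" it */
        pthread_spin_unlock(&queue_lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t t[THREADS];

    pthread_spin_init(&queue_lock, PTHREAD_PROCESS_PRIVATE);
    for (int i = 0; i < THREADS; i++)
        pthread_create(&t[i], NULL, submit_io, NULL);
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);

    printf("done, queue depth = %ld\n", queue_depth);
    return 0;
}
```

Timing this with one thread and then with enough threads to span two sockets shows the kind of collapse the later slides quantify; the multi-queue design removes the shared lock from the submission path rather than trying to make it cheaper.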


Page 6: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Methodology

• Null block device driver
  – In-kernel dummy device
  – Acts as a proxy for a zero-latency device
  – Avoids DMA overhead
  – Removes driver effects from block layer measurements
  – "Device" completions can be in-line or soft-irq
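The measured workload is 512 B random reads against the null device. The sketch below reproduces that pattern from user space; the O_DIRECT flag, the device path /dev/nullb0 (the name used by the null_blk driver that later reached mainline), and the request count are assumptions, and in practice a benchmark tool such as fio drives the workload.

```c
/* Sketch of the benchmark workload: 512 B random reads with O_DIRECT.
 * Device path and request count are illustrative assumptions. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK   512
#define NBLOCKS (1UL << 20)     /* assumed device size: 512 MB */
#define NREQS   100000

int main(void)
{
    void *buf;
    int fd = open("/dev/nullb0", O_RDONLY | O_DIRECT);   /* assumed device node */
    if (fd < 0) { perror("open"); return 1; }

    if (posix_memalign(&buf, BLOCK, BLOCK))               /* O_DIRECT alignment */
        return 1;

    srand(42);
    for (long i = 0; i < NREQS; i++) {
        off_t off = (off_t)(rand() % NBLOCKS) * BLOCK;    /* random 512 B offset */
        if (pread(fd, buf, BLOCK, off) != BLOCK) {
            perror("pread");
            break;
        }
    }

    free(buf);
    close(fd);
    return 0;
}
```

The null driver's own completion behavior is selected by module parameters (in later mainline kernels, for example queue_mode and irqmode on null_blk), which is how the in-line versus soft-irq completion paths above are chosen.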

[Figure: measurement setup. A process (app/FS/etc.) submits IO through the block IO layer to the null driver, which stands in for a real NVM device.]

Page 7: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Current Block Layer (IOPS)

Null block device, noop IO scheduler, 512B random reads

[Figure: IOPS on the Sandy Bridge-E, Westmere-EP, Nehalem-EX, and Westmere-EX systems]

Page 8: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Multi-Queue Block Layer

• Balance IO workload across cores
• Reduce cache-line sharing
• Similar functionality to the single-queue layer
• Allow multiple hardware queues

[Figure: per-socket IO path. A process on each socket submits IO through libaio, the OS block layer, and the device driver.]

Page 9: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Multi-Queue Block Layer

• Two levels of queues
  – Software queues
  – Hardware dispatch queues
• Per-CPU accounting
• Tag-based completions
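The design is easier to see in a toy user-space model than in kernel code: each CPU stages requests in its own software queue, dispatch moves a request into a hardware dispatch queue slot identified by a tag, and the device completes by tag so no shared list has to be searched. Everything below (names, sizes, the CPU-to-queue mapping) is an illustrative assumption, not the kernel implementation.

```c
/* Toy model of the two-level queue design: per-CPU software queues,
 * a smaller set of hardware dispatch queues, and tag-based completion. */
#include <stdbool.h>
#include <stdio.h>

#define NCPUS  4
#define NHWQ   2
#define QDEPTH 8               /* tags per hardware queue */

struct request { int cpu; long sector; };

struct sw_queue { struct request staged[16]; int n; };          /* per CPU */
struct hw_queue { struct request inflight[QDEPTH]; bool used[QDEPTH]; };

static struct sw_queue swq[NCPUS];
static struct hw_queue hwq[NHWQ];

/* Stage a request in the submitting CPU's software queue (insertion/merge). */
static void submit(int cpu, long sector)
{
    swq[cpu].staged[swq[cpu].n++] = (struct request){ cpu, sector };
}

/* Move a staged request to a hardware dispatch queue and allocate a tag. */
static int dispatch(int cpu)
{
    struct hw_queue *h = &hwq[cpu % NHWQ];   /* CPU-to-queue mapping: next slide */
    struct sw_queue *s = &swq[cpu];

    if (s->n == 0)
        return -1;
    for (int tag = 0; tag < QDEPTH; tag++) {
        if (!h->used[tag]) {                 /* a free tag is a free device slot */
            h->used[tag] = true;
            h->inflight[tag] = s->staged[--s->n];
            return tag;                      /* driver hands the tag to the device */
        }
    }
    return -1;                               /* hardware queue full */
}

/* Completion: the device reports a tag; no shared list has to be searched. */
static void complete(int hw, int tag)
{
    struct request *rq = &hwq[hw].inflight[tag];
    printf("completed sector %ld (submitted on CPU %d)\n", rq->sector, rq->cpu);
    hwq[hw].used[tag] = false;
}

int main(void)
{
    submit(0, 100);
    submit(3, 7);
    int t0 = dispatch(0);        /* CPU 0 maps to hardware queue 0 */
    int t3 = dispatch(3);        /* CPU 3 maps to hardware queue 1 */
    complete(0, t0);
    complete(1, t3);
    return 0;
}
```

In the real block layer the software queues also perform insertion/merge, fairness, and accounting, and the tag doubles as the index the device reports on completion, which is what makes per-CPU accounting and tag-based completions cheap.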

[Figure: multi-queue block layer architecture. User-space processes issue async/sync IO into per-CPU software queues (staging: insertion/merge, fairness/IO scheduling, tagging, accounting), which feed hardware dispatch queues (submission/completion); the block IO device driver submits IO to single- or multi-queue capable hardware and receives status/completion interrupts.]

Page 10: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Mapping Strategies

• Per-core software queues, with varying numbers of hardware dispatch queues
• Single software queue, with varying numbers of hardware dispatch queues

Scalability = multiple hardware dispatch queues

[Figure: MQ scalability under both mapping strategies]
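The two strategies differ only in how submitting CPUs are mapped onto queues. A minimal sketch of that choice is below; the helper names and the even-spreading formula are illustrative assumptions standing in for the kernel's own mapping logic.

```c
/* Sketch of the two mapping strategies on this slide. Names and the
 * even-spreading formula are illustrative, not the kernel's code. */
#include <stdio.h>

/* Per-core strategy: one software queue per CPU, spread evenly across
 * the available hardware dispatch queues. */
static int map_per_core(int cpu, int nr_cpus, int nr_hw_queues)
{
    return cpu * nr_hw_queues / nr_cpus;
}

/* Single-queue strategy: every CPU shares software queue 0, so all
 * submissions funnel through one staging area regardless of how many
 * hardware dispatch queues exist. */
static int map_single_queue(int cpu)
{
    (void)cpu;
    return 0;
}

int main(void)
{
    int nr_cpus = 8, nr_hw_queues = 2;

    for (int cpu = 0; cpu < nr_cpus; cpu++)
        printf("cpu %d: sw queue %d -> hw queue %d (per-core), "
               "sw queue %d (single)\n",
               cpu, cpu, map_per_core(cpu, nr_cpus, nr_hw_queues),
               map_single_queue(cpu));
    return 0;
}
```

Either way, the slide's takeaway holds: multiple hardware dispatch queues are what make the design scale.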

Page 11: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Multi-Queue IOPS

[Figure: IOPS comparison. SQ – current block layer, Raw – block layer bypass, MQ – proposed design]

Page 12: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Multi-socket Scaling

[Figure: multi-socket scaling]

Page 13: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

AIO interface

• Alternative to the synchronous interface
• Prevents blocking of user threads
• Moves the bio list to a lockless list
• Removes legacy code and locking
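For reference, a minimal libaio sketch of the asynchronous submission path: one request is prepared, submitted with io_submit, and reaped with io_getevents, without blocking the thread in between. The device path and sizes are illustrative assumptions; link with -laio.

```c
/* Minimal libaio sketch: one asynchronous 512 B read. Device path and
 * sizes are illustrative assumptions. Build with: gcc ... -laio */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    io_context_t ctx = 0;
    struct iocb cb, *cbs[1] = { &cb };
    struct io_event ev;
    void *buf;

    int fd = open("/dev/nullb0", O_RDONLY | O_DIRECT);    /* assumed device node */
    if (fd < 0) { perror("open"); return 1; }
    if (posix_memalign(&buf, 512, 512)) return 1;

    if (io_setup(32, &ctx) < 0) {                          /* create AIO context */
        fprintf(stderr, "io_setup failed\n");
        return 1;
    }

    io_prep_pread(&cb, fd, buf, 512, 0);                   /* describe one async read */
    if (io_submit(ctx, 1, cbs) != 1) {                     /* submit without blocking */
        fprintf(stderr, "io_submit failed\n");
        return 1;
    }

    /* The thread is free to do other work here; reap the completion later. */
    if (io_getevents(ctx, 1, 1, &ev, NULL) == 1)
        printf("completed, res = %ld\n", (long)ev.res);

    io_destroy(ctx);
    free(buf);
    close(fd);
    return 0;
}
```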

[Figure: per-socket IO path, as on slide 8. A process on each socket submits IO through libaio, the OS block layer, and the device driver.]

Page 14: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Multi-Queue, 8 socket

[Figure: multi-queue scaling on the 8-socket system]

Page 15: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Latency Comparison

• Sync IO, 512 B
• Queue depth grows with CPU count

Latency (us):

1 socket          SQ      Raw     MQ
  1 core          1.5     0.8     1.2
  6 cores         4.6     1.2     1.9

2 socket          SQ      Raw     MQ
  1 core          1.9     1.0     1.4
  7 cores         15.8    2.2     3.8
  12 cores        29.4    2.1     5.4

At the highest core count, SQ latency is roughly 2.3x that of MQ on 1 socket and 5.4x on 2 sockets.

Page 16: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Latency Comparison

Latency (us):

4 socket          SQ      Raw     MQ
  1 core          2.7     1.5     2
  9 cores         37      5.1     10
  32 cores        251     34      84

8 socket          SQ      Raw     MQ
  1 core          2.9     1.6     2.3
  11 cores        386     7.9     10.5
  80 cores        1200    10      153

At the highest core count, SQ latency is roughly 2.9x that of MQ on 4 sockets and 7.8x on 8 sockets.

Page 17: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Conclusions

• Current block layer is inadequate for NVM
• Proposed Multi-Queue block layer
  – 3.5x-10x increase in IOPS
  – 10x-38x reduction in latency
  – Supports multiple hardware queues
  – Simplifies driver development
  – Maintains industry-standard interfaces


Page 18: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Thank You & Questions

Implementation available at git://git.kernel.dk/?p=linux-block.git


Page 19: Linux Block IO: Introducing Multiqueue SSD Access on Multicore Systems

Systems

• 1 Socket – Sandy Bridge-E, i7-3930K, 3.2 GHz, 12 MB L3
• 2 Socket – Westmere-EP, X5690, 3.46 GHz, 12 MB L3
• 4 Socket – Nehalem-EX, X7560, 2.66 GHz, 24 MB L3
• 8 Socket – Westmere-EX, E7-2870, 2.4 GHz, 30 MB L3
