Persistent Memory and Media Errors Vishal Verma [email protected] Vault 2016


Persistent Memory and Media Errors

Vishal Verma

[email protected]

Vault 2016


…Or

How to have your Poison and (not) consume it too


NVDIMM software stack

[Diagram: User space: applications using standard raw device access, the standard file API, and load/store. Kernel space: file system, persistent memory aware file system, libnvdimm drivers, MMU mappings. Paths to the NVDIMM: regular block I/O, and DAX (cache line I/O).]


What is poison

Persistent Memory == Persistent Poison

What we (storage people) would like

What needs to be done in Linux

How we did it


What is Poison


What is Poison

• Bad cell in memory
  – Transient or hard/uncorrectable error
• How platforms deal with it
  – Machine Check Exception
    ● Recoverable on high-RAS platforms (page is sequestered, app gets SIGBUS)
    ● OS crash on other platforms
  – If transient, rebooting typically makes the poison page go away
  – If permanently degraded cell, replace DIMM




What is Persistent Poison

• Bad cell in Persistent memory
  – Will not go away on reboot
  – Without any changes in Linux:
    ● Trying to write to it will trigger a machine check
    ● Deleting the file won't help either
  – A bad cache line is now a bad filesystem block
  – * Implies data has been lost *
    ● Drivers/firmware can easily 'fix' the bad location, but it is imperative to let userspace know of the data loss.
  – The 'fix' has to be triggered by the user/app


What is Persistent Poison

[Diagram: DAX path. User space: applications using load/store and the standard file API. Kernel space: persistent memory aware file system, MMU mappings, libnvdimm driver, cache line I/O to the NVDIMM. mcheck handler: unmap, notify, crash.]

• Behavior we want to prevent
  – Application calls read()
  – Hits poison
  – Crashes/gets SIGBUS
  – (reboots)
  – App starts up
  – Tries to access its data (read())
  – Crash
  – …
  – “Reebootus-infinitus”




What we would like

• Instead of a SIGBUS/crash, return -EIO

• Way to expose known poison to Software

• Ability for Software to clear poison
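
To illustrate the first wish above from the application's side: if a read of a poisoned range fails with EIO rather than killing the process, the caller can catch the error and decide what to do next. This is only a minimal sketch. The helper name `read_or_report` and its injectable `reader` parameter are assumptions made for illustration, not part of any kernel or library interface; `os.pread` and `errno.EIO` are standard Python.

```python
import errno
import os


def read_or_report(fd, length, offset, reader=os.pread):
    """Read `length` bytes at `offset`; return (data, None) or (None, offset).

    `reader` defaults to os.pread. It is injectable only so this sketch can be
    exercised without real poisoned media (an assumption for illustration).
    """
    try:
        return reader(fd, length, offset), None
    except OSError as exc:
        if exc.errno == errno.EIO:
            # A media error was reported as an error code instead of a
            # SIGBUS: the process survives and knows which offset is bad.
            return None, offset
        raise  # any other failure is not a media error; let it propagate
```

With this shape, an application can log the bad offset and restore that region from a replica or backup instead of crashing on every start.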




What needs to be done: Exposing poison

[Diagram: DAX path from the application (load/store) through the persistent memory aware file system and MMU mappings to the NVDIMM. Firmware runs ARS; libnvdimm exposes the results to the FS and to the app.]

• Start an Address Range Scrub (ARS)
  – Or harvest results from a previous, automatically started scrub
• Libnvdimm gets a list of poison
• Make it available for libnvdimm drivers and file systems to check on I/Os
• Expose it to userspace


What needs to be done: Clearing poison

[Diagram: DAX path from the application (load/store) through the persistent memory aware file system and MMU mappings to the NVDIMM. The app provides new data, which passes down through libnvdimm; firmware (handles _DSM) clears the poison.]

• App provides new data
• Filesystem detects write to a poison range and goes through driver
• Driver calls the clear_poison DSM, and then writes data




How we did it: Exposing poison

[Diagram: firmware ARS results from the NVDIMM flow into libnvdimm, which exposes them to the FS and to the app; gendisk->badblocks is shown for /dev/pmem0 and /dev/pmem1.]

• Harvest results from an Address Range Scrub
  – Get SPA (system physical address) relative list of poison
• Convert SPA poison ranges to bad disk sectors
• Make md-raid's badblocks code generic
• Add bad blocks to gendisk
• Also expose them in sysfs

gendisk->badblocks

$ cat /sys/block/pmem1/badblocks
1024 1
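
The SPA-to-sector conversion above is plain arithmetic, sketched here in Python as a model rather than as the kernel's code. The function names and the `data_start_spa` parameter (the SPA at which sector 0 of the block device begins) are assumptions for this sketch. The "start count" output format mirrors the sysfs line shown on the slide.

```python
SECTOR_SIZE = 512  # bytes per block-layer sector


def spa_range_to_sectors(poison_spa, poison_len, data_start_spa):
    """Map a poisoned SPA byte range to (first_sector, sector_count).

    data_start_spa is a hypothetical parameter: the SPA where sector 0 of
    the block device starts. The range is rounded outward to whole sectors.
    """
    if poison_len <= 0:
        raise ValueError("poison_len must be positive")
    if poison_spa < data_start_spa:
        raise ValueError("poison range starts before the device data area")
    start_off = poison_spa - data_start_spa
    end_off = start_off + poison_len          # exclusive end, in bytes
    first = start_off // SECTOR_SIZE          # round start down
    last = (end_off - 1) // SECTOR_SIZE       # sector holding the last byte
    return first, last - first + 1


def format_badblocks(ranges):
    """Render (start, count) pairs one per line, as in the sysfs output."""
    return "".join(f"{start} {count}\n" for start, count in ranges)
```

For example, one poisoned 64-byte cache line at byte offset `1024 * 512 + 128` from the start of the device maps to `(1024, 1)`, which `format_badblocks` renders as `1024 1`. A range that straddles a sector boundary yields a count of 2.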


How we did it: Handling Driver I/O

[Diagram: regular block I/O path. User space: applications using standard raw device access, or the standard file API through a file system. Kernel space: libnvdimm drivers in front of the NVDIMM. Read: check disk->badblocks, return -EIO. Write: check disk->badblocks, send DSM clear_poison, clear disk badblocks, then write data.]

• In the pmem driver, check if a BIO is for a bad sector
• If reading:
  – fail with an -EIO
• If writing:
  – Send clear_poison DSM
  – Clear the sector from gendisk->badblocks
  – Write the new data
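
The read and write rules above can be modelled in a few lines. What follows is a toy user-space model, not the pmem driver: the class `PmemDiskModel`, its methods, and the `cleared` log are all hypothetical names, and `_clear_poison` merely stands in for the clear_poison DSM call to firmware.

```python
import errno

SECTOR_SIZE = 512


class PmemDiskModel:
    """Toy model of a pmem block device with a bad-sector set (not kernel code)."""

    def __init__(self, num_sectors, bad_sectors=()):
        self.data = bytearray(num_sectors * SECTOR_SIZE)
        self.badblocks = set(bad_sectors)   # models gendisk->badblocks
        self.cleared = []                   # sectors passed to the fake DSM

    def _clear_poison(self, sector):
        # Stand-in for sending the clear_poison DSM to firmware.
        self.cleared.append(sector)

    def read_sector(self, sector):
        # Reads of a known-bad sector fail with EIO instead of touching media.
        if sector in self.badblocks:
            raise OSError(errno.EIO, "read from known-bad sector")
        off = sector * SECTOR_SIZE
        return bytes(self.data[off:off + SECTOR_SIZE])

    def write_sector(self, sector, payload):
        if len(payload) != SECTOR_SIZE:
            raise ValueError("payload must be exactly one sector")
        if sector in self.badblocks:
            # Order follows the slide: clear poison, drop the badblocks
            # entry, then write the new data.
            self._clear_poison(sector)
            self.badblocks.discard(sector)
        off = sector * SECTOR_SIZE
        self.data[off:off + SECTOR_SIZE] = payload
```

In this model a read of sector 1024 raises `OSError` with `errno.EIO` until something writes that sector, after which reads succeed and return the new data.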


How we did it: Handling DAX I/O

[Diagram: DAX path. User space: applications using load/store and the standard file API. Kernel space: persistent memory aware file system, MMU mappings, libnvdimm, cache line I/O to the NVDIMM. dax_map_atomic fails with -EIO and the mapping is blocked (marked X).]

• ->direct_access() checks for badblocks at fault time
• If found, DAX mapping fails
• If writing:
  – Fallback to blockdev_do_direct_IO()
• All zeroing goes through the driver
• If we hit a latent error, SIGBUS/crash, and it will be a 'known' error the next time
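
The fault-time check above amounts to an overlap test between the 4K page being mapped and the known bad sector ranges. A minimal sketch follows, assuming page indexes and sector numbers are both counted from the start of the same device and that the device start is page aligned. The function names are hypothetical and do not correspond to kernel symbols.

```python
import errno

SECTOR_SIZE = 512
PAGE_SIZE = 4096
SECTORS_PER_PAGE = PAGE_SIZE // SECTOR_SIZE  # 8 sectors in one 4K page


def page_has_badblocks(page_index, bad_ranges):
    """True if any (start_sector, count) range overlaps the given page."""
    first = page_index * SECTORS_PER_PAGE
    end = first + SECTORS_PER_PAGE  # exclusive
    return any(start < end and first < start + count
               for start, count in bad_ranges)


def dax_fault(page_index, bad_ranges):
    """Return 0 if the page may be mapped, or -errno.EIO if it covers a bad sector."""
    if page_has_badblocks(page_index, bad_ranges):
        return -errno.EIO
    return 0
```

With the bad range `(1024, 1)` from the sysfs example, only page 128 (sectors 1024 to 1031) is refused; its neighbours map normally.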


How we did it: 'Blast Radius'

[Diagram: granularity at each layer of the DAX stack: NVDIMM 64 (cache line), bad blocks 512, pmem-aware file system 4k, MMU mappings 4k; machine check handler shown alongside libnvdimm.]

• The poison is on a cache line granularity
• Block layer rounds up to a sector (512B)
• Fs/DAX will round up to a page (4K)
• If an app hits a bad page:
  – look up bad sector from sysfs
  – do a write() to clear it
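
The rounding described above, and the recovery step of reading the bad sector list and rewriting those sectors, can be sketched as follows. This assumes the 64-byte cache line, 512-byte sector, and 4K page sizes from the slide. The helper names are hypothetical, and `rewrite_bad_sectors` zero-fills purely for illustration: the lost data is gone, so a real application would write data restored from elsewhere.

```python
import os

CACHE_LINE = 64
SECTOR_SIZE = 512
PAGE_SIZE = 4096


def blast_radius(byte_offset):
    """Return the aligned (start, end) byte range containing byte_offset per layer."""
    def span(unit):
        start = (byte_offset // unit) * unit
        return start, start + unit
    return {
        "cache_line": span(CACHE_LINE),
        "sector": span(SECTOR_SIZE),
        "page": span(PAGE_SIZE),
    }


def parse_badblocks(text):
    """Parse 'start count' lines as shown in the sysfs badblocks output."""
    ranges = []
    for line in text.splitlines():
        if line.strip():
            start, count = line.split()
            ranges.append((int(start), int(count)))
    return ranges


def rewrite_bad_sectors(fd, ranges, fill=b"\0"):
    """Overwrite each listed sector range via pwrite; returns bytes written.

    `fill` must be a single byte. A short write is not retried here,
    which keeps the sketch small.
    """
    total = 0
    for start, count in ranges:
        buf = fill * (count * SECTOR_SIZE)
        total += os.pwrite(fd, buf, start * SECTOR_SIZE)
    return total
```

For a poisoned byte at offset `1024 * 512 + 130`, `blast_radius` reports the cache line `(524416, 524480)`, the sector `(524288, 524800)`, and the page `(524288, 528384)`: one bad 64-byte line costs the application a whole 4K page on the DAX path, while a 512-byte write is enough to clear it.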


Q & A