Flash memory File system organisation issues Nick Gaens


Flash memory
File system organisation issues

Nick Gaens


Introduction
Technologies
How does it work?
Limitations
File systems: problems and workarounds

Outline


Flash memory is a non-volatile computer storage chip that can be electrically erased and reprogrammed.

– Wikipedia

Quick introduction


Usage: almost everywhere.

Latest trend: SSDs, successors of HDDs.

Quick introduction

                  SSD    HDD
Performance       High   Average
Access time       Low    Average
Cost ($/GB)       High   Low
Life expectancy   Low    Average


Overall level of research activity: quite low.

How come? Poor cost-effectiveness, due to high price and low life expectancy.

Recently: on the rise (IBM’s nanocrystals, terabyte thumb drives …).

Quick introduction


                 NAND                            NOR
Architecture     Extremely high cell densities   Small amount of unstructured cells
Primary usage    Data storage                    Code storage (eXecute In Place)
Addressability   Blocks, serial I/O interface    Bytes, SRAM
Performance      Faster write and erase          Faster read, very slow erase
Problems         10^6 cycles, bit flipping       10^5 cycles, bit flipping
Cost             Lower price tag                 Higher price tag

Technologies


              SLC                       MLC
Storage       Single bit per cell       Two bits per cell
Cost          Higher cost per bit       Lower cost per bit
Endurance     10^5 cycles               10^4 cycles
Application   Industrial environments   Commercial products

Technologies


A NAND chip consists of blocks, which in turn consist of pages.

Block: smallest unit of an erase operation
Page: smallest unit of a read / write operation
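
The block/page hierarchy can be sketched as plain address arithmetic. This is only an illustration: the geometry (64 pages per block) and the helper names `locate` and `pages_erased_with` are assumptions, not values taken from the slides.

```python
PAGES_PER_BLOCK = 64  # assumed geometry; real chips vary


def locate(page_number):
    """Split a flat page number into (block index, page offset in that block)."""
    return divmod(page_number, PAGES_PER_BLOCK)


def pages_erased_with(page_number):
    """Return all flat page numbers that share a block with `page_number`.

    Reads and writes address a single page, but an erase always wipes
    the whole block, so these pages are erased together.
    """
    block, _ = locate(page_number)
    start = block * PAGES_PER_BLOCK
    return range(start, start + PAGES_PER_BLOCK)
```

For example, under the assumed geometry page 130 lives in block 2, so erasing it also erases pages 128 to 191.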

How does it work?


Each page has one of the following statuses: “alive” (contains new, valid data), “dead” (contains old data) or “free” (can be written to).

How does it work?


Data is written to each page only once, so data cannot be rewritten in the same location.

So updating data requires:
1. find a new page, write the data to it and mark it “alive”;
2. mark the previous page as “dead”.

Problem: the old data isn’t actually erased, so free space gets used up.
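
The two update steps, and the problem they cause, can be sketched with a toy model. The class `Flash`, its fields and the string keys are hypothetical names chosen for illustration; a real device tracks this state in hardware or firmware.

```python
from enum import Enum


class State(Enum):
    FREE = "free"
    ALIVE = "alive"
    DEAD = "dead"


class Flash:
    """Toy model: a flat list of pages plus a map from keys to pages."""

    def __init__(self, num_pages):
        self.state = [State.FREE] * num_pages
        self.data = [None] * num_pages
        self.where = {}  # logical key -> physical page holding its latest value

    def write(self, key, value):
        # 1. Find a free page, write the data to it and mark it "alive".
        new_page = self.state.index(State.FREE)  # ValueError if none is free
        self.data[new_page] = value
        self.state[new_page] = State.ALIVE
        # 2. Mark the page that held the previous version as "dead".
        old_page = self.where.get(key)
        if old_page is not None:
            self.state[old_page] = State.DEAD
        self.where[key] = new_page

    def read(self, key):
        return self.data[self.where[key]]
```

In this sketch, updating the same key four times on a four-page device leaves one “alive” page and three “dead” ones, and a fifth write fails even though only one value is actually stored.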

How does it work?


The garbage collector converts “dead” pages into “free” ones.

So erasing data requires:
1. read all “alive” pages of a block;
2. write them all to an empty block;
3. erase the contents of the entire block from step 1 and mark it as “free”.
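
The three garbage-collection steps can be sketched as follows. The data layout (a list of blocks, each a list of `[state, data]` pages) and the function name `collect` are assumptions made for this illustration only.

```python
FREE, ALIVE, DEAD = "free", "alive", "dead"


def collect(blocks, victim, spare):
    """Reclaim block `victim`, using the empty block `spare` as destination.

    `blocks` is a list of blocks; each block is a list of [state, data] pages.
    Returns the number of "alive" pages that had to be copied.
    """
    if any(state != FREE for state, _ in blocks[spare]):
        raise ValueError("spare block must be empty")
    # 1. Read all "alive" pages of the victim block.
    survivors = [data for state, data in blocks[victim] if state == ALIVE]
    # 2. Write them all to the empty block.
    for slot, data in enumerate(survivors):
        blocks[spare][slot] = [ALIVE, data]
    # 3. Erase the entire victim block and mark every page "free".
    blocks[victim] = [[FREE, None] for _ in blocks[victim]]
    return len(survivors)
```

The return value hints at the cost: the more “alive” pages a victim block still holds, the more copying is needed before the erase pays off.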

How does it work?


A block can endure only a limited number (10^6) of erase cycles before becoming unusable.

How to expand the lifetime of flash drives?

Introduce a wear-leveling policy that spreads erase operations evenly across all blocks of the memory.
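
One simple way to picture such a policy is "least-erased block first". The sketch below compares it with a naive policy that keeps reusing the same block. The policy itself, the function name `simulate` and the numbers are assumptions for illustration; real controllers use more elaborate schemes.

```python
def simulate(num_blocks, cycles, wear_leveling):
    """Return per-block erase counts after `cycles` erase operations."""
    erase_counts = [0] * num_blocks
    for _ in range(cycles):
        if wear_leveling:
            # Pick the block that has been erased the fewest times so far.
            block = min(range(num_blocks), key=lambda b: erase_counts[b])
        else:
            # Naive policy: always reuse the same block.
            block = 0
        erase_counts[block] += 1
    return erase_counts


leveled = simulate(num_blocks=4, cycles=1000, wear_leveling=True)
naive = simulate(num_blocks=4, cycles=1000, wear_leveling=False)
```

With four blocks and 1000 erases, the leveled run ends at 250 erases per block, while the naive run puts all 1000 on block 0, which would therefore reach its cycle limit four times sooner.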

Limitations


The erase operation is very slow, because it is composed of three required steps.

How slow? 5 times slower than reading, 2 times slower than writing.

Impact on flash database design: it affects the use of tree structures (e.g. B+-trees).

Limitations


Traditional file systems (such as NTFS, FAT(32), HFS(+), UDF and ext2/3/4) are most often used with disk-based data storage devices (HDDs, DVDs).

Using these file systems on flash-based storage devices is convenient and cheap, though naïve: it limits the performance gains and shortens the flash memory’s lifetime.

File systems


How come?

The erase operation of flash memory is explicit and expensive, so it is best scheduled while the device is idle.

(Disks don’t require such scheduling at all.)

File systems


How come?

Flash memory devices impose no seek latency, so randomly accessing memory locations doesn’t cause a performance disaster.

Disk file systems, however, are optimized to avoid disk seeks whenever possible, because seeking is expensive on disk-based devices.

File systems


How come?

Flash memory devices tend to wear out when a single block is repeatedly overwritten.

Wear-leveling: a necessity.

(Flash file systems are designed to spread out writes evenly.)

File systems


Keep using the existing file systems by adding a layer between them and the flash memory: the Flash Translation Layer (FTL). This layer takes care of the constraints and restrictions that flash memory introduces.
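
The core idea of such a translation layer is a mapping from the sector numbers a file system uses to the physical pages that currently hold the data. The sketch below is a toy model under that assumption; the class and method names are hypothetical, and garbage collection and wear leveling are deliberately left out.

```python
class FlashTranslationLayer:
    """Toy FTL: exposes rewritable sectors on top of write-once pages."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages  # physical pages, written once each
        self.mapping = {}                # logical sector -> physical page
        self.dead = set()                # stale pages awaiting garbage collection
        self.next_free = 0               # free pages are handed out in order

    def write_sector(self, sector, data):
        if self.next_free >= len(self.pages):
            raise RuntimeError("no free pages; garbage collection needed")
        page = self.next_free
        self.next_free += 1
        self.pages[page] = data          # never overwrite an existing page
        if sector in self.mapping:
            self.dead.add(self.mapping[sector])  # old copy becomes "dead"
        self.mapping[sector] = page      # redirect the sector to the new page

    def read_sector(self, sector):
        return self.pages[self.mapping[sector]]
```

From the file system's point of view, sector 5 is simply overwritten; underneath, each write lands on a fresh page and only the mapping changes.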

Workarounds


Log-structured File System

Conventional file systems take great care to preserve spatial locality and to make in-place changes to their data structures (because seeking is slow on magnetic disks).

Hypothesis: an ever-increasing amount of system memory makes the above obsolete.

Workarounds


Log-structured File System

A lot of available system memory would lead to I/O becoming extremely write-heavy.

(Reads can be done from memory cache.)

How to exploit this (hypothetical) situation?

Workarounds


Log-structured File System

Treat storage as a circular log and write sequentially to the head of that log to maximize write throughput.

(Positive side effects of this technique are snapshotting, improved crash recovery and tempering garbage collection (GC) through divide and conquer.)
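
A minimal sketch of the circular-log idea follows. It assumes a fixed number of slots and an in-memory index. The class name `CircularLog` and its behaviour when the head reaches live data (raising an error instead of cleaning) are simplifications chosen for illustration.

```python
class CircularLog:
    """Toy log-structured store: every write is appended at the head."""

    def __init__(self, capacity):
        self.slots = [None] * capacity  # storage treated as one circular log
        self.head = 0                   # next slot to be written
        self.index = {}                 # key -> slot holding its latest version

    def write(self, key, value):
        slot = self.head
        old = self.slots[slot]
        # Overwriting a stale record is fine; overwriting the latest version
        # of some key would lose data, so a real system would clean first.
        if old is not None and self.index.get(old[0]) == slot:
            raise RuntimeError("head reached live data; cleaning needed")
        self.slots[slot] = (key, value)
        self.index[key] = slot
        self.head = (self.head + 1) % len(self.slots)
        return slot

    def read(self, key):
        return self.slots[self.index[key]][1]
```

Because updates never overwrite the previous version in place, older versions stay in the log until the head wraps around, which is what makes snapshots and crash recovery comparatively easy.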

Workarounds


Workarounds remain what they are … just workarounds.

A native flash file system can, by design, provide an environment in which performance isn’t limited by any ‘extra’ layer.

(Examples are JFFS(2), YAFFS, TrueFFS and ExtremeFFS.)

Workarounds


However, in practice, flash file systems are only used for “Memory Technology Devices” (MTDs).

MTDs are embedded flash memories that do not have a controller which takes care of the FTL or any other workarounds.

Most commercial flash memories do have such a controller (e.g. SD cards, SSDs).

Flash file system


These controllers continue to offer increasing levels of performance, which silences the call for native flash file systems.

Also, benchmarks that directly compare flash file systems to traditional ones are not easy to carry out.

Flash file system


Flash memory brings new levels of raw performance to storage, although it does have some issues and caveats.

Increasing affordability and feasibility of consumer-level flash-based mass storage devices.

Consequence: ‘naked’ file systems are quite dumb when it comes to interfacing with flash memory.

Conclusions


Solution (?): all sorts of high-performance workarounds that take care of the issues mentioned before.

Native flash file systems don’t need such workarounds at all, making them attractive.

In practice, those flash file systems are of little use, because they only work when components such as controllers are absent.

Conclusions


How many of you own an SSD?
Are you aware of the limited life expectancy of such devices?

Discussion


Co-presentation: the need for advanced data structures that allow Game AI algorithms to perform faster on e.g. range queries for large numbers of NPCs.

The underlying cause of this need is the lack of high-performance mass data storage.

Does the rise of flash memory make this research obsolete?

Discussion