
Flash memory
File system organisation issues

Nick Gaens

Introduction
Technologies
How does it work?
Limitations
File systems: problems and workarounds

Outline

Flash memory is a non-volatile computer storage chip that can be electrically erased and reprogrammed.

– Wikipedia

Quick introduction

Usage: almost everywhere.

Latest trend: SSDs, the successors of HDDs.

Quick introduction

                 SSD      HDD
Performance      High     Average
Access time      Low      Average
Cost ($/GB)      High     Low
Life expectancy  Low      Average

Overall level of research activity: quite low.

How come? Very low cost-effectiveness, due to the high price and low life expectancy.

Recent: on the rise (IBM’s nanocrystals, terabyte thumb drives …).

Quick introduction

                NAND                               NOR
Architecture    Extremely high cell densities      Small amount of unstructured cells
Primary usage   Data storage                       Code storage (eXecute In Place)
Addressability  Blocks, serial I/O interface       Bytes, SRAM
Performance     Faster write and erase             Faster read, very slow erase
Problems        10^6 cycles, bit flipping          10^5 cycles, bit flipping
Cost            Lower price tag                    Higher price tag

Technologies

            SLC                        MLC
Storage     Single bit per cell        Two bits per cell
Cost        Higher cost per bit        Lower cost per bit
Endurance   10^5 cycles                10^4 cycles
Application Industrial environments    Commercial products

Technologies

A NAND chip consists of blocks, which in turn consist of pages.

Block: smallest unit of the erase operation
Page: smallest unit of the read / write operation

How does it work?

Each page has one of the following statuses: “alive” (contains new, valid data), “dead” (contains old data) or “free” (can be written to).
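As a rough sketch of this organisation and of the page statuses, a C model could look as follows (the geometry, names and field layout are illustrative assumptions, not taken from a real NAND datasheet):

#include <stdint.h>

#define PAGES_PER_BLOCK 64        /* assumed geometry; varies per chip        */
#define PAGE_SIZE       2048      /* assumed 2 KiB pages                      */

enum page_status { FREE, ALIVE, DEAD };   /* the three statuses above         */

struct page {
    enum page_status status;
    uint8_t data[PAGE_SIZE];      /* a page is the read / write unit          */
};

struct block {
    struct page pages[PAGES_PER_BLOCK];   /* a block is the erase unit        */
    uint32_t erase_count;         /* bookkeeping used later for wear leveling */
};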

How does it work?

Data is written to each page only once, so data is never rewritten in the same location.

So updating data requires:
1. find a new page, write the data to it and mark it “alive”;
2. mark the previous page as “dead”.

Problem: the old data was not actually erased, so free space gradually runs out.
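A minimal sketch of such an out-of-place update, reusing the page struct and statuses from the earlier sketch; find_free_page() is a hypothetical helper that returns any page still marked FREE:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct page *find_free_page(void);             /* hypothetical helper           */

struct page *update_page(struct page *old, const uint8_t *buf, size_t len)
{
    struct page *fresh = find_free_page();     /* 1. find a free page           */
    if (fresh == NULL)
        return NULL;                           /*    none left: GC must run     */
    memcpy(fresh->data, buf, len);             /*    write the new data to it   */
    fresh->status = ALIVE;                     /*    (len assumed <= PAGE_SIZE) */
    old->status = DEAD;                        /* 2. mark the old page "dead"   */
    return fresh;                              /* the old data is NOT erased    */
}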

How does it work?

Garbage collector converts “dead” pages to “free” ones.

So erasing data requires:
1. read all “alive” pages of a block;
2. write them all to an empty block;
3. delete the contents of the entire block from step 1 and mark it as “free”.
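A sketch of that garbage-collection step, again assuming the block / page structs from the earlier sketches; erase_block() stands in for the device's block-erase command and is a hypothetical helper:

void erase_block(struct block *b);                       /* hypothetical device command     */

void collect_block(struct block *victim, struct block *empty)
{
    int j = 0;
    for (int i = 0; i < PAGES_PER_BLOCK; i++)
        if (victim->pages[i].status == ALIVE)            /* 1. read all "alive" pages       */
            empty->pages[j++] = victim->pages[i];        /* 2. write them to an empty block */
    erase_block(victim);                                 /* 3. erase the whole block        */
    for (int i = 0; i < PAGES_PER_BLOCK; i++)
        victim->pages[i].status = FREE;                  /*    every page is "free" again   */
    victim->erase_count++;                               /*    one erase cycle consumed     */
}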

How does it work?

A block can endure a limited number (10^6) of erase cycles before becoming unusable.

How to expand the lifetime of flash drives?

Introduction of a wear-leveling policy which spreads out erase operations on all blocks of the memory.
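A naive sketch of such a policy, using the erase_count field from the earlier sketches: whenever a block has to be recycled, pick the one that has been erased the least, so the erase cycles are spread evenly (real policies are more refined, e.g. they also relocate rarely-changing data):

struct block *pick_block_to_recycle(struct block *blocks, int nblocks)
{
    struct block *victim = &blocks[0];
    for (int i = 1; i < nblocks; i++)
        if (blocks[i].erase_count < victim->erase_count)
            victim = &blocks[i];         /* least-worn block so far                  */
    return victim;                       /* erase cycles spread out over all blocks  */
}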

Limitations

The erase operation is very slow, because it is composed of the three steps listed above.

How slow? 5 times slower than reading, 2 times slower than writing.

Impact on flash database design: it affects the use of tree structures (e.g. B+-trees).

Limitations

Traditional file systems (such as NTFS, FAT(32), HFS(+), UDF and ext2/3/4) are most frequently used with disk-based data storage devices (HDDs, DVDs).

Using these FSs on flash-based storage devices is convenient and cheap, but naïve: it limits the performance gains and shortens the flash memory's lifetime.

File systems

How come?

The erase operation of flash memory is explicit and expensive, so it is best scheduled while the device is idle.

(Disks don’t require such scheduling at all.)

File systems

How come?

Flash memory devices impose no seek latency, so randomly accessing memory locations doesn't cause a performance disaster.

Disk file systems are, however, optimized to avoid disk seeks whenever possible, due to the high cost of seeking on disk-based devices.

File systems

How come?

Flash memory devices tend to wear out when a single block is repeatedly overwritten.

Wear-leveling: a necessity.

(Flash file systems are designed to spread out writes evenly.)

File systems

Adapt the existing FSs by adding a layer between them and the flash: the Flash Translation Layer (FTL). This layer takes care of the constraints and restrictions that flash memory introduces.
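A minimal sketch of the idea behind the FTL: the file system keeps issuing in-place "overwrite sector s" requests as if it were talking to a disk, and the FTL silently redirects each write to a fresh flash page. The mapping table and the helpers next_free_page(), program_page() and mark_dead() are illustrative assumptions, not a real FTL interface:

#include <stdint.h>

#define NUM_SECTORS 4096

static int sector_map[NUM_SECTORS];      /* logical sector -> physical flash page  */

int  next_free_page(void);               /* hypothetical helpers                   */
void program_page(int phys, const uint8_t *buf);
void mark_dead(int phys);

void ftl_write(int sector, const uint8_t *buf)
{
    int phys = next_free_page();         /* out-of-place write, as before          */
    program_page(phys, buf);             /* program the new data into that page    */
    mark_dead(sector_map[sector]);       /* the old physical page becomes garbage  */
    sector_map[sector] = phys;           /* remap the logical sector               */
}                                        /* (a real FTL also tracks never-written sectors) */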

Workarounds

Log-structured File System

Conventional file systems: great care for spatial locality and in-place changes to their data structures (due to slow seeking of magnetic disks).

Hypothesis: an ever-increasing amount of system memory makes the above obsolete.

Workarounds

Log-structured File System

A lot of available system memory would lead to I/O becoming extremely write-heavy.

(Reads can be done from memory cache.)

How to exploit this (hypothetical) situation?

Workarounds

Log-structured File System

Treating storage as a circular log and writing sequentially to the head of that log to maximize the write throughput.

(Positive side effects of this technique are snapshotting, improved crash recovery and taming the GC by divide and conquer.)
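A sketch of the core write path of such a log-structured scheme; head, LOG_SIZE and append_at() are illustrative names, and the bookkeeping that remembers where each piece of data ended up (the inode map) is left out:

#include <stddef.h>
#include <stdint.h>

#define LOG_SIZE (1024 * 1024 * 1024L)     /* assumed 1 GiB circular log              */

static long head = 0;                      /* current position of the log head        */

void append_at(long offset, const uint8_t *buf, size_t len);   /* hypothetical helper */

long log_write(const uint8_t *buf, size_t len)
{
    long at = head;
    append_at(at, buf, len);               /* every update is a sequential write      */
    head = (head + (long)len) % LOG_SIZE;  /* wrap around: storage is one circular log */
    return at;                             /* caller records where the data now lives */
}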

Workarounds

Workarounds remain what they are … just workarounds.

A native flash file system can, by design, provide an environment in which performance isn't limited by any 'extra' layer.

(Examples are JFFS(2), YAFFS, TrueFFS and ExtremeFFS.)

Workarounds

However, in practice, flash file systems are only used for "Memory Technology Devices" (MTDs).

MTDs are embedded flash memories that do not have a controller which takes care of the FTL or any other workarounds.

Most commercial flash memories do have such a controller (e.g. SD cards, SSDs).

Flash file system

These controllers keep offering ever-increasing levels of performance, which silences the call for native flash file systems.

Also, benchmarks that directly compare flash FSs to traditional ones are not easy to carry out.

Flash file system

Flash memory brings new levels of raw performance to storage techniques, although it does have some issues and caveats.

Increasing affordability and feasibility of consumer-level flash-based mass storage devices.

Consequence: ‘naked’ file systems are quite dumb when it comes to interfacing with flash memory.

Conclusions

Solution (?): provide all sorts of high-performance workarounds that take care of the issues mentioned before.

Native flash file systems don’t need such workarounds at all, making them attractive.

In practice, those flash FSs are of little use, because they require the absence of components such as the controller.

Conclusions

How many of you own an SSD?
Are you aware of the limited life expectancy of such devices?

Discussion

Co-presentation: the need for advanced data structures that allow Game AI algorithms to perform faster on e.g. range queries over large numbers of NPCs.

The underlying cause of this need is the lack of high-performance mass data storage.

Does the rise of flash memory make this research obsolete?

Discussion
