Fast File System
2/17/2006
Introduction
• The paper describes changes to the old UNIX file system (FS), introduced in 4.2BSD
• Motivation: applications require greater throughput
  – Large amounts of paging
  – Want to retain the good parts of the old FS
    • Abstraction
    • Backward compatibility with existing software
Old File System
• Block size change from 512 bytes to 1024 bytes → performance improved by more than 2x. WHY?
  – Each disk access transferred 2x more data
  – Direct blocks held twice the data, so indirect blocks weren’t needed as much
• Performance degradation over time
  – 175 KB/s → 30 KB/s due to randomization of block placement on disk
• Fundamental limits:
  – Small block size
  – Limited read-ahead in the system
  – Large number of seeks limits file system throughput
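The direct-block argument above can be checked with a little arithmetic. This is a minimal sketch, not the paper's code: the count of 10 direct block pointers per inode is an assumption chosen for illustration, and the function names are hypothetical.

```python
N_DIRECT = 10  # ASSUMPTION: direct block pointers per inode (illustrative value)


def direct_capacity(block_size: int, n_direct: int = N_DIRECT) -> int:
    """Bytes of file data reachable without reading an indirect block."""
    return block_size * n_direct


def needs_indirect(file_size: int, block_size: int, n_direct: int = N_DIRECT) -> bool:
    """True if the file is too large to be covered by direct blocks alone."""
    return file_size > direct_capacity(block_size, n_direct)


if __name__ == "__main__":
    for bs in (512, 1024):
        print(bs, "byte blocks: direct blocks cover", direct_capacity(bs), "bytes")
    # An 8000-byte file needs an indirect block at 512 bytes but not at 1024.
    print(needs_indirect(8000, 512), needs_indirect(8000, 1024))
```

Doubling the block size doubles the data covered by direct blocks, so fewer files pay the extra disk access for an indirect block.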
New File System (FS)
• Drives mapped into partitions
• Each partition holds a FS described by a superblock, which is replicated for redundancy
• File system blocks = 4096 bytes
• Cylinder groups – what’s that?
Cylinder Groups
• The FS is a collection of cylinder groups, each made up of one or more consecutive cylinders. Each cylinder group has the following components:
  – a backup copy of the superblock
  – a cylinder group header, with statistics, free lists, etc. about this cylinder group, similar to those in the superblock
  – a number of inodes, each containing file attributes
  – a number of data blocks
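The four components listed above can be pictured as a simple data structure. This is a rough sketch only: all class and field names are hypothetical, and the real on-disk layout carries many more fields than shown here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Superblock:
    block_size: int = 4096   # bytes per file system block
    frag_size: int = 512     # bytes per fragment


@dataclass
class CylinderGroupHeader:
    free_inodes: int         # statistics kept per cylinder group
    free_blocks: int


@dataclass
class Inode:
    size: int = 0            # one example file attribute
    in_use: bool = False


@dataclass
class CylinderGroup:
    superblock_copy: Superblock          # backup copy of the superblock
    header: CylinderGroupHeader          # per-group statistics and free counts
    inodes: List[Inode] = field(default_factory=list)
    data_blocks: List[bytes] = field(default_factory=list)


def make_file_system(n_groups: int, inodes_per_group: int,
                     blocks_per_group: int) -> List[CylinderGroup]:
    """Build an empty file system as a list of cylinder groups."""
    return [
        CylinderGroup(
            superblock_copy=Superblock(),
            header=CylinderGroupHeader(inodes_per_group, blocks_per_group),
            inodes=[Inode() for _ in range(inodes_per_group)],
            data_blocks=[b"" for _ in range(blocks_per_group)],
        )
        for _ in range(n_groups)
    ]
```

Each group carries its own superblock copy, so losing one copy does not lose the description of the file system.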
Storage Utilization
• 4x bigger block size (4096 bytes) → higher throughput
• Problem: a UNIX FS is commonly composed of small files → wasted space
Utilization Cont’d
• Small files stored in a more efficient way
  – Blocks broken into fragments of 512 bytes
  – Block map associated with each cylinder group records the space available in the cylinder group at the fragment level
  – Is a block available? Look at the aligned fragments
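To see why fragments matter for small files, the space allocated to a file can be compared with and without them. A minimal sketch, assuming 4096-byte blocks and 512-byte fragments as on the slides; the function name is hypothetical.

```python
from typing import Optional


def allocated_bytes(file_size: int, block_size: int = 4096,
                    frag_size: Optional[int] = None) -> int:
    """Disk space consumed by a file's data.

    Without fragments, the tail of the file occupies a whole block.
    With fragments, the tail occupies only as many fragments as it needs.
    """
    full_blocks, tail = divmod(file_size, block_size)
    if frag_size is None:
        tail_space = block_size if tail else 0
    else:
        tail_space = -(-tail // frag_size) * frag_size  # ceiling division
    return full_blocks * block_size + tail_space


if __name__ == "__main__":
    for size in (100, 1000, 11000):
        whole = allocated_bytes(size)
        frags = allocated_bytes(size, frag_size=512)
        print(f"{size:>6} bytes: blocks only wastes {whole - size}, "
              f"with fragments wastes {frags - size}")
```

For a 1000-byte file, whole blocks waste 3096 bytes while fragments waste 24.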
Utilization Cont’d
• Fragments of adjoining blocks cannot be used as a full block, even if they are large enough.
• If no block with enough aligned fragments is available at file creation, a full size block is split yielding the necessary fragments and a single unused fragment.
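The alignment rule can be sketched with a free map kept at fragment granularity. This is an illustrative model, not the kernel's allocator: the list-of-booleans map, the 8 fragments per block (4096 / 512), and the function names are all assumptions.

```python
from typing import List, Optional

FRAGS_PER_BLOCK = 8  # ASSUMPTION: 4096-byte blocks, 512-byte fragments


def block_is_free(free_map: List[bool], block: int) -> bool:
    """A block is free only if all of its aligned fragments are free."""
    start = block * FRAGS_PER_BLOCK
    return all(free_map[start:start + FRAGS_PER_BLOCK])


def find_run_in_block(free_map: List[bool], block: int, n: int) -> Optional[int]:
    """Index of the first run of n free fragments inside one block, or None."""
    start = block * FRAGS_PER_BLOCK
    run = 0
    for i in range(start, start + FRAGS_PER_BLOCK):
        run = run + 1 if free_map[i] else 0
        if run == n:
            return i - n + 1
    return None


def allocate_fragments(free_map: List[bool], n: int) -> Optional[int]:
    """Allocate n contiguous fragments that do not cross a block boundary."""
    n_blocks = len(free_map) // FRAGS_PER_BLOCK
    # First try blocks that are already partly used.
    for b in range(n_blocks):
        if not block_is_free(free_map, b):
            pos = find_run_in_block(free_map, b, n)
            if pos is not None:
                free_map[pos:pos + n] = [False] * n
                return pos
    # Otherwise split a full free block; the rest of it stays free.
    for b in range(n_blocks):
        if block_is_free(free_map, b):
            pos = b * FRAGS_PER_BLOCK
            free_map[pos:pos + n] = [False] * n
            return pos
    return None
```

With fragments 4 to 11 free across two blocks, there are eight free fragments in a row, yet `block_is_free` is false for both blocks because the run is not aligned to a block boundary.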
Utilization
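• One of three conditions applies when a file grows:
  – Enough space left in an already allocated block or fragment → data is written into the available space
  – File contains no fragments → new blocks (or fragments for the tail) are allocated
  – File contains fragments → the fragments are copied into a new block or a larger set of fragments
• Problem: file growth one fragment at a time → many data copies
• Soln: user programs write one full block at a time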
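The three cases can be told apart from the file size and the size of the write. The sketch below is a simplification under stated assumptions: the function name and the returned labels are hypothetical, the third case is split in two to show when fragments are promoted to a full block, and 4096/512 sizes are taken from the slides.

```python
def classify_growth(size: int, extra: int,
                    block: int = 4096, frag: int = 512) -> str:
    """Classify an append of `extra` bytes to a file of `size` bytes."""
    tail = size % block                       # bytes in the last partial block
    tail_frags = -(-tail // frag)             # fragments holding that tail
    allocated = (size - tail) + tail_frags * frag
    if size + extra <= allocated:
        return "fits"                         # case 1: room already allocated
    if tail_frags == 0 or tail_frags == block // frag:
        return "no-fragments"                 # case 2: allocate new space
    if tail + extra >= block:
        return "fragments-to-block"           # case 3: copy into a full block
    return "fragments-regrow"                 # case 3: copy into more fragments


if __name__ == "__main__":
    print(classify_growth(1000, 20))     # fits
    print(classify_growth(4096, 100))    # no-fragments
    print(classify_growth(1000, 500))    # fragments-regrow
    print(classify_growth(1000, 4000))   # fragments-to-block
```

Both fragment cases involve copying existing data, which is why writing a full block at a time avoids repeated copies.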
Utilization
• Capacity
  – Problem: as unallocated space → 0, throughput falls to about 50%
  – System should keep ~10% of the space unallocated
  – Soln: below that threshold, only the administrator can allocate new blocks
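The free-space reserve amounts to a simple check at allocation time. A minimal sketch, assuming a 10% reserve; the function and parameter names are hypothetical.

```python
def may_allocate(free_blocks: int, total_blocks: int, is_admin: bool,
                 reserve_percent: int = 10) -> bool:
    """Decide whether a new block may be allocated.

    Ordinary users are refused once free space drops to the reserve;
    the administrator may keep allocating until the disk is full.
    """
    if is_admin:
        return free_blocks > 0
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return free_blocks * 100 > total_blocks * reserve_percent
```

For a 1000-block file system, an ordinary user can allocate with 150 blocks free but not with 100 free, while the administrator still can.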
Parameterization of HW
• Why?
  – Old FS had no information about the physical characteristics of the storage device
  – New FS is parameterized so that blocks can be allocated optimally, based on:
    • Processor speed
    • HW support for mass transfers
    • Characteristics of mass storage devices (# platters, physical data layouts, etc.)
Parameterization of HW
• Physical characteristics of each disk:
  – Number of blocks per track
  – Rate of disk spin
• Cylinder group summary information:
  – Finding rotationally optimal blocks is not free
  – Soln: keep a count of the available blocks in a cylinder group at different rotational positions
• FS can be parameterized with the minimum time the processor needs between disk operations, so blocks can be laid out to match
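The parameters above are enough to estimate where the next block of a file should go on a track. This is a rough sketch, not the paper's algorithm: the function names, the 3600 rpm and 8 blocks-per-track figures in the example, and the forward search over rotational positions are all assumptions.

```python
import math
from typing import List, Optional


def blocks_to_skip(rpm: float, blocks_per_track: int, cpu_delay_ms: float) -> int:
    """Blocks that pass under the head while the processor sets up the next transfer."""
    ms_per_rev = 60_000 / rpm
    ms_per_block = ms_per_rev / blocks_per_track
    return math.ceil(cpu_delay_ms / ms_per_block)


def pick_rotational_position(free_counts: List[int], preferred: int) -> Optional[int]:
    """First rotational position at or after `preferred` with a free block.

    `free_counts[i]` is the per-cylinder-group count of available blocks
    at rotational position i; the search wraps around the track.
    """
    n = len(free_counts)
    for step in range(n):
        pos = (preferred + step) % n
        if free_counts[pos] > 0:
            return pos
    return None


if __name__ == "__main__":
    skip = blocks_to_skip(rpm=3600, blocks_per_track=8, cpu_delay_ms=4.0)
    print("skip", skip, "blocks between consecutive blocks of a file")
    print(pick_rotational_position([0, 0, 3, 1, 0, 0, 0, 2], preferred=4))
```

Keeping the per-position counts in the summary information means the allocator can answer this question without scanning the block map.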
Layout Policies
• Two parts to data layout policies:
  – Top level: global policies use FS-wide summary information to make decisions regarding the placement of inodes and blocks
  – Lower level: local allocation routines use a locally optimal scheme to lay out data blocks
• Global policies try to balance a conflict:
  – Localizing data that is concurrently accessed
  – Spreading out unrelated files
Layout Policies Cont’d
• Two allocatable resources
  – Inodes
  – Blocks
• Layout policy tries to place all the inodes of files in a directory in the same cylinder group.
• Data blocks are usually accessed together → layout policy tries to place all data blocks for a file in the same cylinder group, preferably at rotationally optimal positions in the same cylinder.
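The two placement rules above can be sketched as a choice of cylinder group. The preference for the same group comes from the slides; the fallback to the group with the most free space is an assumption made for this sketch, as are all names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GroupSummary:
    free_inodes: int
    free_blocks: int


def group_for_file_inode(groups: List[GroupSummary], dir_group: int) -> Optional[int]:
    """Prefer the cylinder group that holds the file's directory."""
    if groups[dir_group].free_inodes > 0:
        return dir_group
    # ASSUMED fallback: the group with the most free inodes.
    candidates = [i for i, g in enumerate(groups) if g.free_inodes > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda i: groups[i].free_inodes)


def group_for_data_block(groups: List[GroupSummary], inode_group: int) -> Optional[int]:
    """Prefer the cylinder group that holds the file's inode."""
    if groups[inode_group].free_blocks > 0:
        return inode_group
    # ASSUMED fallback: the group with the most free blocks.
    candidates = [i for i, g in enumerate(groups) if g.free_blocks > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda i: groups[i].free_blocks)
```

The global routine only picks a group from summary counts; choosing the exact block inside the group is left to the local allocator.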
Performance
• % of bandwidth in Table 2 measures the effectiveness of the file system's utilization of the disk.
• Upper bound on the transfer rate from the disk:
  – number of bytes on a track x number of revolutions of the disk per second
• Bandwidth is calculated by comparing the data rates the file system is able to achieve as a percentage of the bound.
• Results:
  – The old FS uses 3-5% of the disk bandwidth
  – The new FS uses up to 47% of the bandwidth
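The bandwidth figure is a straightforward ratio. A minimal sketch of the calculation; the disk geometry in the example (16384 bytes per track, 60 revolutions per second) is a made-up illustration, not a figure from the paper.

```python
def max_transfer_rate(bytes_per_track: int, revs_per_second: float) -> float:
    """Upper bound on the disk transfer rate, in bytes per second."""
    return bytes_per_track * revs_per_second


def percent_of_bandwidth(achieved_bytes_per_s: float, bytes_per_track: int,
                         revs_per_second: float) -> float:
    """Achieved data rate as a percentage of the upper bound."""
    return 100 * achieved_bytes_per_s / max_transfer_rate(bytes_per_track,
                                                           revs_per_second)


if __name__ == "__main__":
    # HYPOTHETICAL disk: 16384 bytes per track, 60 revolutions per second.
    bound = max_transfer_rate(16384, 60)
    print("upper bound:", bound, "bytes/s")
    print("30 KB/s is", percent_of_bandwidth(30 * 1024, 16384, 60), "% of it")
```

On this hypothetical disk, a 30 KB/s file system would be using only a few percent of what the hardware can deliver.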
Performance
• Some stats (table not reproduced)
Performance Continued
• Limits
  – Processors limit throughput
  – Memory-to-memory copying: 40% of I/O time
  – Block chaining would require driver rewrites
  – One block allocated at a time: ~10% of the time spent in a write system call
Enhancements
• File system changes and required downtime allowed for some new ideas
  – Long file names
  – File locking
  – Symbolic links
  – Rename
  – Quotas
Summary
• File system changes from the old system
  – Block size increased
  – Layout more efficient
  – Fragments used to reduce space waste
• Performance increased
• New items implemented into FS that had been requested by users
Summary Cont’d
• New FS organization
  – Utilization
  – Parameterization
  – New layout policies
• Performance Improvements
Key Ideas
• Old FS used only a small fraction of the available disk bandwidth
• New FS:
  – Same data structures
  – Same FS semantics
• New FS has new functionality