Advanced Operating Systems Prof. Muhammad Saeed File Systems-II

Managing free space: bit vector

Keep a bit vector, with one entry per file block
    Number bits from 0 through n-1, where n is the number of file blocks on the disk
    If bit[j] == 0, block j is free
    If bit[j] == 1, block j is in use by a file (for data or index)
If words are 32 bits long, calculate appropriate bit by:
    wordnum = block / 32;
    bitnum = block % 32;
Search for free blocks by looking for words with bits unset (words != 0xffffffff)
Easy to find consecutive blocks for a single file
Bit map must be stored on disk, and consumes space
    Assume 4 KB blocks, 8 GB disk => 2M blocks
    2M bits = 2^21 bits = 2^18 bytes = 256 KB overhead
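The word and bit arithmetic on this slide can be sketched in Python. This is a minimal illustration, not a real file system's code: the class name `BlockBitmap`, its methods, and the helper `bitmap_bytes` are hypothetical names introduced here, and the bitmap lives in an in-memory list rather than on disk.

```python
WORD_BITS = 32
FULL_WORD = 0xFFFFFFFF  # a word whose 32 blocks are all in use


class BlockBitmap:
    """Free-space bit vector: bit j is 1 if block j is in use, 0 if free."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        # Round up so that every block has a bit, even in a partial last word.
        self.words = [0] * ((num_blocks + WORD_BITS - 1) // WORD_BITS)

    def set_used(self, block):
        wordnum, bitnum = divmod(block, WORD_BITS)  # block / 32 and block % 32
        self.words[wordnum] |= 1 << bitnum

    def set_free(self, block):
        wordnum, bitnum = divmod(block, WORD_BITS)
        self.words[wordnum] &= ~(1 << bitnum) & FULL_WORD

    def find_free(self):
        """Return the lowest-numbered free block, or None if the disk is full."""
        for wordnum, word in enumerate(self.words):
            if word == FULL_WORD:
                continue  # skip 32 blocks at once: no free bit in this word
            for bitnum in range(WORD_BITS):
                block = wordnum * WORD_BITS + bitnum
                if block >= self.num_blocks:
                    break  # padding bits past the end of the disk
                if not (word >> bitnum) & 1:
                    return block
        return None


def bitmap_bytes(disk_bytes, block_bytes):
    """Bitmap size in bytes: one bit per block, eight bits per byte."""
    return (disk_bytes // block_bytes) // 8
```

With the slide's numbers, `bitmap_bytes(8 * 2**30, 4 * 2**10)` reproduces the 256 KB overhead.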

Managing free space: linked list

Use a linked list to manage free blocks
    Similar to linked list for file allocation
    No wasted space for bitmap
    No need for random access unless we want to find consecutive blocks for a single file
Difficult to know how many blocks are free unless it’s tracked elsewhere in the file system
Difficult to group nearby blocks together if they’re freed at different times
    Less efficient allocation of blocks to files
    Files are read & written more slowly because consecutive blocks are not nearby
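A small Python sketch can make the trade-offs above concrete. It assumes that each free block stores the number of the next free block; the class `FreeList` and the dictionary `next_free`, which stands in for the pointers held inside the blocks, are hypothetical names used only for illustration.

```python
class FreeList:
    """Free blocks chained together; each free block points to the next one."""

    def __init__(self, num_blocks):
        self.next_free = {}  # stands in for the pointer stored in each free block
        self.head = None
        # Build the initial chain 0 -> 1 -> ... -> num_blocks - 1.
        for block in reversed(range(num_blocks)):
            self.next_free[block] = self.head
            self.head = block

    def allocate(self):
        """Take the block at the head of the list."""
        if self.head is None:
            raise OSError("no free blocks")
        block = self.head
        self.head = self.next_free.pop(block)
        return block

    def release(self, block):
        """Put a freed block at the head of the list."""
        self.next_free[block] = self.head
        self.head = block

    def count_free(self):
        """Counting requires walking the whole list unless tracked elsewhere."""
        count, block = 0, self.head
        while block is not None:
            count += 1
            block = self.next_free[block]
        return count
```

Releasing blocks at different times scrambles the order of the list, so a later run of allocations may return blocks that are far apart on disk.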

Issues with free space management

OS must protect data structures used for free space management
OS must keep in-memory and on-disk structures consistent
    Update free list when block is removed: change a pointer in the previous block in the free list
    Update bit map when block is allocated
        • Caution: on-disk map must never indicate that a block is free when it’s part of a file
        • Solution: set bit[j] in free map to 1 on disk before using block[j] in a file and setting bit[j] to 1 in memory
        • New problem: OS crash may leave bit[j] == 1 when block isn’t actually used in a file
        • New solution: OS checks the file system when it boots up…
Managing free space is a big source of slowdown in file systems
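The update ordering and the boot-time check described above can be sketched as follows. This is a toy model under stated assumptions: the bitmaps are Python lists, `file_blocks` is a set of block numbers referenced by files, and the function names and the `crash_after_disk_update` flag are hypothetical, introduced only to simulate a crash.

```python
def allocate_block(j, disk_bitmap, mem_bitmap, file_blocks,
                   crash_after_disk_update=False):
    """Allocate block j in the order the slide prescribes."""
    disk_bitmap[j] = 1          # 1. mark the block used in the on-disk map first
    if crash_after_disk_update:
        return False            # simulated crash: the block is marked but unused
    file_blocks.add(j)          # 2. only now does the block become part of a file
    mem_bitmap[j] = 1           # 3. update the in-memory map
    return True


def boot_time_check(disk_bitmap, file_blocks):
    """Reclaim blocks marked used on disk that no file actually references."""
    leaked = [j for j, bit in enumerate(disk_bitmap)
              if bit == 1 and j not in file_blocks]
    for j in leaked:
        disk_bitmap[j] = 0
    return leaked
```

Because the on-disk bit is set first, a crash can only leak a block (marked used but unreferenced), never hand out a block that already belongs to a file.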

What’s in a directory?

Two types of information
    File names
    File metadata (size, timestamps, etc.)
Basic choices for directory information
    Store all information in directory
        • Fixed size entries
        • Disk addresses and attributes in directory entry
    Store names & pointers to index nodes (i-nodes)

[Figure: two layouts of a directory holding the entries games, mail, news, and research. "Storing all information in the directory": each entry holds the name together with its attributes. "Using pointers to index nodes": each entry holds the name and points to a separate structure holding the attributes.]

Directory structure

Structure
    Linear list of files (often itself stored in a file)
        • Simple to program
        • Slow to run
        • Increase speed by keeping it sorted (insertions are slower!)
    Hash table: name hashed and looked up in file
        • Decreases search time: no linear searches!
        • May be difficult to expand
        • Can result in collisions (two files hash to same location)
    Tree
        • Fast for searching
        • Easy to expand
        • Difficult to do in on-disk directory
Name length
    Fixed: easy to program
    Variable: more flexible, better for users
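The hash-table option can be illustrated with a short Python sketch. It is only one possible design: collisions are resolved by chaining entries within a bucket, the hash is a deliberately simple byte sum so that collisions are easy to provoke, and the class `HashedDirectory` and its methods are hypothetical names.

```python
class HashedDirectory:
    """Directory as a fixed-size hash table with chaining inside each bucket."""

    def __init__(self, num_buckets=8):
        # The bucket count is fixed, which is why expansion is awkward.
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, name):
        # Toy hash: sum of the name's bytes modulo the number of buckets.
        return self.buckets[sum(name.encode("utf-8")) % len(self.buckets)]

    def add(self, name, inode):
        bucket = self._bucket(name)
        for i, (existing, _) in enumerate(bucket):
            if existing == name:
                bucket[i] = (name, inode)  # replace an existing entry
                return
        bucket.append((name, inode))

    def lookup(self, name):
        # Only one bucket is searched, not the whole directory.
        for existing, inode in self._bucket(name):
            if existing == name:
                return inode
        return None
```

The names "ab" and "ba" hash to the same bucket under this toy hash, which shows a collision being handled without losing either entry.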

Handling long file names in a directory

Sharing files

[Figure: directory tree under the root directory, with user directories A, B, and C. Entries shown include A's foo, Papers, Photos, Family, sunset (twice), os.tex, and kids; B's foo, Photos, and lake; and C's bar, foo, and blah. One connection in the tree is labeled "????".]

Solution: use links

A creates a file, and inserts into her directory
B shares the file by creating a link to it
A unlinks the file
    B still links to the file
    Owner is still A (unless B explicitly changes it)

[Figure: three stages of the shared file. First, only A's entry a.tex refers to it (Owner: A, Count: 1). Next, A's a.tex and B's b.tex both refer to it (Owner: A, Count: 2). Finally, only B's b.tex remains (Owner: A, Count: 1).]
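The link counts in this scenario can be traced with a minimal Python sketch. The classes `Inode` and `FileSystem` and their methods are hypothetical; the sketch models only the owner and the link count, not file data.

```python
class Inode:
    def __init__(self, owner):
        self.owner = owner
        self.link_count = 0


class FileSystem:
    def __init__(self):
        self.directories = {}  # user -> {file name: Inode}

    def _add_entry(self, user, name, inode):
        self.directories.setdefault(user, {})[name] = inode
        inode.link_count += 1

    def create(self, user, name):
        """Create a new file owned by user and enter it in user's directory."""
        inode = Inode(owner=user)
        self._add_entry(user, name, inode)
        return inode

    def link(self, user, name, inode):
        """Add another directory entry for an existing file."""
        self._add_entry(user, name, inode)

    def unlink(self, user, name):
        """Remove one entry; return True if the file has no links left."""
        inode = self.directories[user].pop(name)
        inode.link_count -= 1
        return inode.link_count == 0
```

Following the slide: A creates a.tex (count 1), B links b.tex to it (count 2), and A unlinks a.tex (count 1, owner still A). The file can be reclaimed only once the count reaches 0.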

Managing disk space

[Figure: plot with two curves and two vertical scales]
    Dark line (left hand scale) gives data rate of a disk
    Dotted line (right hand scale) gives disk space efficiency
    All files are 2 KB

Disk quotas

Hard and Soft Limits

The hard block limit is the absolute maximum amount of disk space that a user or group can use. Once this limit is reached, no further disk space can be used.

The soft block limit also caps disk usage, but unlike the hard limit it can be exceeded for a certain amount of time, known as the grace period. The grace period can be expressed in seconds, minutes, hours, days, weeks, or months. A warning is issued when usage exceeds the soft limit.
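The interaction of the two limits can be sketched as a single check made before an allocation. This is a hedged illustration, not the logic of any particular quota system: the function name, its parameters, and the warning strings are hypothetical, and it assumes that allocations are refused once the grace period has expired, which the text above does not state explicitly.

```python
def check_quota(used_after, soft_limit, hard_limit,
                over_soft_since, now, grace_period):
    """Decide whether an allocation may proceed.

    used_after      -- usage (in blocks) if the allocation is granted
    over_soft_since -- time at which usage first exceeded the soft limit, or None
    Returns (allowed, warning, over_soft_since).
    """
    if used_after > hard_limit:
        return (False, "hard limit exceeded", over_soft_since)
    if used_after <= soft_limit:
        return (True, None, None)            # back under the soft limit
    if over_soft_since is None:
        over_soft_since = now                # the grace period starts now
    if now - over_soft_since > grace_period:
        # Assumption: an expired grace period blocks further allocation.
        return (False, "grace period expired", over_soft_since)
    return (True, "soft limit exceeded", over_soft_since)
```

Times and the grace period are plain numbers in the same unit, for example seconds.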

Backing up a file system

[Figure: a file system to be dumped]
    Squares are directories, circles are files
    Shaded items have been modified since the last dump
    Each directory & file is labeled by its i-node number

Bitmaps used in a file system dump

Checking the file system for consistency

File system cache

Many files are used repeatedly
    Option: read it each time from disk
    Better: keep a copy in memory
File system cache
    Set of recently used file blocks
    Keep blocks just referenced
    Throw out old, unused blocks
        • Same kinds of algorithms as for virtual memory
        • More effort per reference is OK: file references are a lot less frequent than memory references
Goal: eliminate as many disk accesses as possible!
    Repeated reads & writes
    Files deleted before they’re ever written to disk
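One way to "keep blocks just referenced" and "throw out old, unused blocks" is least-recently-used (LRU) replacement, one of the algorithms familiar from virtual memory. The Python sketch below is an assumption-laden illustration: the class `BlockCache`, the `read_from_disk` callback, and the `disk_reads` counter are hypothetical names, and writes are left out.

```python
from collections import OrderedDict


class BlockCache:
    """Read cache for file blocks with LRU replacement."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk  # function: block number -> data
        self.blocks = OrderedDict()           # least recently used entry first
        self.disk_reads = 0

    def read(self, block_no):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)  # just referenced: now most recent
            return self.blocks[block_no]
        data = self.read_from_disk(block_no)   # miss: go to the disk
        self.disk_reads += 1
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict the least recently used
        return data
```

Reading blocks 1, 2, 1, 3, 2 through a two-block cache costs four disk reads instead of five, because the second read of block 1 is served from memory.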

File block cache data structures

Grouping data on disk

Log-structured file systems

The basic idea is to structure the entire disk as a log.

All writes are initially buffered in memory, and periodically all the buffered writes are written to the disk in a single segment, at the end of the log. Opening a file now consists of using the i-node map to locate the i-node for the file. Once the i-node has been located, the addresses of the blocks can be found from it. All of the blocks will themselves be in segments, somewhere in the log.

Log-structured file systems

Trends in disk & memory
    Faster CPUs
    Larger memories
Result
    More memory -> disk caches can also be larger
    Increasing number of read requests can come from cache
    Thus, most disk accesses will be writes
LFS structures entire disk as a log
    All writes initially buffered in memory
    Periodically write these to the end of the disk log
    When file opened, locate i-node, then find blocks
Issue: what happens when blocks are deleted?
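The buffering, appending, and lookup steps can be sketched in a few lines of Python. This is a toy model under several assumptions: the log is a Python list whose indices stand in for disk addresses, an i-node is a dictionary from block index to log address, and the names `LogFS`, `pending`, and `inode_map` are hypothetical.

```python
class LogFS:
    """Toy log-structured store: everything is appended to one log."""

    def __init__(self):
        self.log = []        # the disk, written only at its end
        self.inode_map = {}  # i-node number -> log address of latest i-node
        self.pending = []    # writes buffered in memory

    def write(self, inode_no, block_idx, data):
        self.pending.append((inode_no, block_idx, data))

    def flush(self):
        """Append all buffered writes to the end of the log as one segment."""
        inodes = {}
        for inode_no, block_idx, data in self.pending:
            if inode_no not in inodes:
                old = self.inode_map.get(inode_no)
                # Start from a copy of the latest i-node, if the file exists.
                inodes[inode_no] = dict(self.log[old]) if old is not None else {}
            self.log.append(data)
            inodes[inode_no][block_idx] = len(self.log) - 1
        for inode_no, inode in inodes.items():
            self.log.append(inode)  # the updated i-node goes in the log too
            self.inode_map[inode_no] = len(self.log) - 1
        self.pending.clear()

    def read(self, inode_no, block_idx):
        inode = self.log[self.inode_map[inode_no]]  # locate the i-node
        return self.log[inode[block_idx]]           # then find the block
```

Overwriting a block appends a new copy and a new i-node, and the old copies stay in the log as stale data. That leftover space is the issue the last line of the slide raises.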

Journaling file systems

While log-structured file systems are an interesting idea, they are not widely used, in part due to their being highly incompatible with existing file systems. Nevertheless, one of the ideas inherent in them, robustness in the face of failure, can be easily applied to more conventional file systems.

The basic idea of a journaling file system is to keep a log of what the file system is going to do before it does it, so that if the system crashes before it can do its planned work, upon rebooting the system can look in the log to see what was going on at the time of the crash and finish the job.

Journaling file systems are in actual use: Microsoft's NTFS file system and the Linux ext3 and ReiserFS file systems use journaling.
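The "log first, then act, then replay after a crash" idea can be sketched with a toy key-value store. This is not how NTFS, ext3, or ReiserFS are implemented: the class `JournaledStore`, its methods, and the `crash_before_apply` flag are hypothetical, and the sketch assumes every logged change is idempotent, so replaying it a second time is harmless.

```python
class JournaledStore:
    """Toy store that journals each update before applying it."""

    def __init__(self):
        self.journal = []  # assumed to survive a crash
        self.data = {}     # the state the journal protects

    def update(self, changes, crash_before_apply=False):
        """Apply changes ({key: value}; a value of None removes the key)."""
        self.journal.append(dict(changes))  # 1. record the intent
        if crash_before_apply:
            return                          # simulated crash after logging
        self._apply(changes)                # 2. do the planned work
        self.journal.pop()                  # 3. work done: erase the entry

    def _apply(self, changes):
        for key, value in changes.items():
            if value is None:
                self.data.pop(key, None)    # removing twice is harmless
            else:
                self.data[key] = value      # setting twice is harmless

    def recover(self):
        """On reboot, finish every job still recorded in the journal."""
        while self.journal:
            self._apply(self.journal[0])
            self.journal.pop(0)
```

After a simulated crash, `recover()` completes the interrupted update, so the store ends up in the state it would have reached without the crash.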

Unix Fast File System indexing scheme

[Figure: an inode holding protection mode, owner & group, timestamps, size, block count, link count, direct pointers, and single, double, and triple indirect pointers. The direct pointers lead straight to data blocks; the indirect pointers lead to data blocks through one, two, or three levels of pointer blocks.]

More on Unix FFS

First few block pointers kept in the i-node
    Small files have no extra overhead for index blocks
    Reading & writing small files is very fast!
Indirect structures only allocated if needed
For 4 KB file blocks (common in Unix), max file sizes are:
    48 KB via the direct pointers in the i-node (usually 12 direct blocks)
    1024 * 4 KB = 4 MB of additional file data for single indirect
    1024 * 1024 * 4 KB = 4 GB of additional file data for double indirect
    1024 * 1024 * 1024 * 4 KB = 4 TB for triple indirect
Maximum of 5 accesses for any file block on disk
    1 access to read inode & 1 to read file block
    Maximum of 3 accesses to index blocks
    Usually much fewer (1-2) because inode in memory
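The size and access figures above can be checked with a short calculation. The sketch assumes 4-byte block pointers, which is what gives 1024 pointers per 4 KB index block as used on the slide; the constant and function names are hypothetical.

```python
BLOCK_SIZE = 4 * 1024       # 4 KB file blocks
POINTER_SIZE = 4            # assumed pointer width in bytes
DIRECT_POINTERS = 12        # "usually 12 direct blocks"
POINTERS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE  # 1024


def ffs_capacity():
    """Bytes of file data reachable through each kind of pointer."""
    return {
        "direct": DIRECT_POINTERS * BLOCK_SIZE,
        "single indirect": POINTERS_PER_BLOCK * BLOCK_SIZE,
        "double indirect": POINTERS_PER_BLOCK ** 2 * BLOCK_SIZE,
        "triple indirect": POINTERS_PER_BLOCK ** 3 * BLOCK_SIZE,
    }


def index_levels(block_index):
    """Number of index blocks read to reach file block number block_index."""
    if block_index < DIRECT_POINTERS:
        return 0
    block_index -= DIRECT_POINTERS
    for level in (1, 2, 3):
        if block_index < POINTERS_PER_BLOCK ** level:
            return level
        block_index -= POINTERS_PER_BLOCK ** level
    raise ValueError("block index beyond the maximum file size")


def disk_accesses(block_index, inode_in_memory=False):
    """Accesses for one file block: inode + index blocks + the block itself."""
    return (0 if inode_in_memory else 1) + index_levels(block_index) + 1
```

A block reached through the triple indirect pointer costs 1 + 3 + 1 = 5 accesses, while a directly addressed block of a file whose inode is already in memory costs only 1.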

Directories in FFS

Directories in FFS are just special files
    Same basic mechanisms
    Different internal structure
Directory entries contain
    File name
    I-node number
Other Unix file systems have more complex schemes
    Not always simple files…

[Figure: a directory as a sequence of entries, each holding inode number, record length, name length, and name.]
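The entry layout in the figure (inode number, record length, name length, name) can be sketched with Python's `struct` module. The field widths are assumptions made for illustration, not the actual FFS on-disk format: a 4-byte inode number, a 2-byte record length, and a 2-byte name length, little-endian, with each record padded to a multiple of 4 bytes. The function names are hypothetical.

```python
import struct

# Assumed header layout: inode number (4 bytes), record length (2), name length (2).
HEADER = struct.Struct("<IHH")


def pack_entry(inode, name):
    """Encode one variable-length directory entry."""
    raw = name.encode("utf-8")
    record_length = HEADER.size + len(raw)
    record_length += (-record_length) % 4  # pad to a 4-byte boundary
    padded_name = raw.ljust(record_length - HEADER.size, b"\0")
    return HEADER.pack(inode, record_length, len(raw)) + padded_name


def parse_directory(data):
    """Walk the entries, using each record length to find the next entry."""
    entries = []
    offset = 0
    while offset < len(data):
        inode, record_length, name_length = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        name = data[start:start + name_length].decode("utf-8")
        entries.append((name, inode))
        offset += record_length
    return entries
```

The record length lets the reader skip to the next entry without knowing how long the name is, which is what makes variable-length names practical.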

CD-ROM file system

Directory entry in MS-DOS

MS-DOS File Allocation Table

Maximum partition size for each block size:

Block size    FAT-12    FAT-16     FAT-32
0.5 KB        2 MB
1 KB          4 MB
2 KB          8 MB      128 MB
4 KB          16 MB     256 MB     1 TB
8 KB                    512 MB     2 TB
16 KB                   1024 MB    2 TB
32 KB                   2048 MB    2 TB
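The table entries follow from multiplying the block size by the number of blocks a FAT can address. The sketch below assumes 2^12, 2^16, and 2^28 addressable blocks for FAT-12, FAT-16, and FAT-32 respectively, and treats the 2 TB ceiling seen in the FAT-32 column as an overall cap on partition size; the constant and function names are hypothetical.

```python
KB, MB, TB = 2**10, 2**20, 2**40

# Assumed number of addressable blocks for each FAT variant.
ADDRESSABLE_BLOCKS = {"FAT-12": 2**12, "FAT-16": 2**16, "FAT-32": 2**28}

# Assumed overall ceiling, matching the 2 TB entries in the table.
PARTITION_CAP = 2 * TB


def max_partition_bytes(block_size, fat_type):
    """Largest partition for a given block size, in bytes."""
    return min(block_size * ADDRESSABLE_BLOCKS[fat_type], PARTITION_CAP)
```

For example, `max_partition_bytes(4 * KB, "FAT-32")` gives 1 TB, and larger FAT-32 block sizes run into the 2 TB cap. The function does not model which block sizes each FAT variant permits, so it also returns values for the combinations left blank in the table.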

Windows 98 directory entry & file name

[Figure: layout of the directory entry, with field widths given in bytes]

Storing a long name in Windows 98

Long name stored in Windows 98 so that it’s backwards compatible with short names
    Short name in “real” directory entry
    Long name in “fake” directory entries: ignored by older systems
OS designers will go to great lengths to make new systems work with older systems…

END

Courtesy of University of Pittsburgh