CENG 334 – Operating Systems
07- File Systems
Asst. Prof. Yusuf Sahillioğlu
Computer Eng. Dept, , Turkey
File Concept
Users and applications (processes) think in terms of files whenever they want to store something.
Can your code write something to the disk without using files? No.
A file is not a physical thing; it is an abstract entity. The actual data written to a file sits on the storage medium (disk, CD).
The OS provides this abstract entity through its File System component, which decides
  how to place files on the physical medium,
  how to read and write them, and so on.
File Concept
[Figure: file concept illustration]
Applications do not deal with the disk directly; they deal with files (logical storage: byte 0 to n).
File Concept
The OS File System component is software that views the hardware as a sequence of blocks of some fixed size.
The disk driver maps those blocks to sectors of the disk: given a block number, the disk driver finds the corresponding sector number.
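The block-to-sector step can be sketched as below. This is only an illustration, assuming a 512-byte sector, a 4096-byte block, and a simple linear layout in which block 0 starts at sector 0; the name block_to_sectors is hypothetical and a real driver also has to deal with partitions and device geometry.

```python
SECTOR_SIZE = 512    # bytes (assumed)
BLOCK_SIZE = 4096    # bytes (assumed); must be a multiple of SECTOR_SIZE


def block_to_sectors(block_no, first_sector=0):
    """Return the range of sector numbers that back one file-system block."""
    sectors_per_block = BLOCK_SIZE // SECTOR_SIZE
    start = first_sector + block_no * sectors_per_block
    return range(start, start + sectors_per_block)
```

With these sizes every block covers 8 consecutive sectors, so block 3 covers sectors 24 to 31.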
File Concept
Three views of the same storage:
  Applications view the storage as a set of FILES.
  The OS File System component views the underlying storage as a sequence of blocks.
  The hardware has the storage as a set of sectors.
There has to be a mapping from FILES to blocks.
File Concept
A file sits in some blocks (contiguous or non-contiguous).
If a file's content needs to occupy 4 blocks, the OS File System component decides which 4 blocks should contain the content.
File Concept
We will first see the File System Interface (which functions are provided).
Then we will see how those functions are implemented, in particular the file-to-block mapping.
This way we will understand the OS File System component thoroughly.
File Concept
A file is just a contiguous logical address space (a storage area).
  The actual content may be non-contiguous on the disk.
  The OS makes the arrangement so that you view the file as a contiguous logical space/storage.
  This contiguous view is the user's (process) view of a file.
What can we store in a file?
  Data: numeric, character, binary
  Program
File Structure
None (no structure at all): sequence of words or bytes
  Unix, Windows
Simple record structure: sequence of records
  Lines
  Fixed length
  Variable length
Complex structures
  Formatted document: understood by the Word program, not by the OS
  Executable file: understood by the OS
The last two can be simulated with the first method by inserting appropriate control characters.
Who decides?
  Operating system
  Program
File Attributes
Name: the only information kept in human-readable form
Identifier: unique tag (number) that identifies the file within the file system
Type: needed for systems that support different types
Location: pointer to the file location on the device
Size: current file size
Protection: controls who can do reading, writing, executing
Time, date, and user identification: data for protection, security, and usage monitoring
Information about files is kept in the directory structure, which is maintained on the disk, not in memory.
File Attributes
There are 2 basic things stored on disk as part of the area controlled by the file system:
  Files: storage content
  Directory info (can be a tree): information about files, their attributes, and their locations
    Organizes files into a directory structure for efficient access
    One entry per file: filename + pointer to attributes
File Operations
A file is an abstract data type: a class with attributes and operations
  Create
  Write
  Read
  Reposition within file
  Delete
  Truncate
These operations are implemented by the File System component of the OS.
These operations are used by the application programmer.
Open(Fi): search the directory structure on disk for entry Fi, and move the content of the entry to memory.
Close(Fi): move the content of entry Fi in memory back to the directory structure on disk.
File Operations
Efficient open-file management: the open-file table
Say a read operation comes for an open file. Without caching, the OS would have to
  search the directory on the disk (slow disk access) to locate entry Fi, then
  go to the location indicated by the entry.
Repeating the directory search for each read operation is slow, so the OS maintains an open-file table for efficiency.
Only the open system call searches the directory: it locates Fi, caches the entry in the open-file table, and returns the index of that open-file table entry to the process. This index is the File Descriptor.
Subsequent operations (read, write, ..) can then use the File Descriptor as a handle, without searching the directory on disk again.
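A toy model of this idea is sketched below. It is not how any particular OS implements it: ToyFileSystem, its dictionary "directory", and the directory_searches counter are all hypothetical names used only to show that the directory is searched once, at open time, and that later reads go through the descriptor.

```python
class ToyFileSystem:
    """In-memory stand-in for a directory plus an open-file table."""

    def __init__(self, directory):
        self.directory = directory       # stand-in for the on-disk directory: name -> entry
        self.open_file_table = []        # cached entries, indexed by file descriptor
        self.directory_searches = 0      # counts the (slow) directory lookups

    def open(self, name):
        self.directory_searches += 1     # the only place the directory is searched
        entry = self.directory[name]
        self.open_file_table.append({"entry": entry, "pos": 0})
        return len(self.open_file_table) - 1   # the file descriptor

    def read(self, fd, nbytes):
        slot = self.open_file_table[fd]  # direct index, no directory search
        start = slot["pos"]
        data = slot["entry"]["data"][start:start + nbytes]
        slot["pos"] += len(data)         # advance the file pointer
        return data
```

Calling open once and read several times leaves directory_searches at 1, which is the whole point of the table.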
File Operations
Efficient open-file management: information kept for an open file
  File pointer: pointer to the last read/write location, per process that has the file open
  File-open count: counter of the number of times a file is open, to allow removal of its data from the open-file table when the last process closes it
  Disk location of the file: cache of data access information
  Access rights: per-process access mode information
File Operations
Each process has its own file-position pointer.
System-wide open-file table: location on disk and open count, the same for all processes.
File Operations
You can manually start the application that reads the file and load the file in it.
Or the OS automatically starts the application that reads the file (file association).
File Access Methods
Sequential access (fscanf)
  read next
  write next
  reset
  no read after last write (rewrite)
Direct access
  read n
  write n
  position to n
    read next
    write next
  rewrite n
n = relative block number
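The two access styles can be contrasted with a small sketch. It assumes a tiny 4-byte block so the behaviour is easy to see, and uses an in-memory io.BytesIO object in place of a real file; read_next and read_n are hypothetical helper names mirroring the operations listed above.

```python
import io

BLOCK = 4  # tiny assumed block size, in bytes


def read_next(f):
    """Sequential access: read the block at the current position and advance."""
    return f.read(BLOCK)


def read_n(f, n):
    """Direct access: position to relative block n, then read it."""
    f.seek(n * BLOCK)
    return f.read(BLOCK)


f = io.BytesIO(b"AAAABBBBCCCCDDDD")
first = read_next(f)     # block 0
second = read_next(f)    # block 1
last = read_n(f, 3)      # jump straight to block 3
```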
File Sharing
Sharing of files on multi-user systems is desirable
Sharing may be done through a protection scheme
On distributed systems, files may be shared across a network
Network File System (NFS) is a common distributed file-sharing method
File Sharing
User IDs identify users, allowing permissions and protections to be per-user
Group IDs allow users to be in groups, permitting group access rights
Attributes for protection
File Sharing
Protection is based on the use of user IDs and group IDs.
Each file has associated protection bits (permissions) for its user ID and group ID:
  user ID: read, write, execute?
  group ID: read, write, execute?
File Sharing Remotely
Uses networking to allow file system access between systems
  Manually, via programs like FTP
  Automatically and seamlessly, using distributed file systems
  Semi-automatically, via the World Wide Web
Client-server model allows clients to mount remote file systems from servers
  A server can serve multiple clients
  Client and user-on-client identification is insecure or complicated
  NFS is the standard UNIX client-server file sharing protocol
  CIFS is the standard Windows protocol
  Standard operating system file calls are translated into remote calls
Distributed Information Systems (distributed naming services) such as LDAP, DNS, NIS, and Active Directory implement unified access to the information needed for remote computing
File Protection
The file owner/creator should be able to control
  what can be done (read, write, execute, ..)
  by whom (owner, group member, others, ..)
Types of access
  Read
  Write
  Execute
  Append
  Delete
  List
File Protection
How it is done in Unix
Mode of access: read, write, execute
Three classes of users:
                         RWX
  a) owner access    7 = 111
  b) group access    6 = 110
  c) public access   1 = 001
Ask the manager to create a group (unique name), say G, and add some users to the group.
For a particular file (say game) or subdirectory, define an appropriate access.
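The octal digits in the table map directly onto the R, W, and X bits. The sketch below decodes them; rwx and mode_string are hypothetical helper names, not part of any Unix API.

```python
def rwx(digit):
    """Turn one octal digit (0-7) into its r/w/x string, e.g. 6 -> 'rw-'."""
    flags = (("r", 4), ("w", 2), ("x", 1))
    return "".join(ch if digit & bit else "-" for ch, bit in flags)


def mode_string(owner, group, public):
    """Combine the three classes of users into one permission string."""
    return rwx(owner) + rwx(group) + rwx(public)
```

For the example above, mode_string(7, 6, 1) gives rwxrw---x: the owner may do everything, group members may read and write, and everyone else may only execute.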
File System Implementation
File system design involves
  Defining the File System Interface (done!)
    How the file system looks to the user
    What a file is and what its attributes are
    What the operations are
    Directory structure to organize files
  How that File System can be implemented (now!)
    Design algorithms
    Design data structures (in-memory and on-disk structures)
    Map the logical file system to the physical storage device (disk, CD, ..)
    The mapping depends on the storage device
File System Implementation
File control block: storage structure consisting of information about a file (on disk)
Layered file system
[Figure: layered file system, with the file system layers on top of the device drivers]
File System Implementation
Now our problem is: how do we map files onto disk blocks?
  We don't care about the block-to-sector details underneath (the driver handles them)
Block size is a multiple of sector size
  Sector size = 512 bytes; block size can be 1024 or 4096 bytes
  Unix: 4 KB
File System Implementation
Now our problem is: how do we map files onto disk blocks?
[Figure sequence: 2 files occupying 2 (red) and 3 (blue) blocks]
File System Implementation
To implement a file system, the file system software maintains major on-disk structures, consisting of
  Boot control block: contains the information needed by the system to boot an OS from that volume (at power-on, the 1st block of the disk is loaded into memory, and a small program that knows where the kernel is gets run)
  Volume control block: contains volume details (how many blocks, directory location, ..)
  Directory structure: organizes files
  Per-file file control block (FCB): contains many details about the file
File System Implementation
The file system gets commands from the processes and issues them to the driver.
The driver then works on the disk.
Directory Implementation
When you want to open/read/write/.. a file, you search the directory for the given file name to find the corresponding information about the file.
The directory is on disk, so a fast search is required.
Linear list of file names with pointers to the data blocks
  Simple to program
  Time-consuming to execute
Hash table: linear list with a hash data structure
  Decreases directory search time
  Collisions: situations where two file names hash to the same location
  Only good if entries are fixed size, or use the chained-overflow method
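The two directory organizations can be compared with a small sketch. Everything here is illustrative: the directory lives in memory, entries are (file name, first block) pairs, and the hash function is a deliberately weak toy (sum of the name's bytes) so that collisions are easy to provoke. Collisions are handled with the chained-overflow idea mentioned above.

```python
def linear_lookup(entries, name):
    """Linear list: scan every (file name, first block) pair until a match."""
    for fname, first_block in entries:
        if fname == name:
            return first_block
    return None


class HashDirectory:
    """Hash table whose buckets are short linear lists (chained overflow)."""

    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def _bucket(self, name):
        # Toy hash function (an assumption, chosen so collisions are easy to show).
        return self.buckets[sum(name.encode()) % len(self.buckets)]

    def add(self, name, first_block):
        self._bucket(name).append((name, first_block))

    def lookup(self, name):
        # Only the one bucket is scanned, not the whole directory.
        return linear_lookup(self._bucket(name), name)
```

The names "ab" and "ba" hash to the same bucket with this toy function, yet both remain retrievable because the bucket chains them.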
Allocation Methods
How can we allocate disk space to files? An important issue!
  The file system sees a disk as a sequence of blocks
  The question is how those disk blocks are allocated to files
Contiguous allocation
Linked allocation
Indexed allocation
Contiguous Allocation
Each file occupies a set of contiguous blocks on the disk
  Simple: only the starting location (block #) and length (number of blocks) are required to find the disk data blocks of the file
  Random access is fast
  Waste of space (dynamic storage-allocation problem; external fragmentation)
  Files cannot grow freely (in the figure, the file can grow only if block 10 is empty)
Contiguous Allocation
Given an offset (byte X of the file, or equivalently logical address LA = X), what is the corresponding disk location?
Assume the block size is 1024 bytes and the file starts at disk block 6 (startAddr = 6).
  Which disk block contains byte 0 of the file (LA = 0)? What is the displacement in that block?
    Answer: disk block = 6 and displacement (disk block offset) = 0
  Which disk block contains the byte at LA = 2500? Where is LA = 2500 mapped on disk?
    Answer: 2500 / 1024 = 2 and 2500 % 1024 = 452
    Disk block = startAddr + 2 = 6 + 2 = 8 and displacement = 452
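The same arithmetic as a short sketch; contiguous_locate is a hypothetical helper name, and the defaults simply repeat the numbers of the worked example (start block 6, block size 1024 bytes).

```python
def contiguous_locate(la, start_block=6, block_size=1024):
    """Map a logical byte address to (disk block, displacement in that block)."""
    block_index, displacement = divmod(la, block_size)
    return start_block + block_index, displacement
```

contiguous_locate(2500) returns (8, 452), matching the answer above. No disk access is needed to compute the location, which is why random access is fast with this scheme.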
Extent-Based Systems
Problem with contiguous allocation: file growth
A modified contiguous allocation scheme (newer systems)
  Initially make a guess: the size of my file will be X
  When the file tries to grow beyond that, allocate another extent and point to it
Extent-based file systems allocate disk blocks in extents
  An extent is a contiguous set of disk blocks
  Extents are allocated for file allocation
  A file consists of one or more extents
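One way to picture a file made of extents is as a list of (start block, length) pairs. The sketch below, with the hypothetical helper extent_lookup and made-up block numbers, finds the disk block behind a logical block number by walking that list.

```python
def extent_lookup(extents, logical_block):
    """extents: list of (start_block, length_in_blocks), in file order."""
    for start, length in extents:
        if logical_block < length:
            return start + logical_block   # falls inside this extent
        logical_block -= length            # skip this extent and keep looking
    raise IndexError("logical block beyond end of file")


# Hypothetical file: a first extent of 3 blocks at block 6, then, after the
# file grew, a second extent of 2 blocks at block 20.
file_extents = [(6, 3), (20, 2)]
```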
Linked Allocation
Each file is a linked list of disk blocks; the blocks may be non-contiguous
Block structure: [Figure: each block holds a pointer to the next block plus data]
Linked Allocation
The file starts at block 5.
Now you can write some data into block 5.
If block 5 fills up and there is more to write, find another empty block (here, block 3).
Linked Allocation
Simple: only the starting address is needed
  The information to be maintained in the directory entry (or FCB) is just 5 (for the file that starts at block 5)
  In contiguous allocation, we additionally maintain the number of blocks
No waste of space (no external fragmentation problem)
No random access (not easy)
Linked Allocation
Given an offset (byte X of the file, or equivalently LA = X), what is the corresponding disk location? Logical-to-physical mapping:
  dataSize = blockSize - pointerSize
  blockNumber = X / dataSize (position of the block in the file's chain, counting from 0)
  displacement (disk block offset) = X % dataSize
Assume block size = 1024 bytes, pointer size = 4 bytes, and file size = 4000 bytes.
Find the disk location corresponding to X = LA = 2900:
  blockNum = 2900 / 1020 = 2
  2900 % 1020 = 860; adding the 4-byte pointer stored at the start of the block gives displacement = 864
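The mapping can be sketched as follows, assuming (as the +4 above implies) that the pointer sits at the start of each block. The chain 5 -> 3 -> 9 -> 12 is made up for illustration, with a dictionary standing in for the pointers that would really be read from disk, one block at a time.

```python
BLOCK_SIZE = 1024
POINTER_SIZE = 4


def linked_locate(la, start_block, next_block):
    """Follow the chain to the block holding byte `la`.

    next_block maps a block number to its successor (None at end of file).
    Returns (disk block, displacement inside that block).
    """
    data_size = BLOCK_SIZE - POINTER_SIZE
    hops, offset = divmod(la, data_size)
    block = start_block
    for _ in range(hops):              # each hop would cost one disk read
        block = next_block[block]
    return block, offset + POINTER_SIZE


chain = {5: 3, 3: 9, 9: 12, 12: None}   # hypothetical 4-block file
```

Reaching byte 2900 takes two hops along the chain, which is why random access is slow with linked allocation.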
File Allocation Table (FAT)
Previously, a (small) space in each data block was wasted on the pointer part
  The data size in a block was no longer a power of 2
Collect all these pointers into a table: the File Allocation Table
  # entries in table = # blocks on the disk
Used in MS-DOS
The data size in a block is a power of 2 again
File Allocation Table (FAT)
All pointers are collected into the FAT
[Figure: FAT example] How many blocks are allocated to this file?
File Allocation Table (FAT)
You can load the FAT into memory if it is not that big
  Then random access to a file will be quite fast (unlike linked allocation on disk)
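A minimal sketch of following a file's chain through an in-memory FAT is given below. The table contents, the EOF marker value, and the use of None for unused entries are all assumptions for illustration and do not reflect the real MS-DOS encoding.

```python
EOF = -1   # assumed end-of-chain marker


def fat_chain(fat, start):
    """Return the list of disk blocks of a file, in order, by walking the FAT."""
    chain = []
    block = start
    while block != EOF:
        chain.append(block)
        block = fat[block]     # a memory lookup, not a disk read
    return chain


# Hypothetical 10-block disk: one file occupying blocks 5 -> 3 -> 9.
fat = [None] * 10
fat[5] = 3
fat[3] = 9
fat[9] = EOF
```

The length of the returned list answers "how many blocks are allocated to this file?", and indexing into it gives the n-th block without touching the data blocks themselves.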
Indexed Allocation
Brings all pointers together into the index block (not linked anymore)
Logical view: [Figure: index table whose entries point to the data blocks of the file]
For sequential reading of the file from beginning to end (= access all the data blocks of the file), access the index table and go from entry 0 to entry N
Index table size = # blocks in file
Indexed Allocation
Random access is fast: just go to the 4th entry in the table to access block 3
No waste of space (no external fragmentation)
Mapping from logical file addresses to physical block numbers
  displacement into index table = LA / blockSize
  learn the disk block address from that entry
  displacement into disk block = LA % blockSize
Block size 512 bytes; what can be the max size of a file mapped by the index table approach?
  The index table sits in one block (called the index block); assume 512 entries in the table (this treats each entry as one byte; with 4-byte pointers a 512-byte block would hold only 128 entries)
  Each index entry points to a data block of 512 bytes
  Using a single table, you can point to 512 blocks
  maxFileSize = 512 x 512 bytes = 256 KB
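Both the address mapping and the size limit can be checked with a few lines. The index block contents below are made-up block numbers, and indexed_locate and max_file_size are hypothetical helper names.

```python
def indexed_locate(la, index_block, block_size=512):
    """Map a logical byte address to (disk block, displacement) via the index."""
    entry, displacement = divmod(la, block_size)
    return index_block[entry], displacement


def max_file_size(entries_per_index_block, block_size):
    """Largest file a single index block can describe, in bytes."""
    return entries_per_index_block * block_size


index_block = [9, 16, 1, 10, 25]   # hypothetical: entry i holds the disk block of file block i
```

For LA = 1600 the entry is 1600 / 512 = 3 and the displacement is 1600 % 512 = 64, so the byte lives in disk block 10 at offset 64.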
Indexed Allocation
For larger files, we need other index blocks
Instead of pointing to a data block, each entry of the outer index table points to another index table, which in turn points to many data blocks (instead of just 1)
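A two-level lookup can be sketched as below. The table sizes are parameters; the tiny values in the example (2 entries per table, 4-byte blocks) and all block numbers are made up so the lookup can be traced by hand.

```python
def two_level_locate(la, outer_index, index_blocks, block_size=512, entries=512):
    """outer_index[i] names an inner index block; index_blocks maps that
    block number to its list of data-block numbers."""
    logical_block, displacement = divmod(la, block_size)
    outer, inner = divmod(logical_block, entries)
    inner_table = index_blocks[outer_index[outer]]
    return inner_table[inner], displacement


def max_two_level_size(entries, block_size):
    """Largest file two index levels can describe, in bytes."""
    return entries * entries * block_size


outer_index = [7, 8]                             # hypothetical inner index blocks
index_blocks = {7: [20, 21], 8: [30, 31]}        # hypothetical data blocks
```

With 512 entries per table and 512-byte blocks, two levels describe up to 512 x 512 x 512 bytes = 128 MB, compared with 256 KB for a single index block.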
Indexed Allocation
Unix uses indexed allocation; a small portion of the index table is kept in a structure called the inode
For every file in a Unix file system, we have a corresponding inode
A pointer to this inode structure is in the directory entry associated with the file name
Indirect pointers point to index blocks, which contain pointers to data blocks
Free Space Management
Done: allocation of disk blocks to files
  How to locate the disk blocks allocated to a file
Now: keep track of the free blocks of the disk
  Necessary while growing a file (need free blocks)
Methods:
  Bit vector (bitmap) method
  Linked list method
  Grouping
  Counting
Free Space Management
Bit vector method (used in the Unix file system)
  We have a bit vector (bitmap) with one bit per block, indicating whether the block is used or free
  If the block is free, the corresponding bit is 1, else 0
Free Space Management
Bit vector method
  Cons: the bitmap requires extra space (to be stored)
    Block size 2^12 bytes
    Disk size 2^30 bytes = 1 GB
    N = 2^30 / 2^12 = 2^18 blocks exist on disk
    Hence we need 2^18 bits in the bitmap, which makes 2^18 / 8 / 1024 = 32 KB of space just to store the bitmap
  Pros: easy to get contiguous files
  Pros: blocks of a file can be kept close to each other (fseek)
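The space calculation and the "easy to get contiguous files" point can both be sketched briefly. The bitmap is modelled as a plain list of 0/1 values using the convention above (1 = free); bitmap_bytes and first_free_run are hypothetical helper names.

```python
def bitmap_bytes(disk_bytes, block_bytes):
    """Bytes needed for one bit per block (assumes the block count is a multiple of 8)."""
    return (disk_bytes // block_bytes) // 8


def first_free_run(bitmap, count):
    """Index of the first run of `count` consecutive free blocks, or None."""
    run_start, run_len = None, 0
    for i, bit in enumerate(bitmap):
        if bit == 1:                   # free block
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == count:
                return run_start
        else:                          # used block breaks the run
            run_len = 0
    return None
```

bitmap_bytes(2**30, 2**12) gives 32768 bytes, the 32 KB computed above.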
Free Space Management
Linked list method
  Each free block has a pointer to the next free block
  We keep a pointer to the 1st free block (in some special location on disk)
  Cons: cannot get contiguous space easily
  Pros: no waste of space
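A toy version of the free list is sketched below. The pointers that would live inside the free blocks themselves are modelled with a dictionary, and FreeList with its block numbers is purely illustrative.

```python
class FreeList:
    """Free blocks chained together; only the head pointer is kept separately."""

    def __init__(self, head, next_free):
        self.head = head                   # pointer to the 1st free block
        self.next_free = dict(next_free)   # stand-in for the pointer stored in each free block

    def allocate(self):
        """Take the first free block off the list."""
        if self.head is None:
            raise RuntimeError("no free blocks")
        block = self.head
        self.head = self.next_free.pop(block)
        return block

    def release(self, block):
        """Put a block back at the front of the list."""
        self.next_free[block] = self.head
        self.head = block


free_list = FreeList(2, {2: 7, 7: 4, 4: None})   # hypothetical chain 2 -> 7 -> 4
```

Successive allocations return 2, then 7, then 4: whatever order the chain happens to have, which is why contiguous space is hard to obtain with this method.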