CENG 334 – Operating Systems 07- File Systems Asst. Prof. Yusuf Sahillioğlu Computer Eng. Dept, , Turkey


CENG 334 – Operating Systems

07- File Systems

Asst. Prof. Yusuf Sahillioğlu

Computer Eng. Dept, , Turkey

File Concept (2 / 69)

Users and applications/processes think in terms of files: we use a file whenever we want to store something.
Can your code write something to the disk without using files? No.
A file is not a physical thing; it is an abstract entity.
  The actual data we write to a file sits on the storage medium (disk/CD).
The OS provides that abstract entity via its File System component, which decides:
  How to place files on the physical media
  How to read/write ..

File Concept (3 / 69)

File concept illustration

Don’t deal with the disk directly. Instead, deal with files (logical storage: byte 0 to n).

File Concept (4 / 69)

The OS File System component is software that views the hardware as a sequence of blocks of some size.

Those blocks are mapped to sectors of the disk by the Disk Driver: given a block number, the Disk Driver finds the corresponding sector number.

File Concept (5 / 69)

The OS File System component views the underlying storage as a sequence of blocks.

Applications view the storage as a set of FILES.
The hardware presents the storage as a set of sectors.

File Concept (6 / 69)

The OS File System component views the underlying storage as a sequence of blocks.

Applications view the storage as a set of FILES.
There has to be a mapping from FILES to blocks.

File Concept (7 / 69)

A file sits in some blocks (contiguous or non-contiguous).

If a file’s content needs to occupy 4 blocks, the OS File System component decides which 4 blocks should contain the content.

File Concept (8 / 69)

We will first see the File System Interface (which functions are provided).

Then we will see how those functions are implemented (file-to-block mapping).

This way we will understand the OS File System component thoroughly.

File Concept (9 / 69)

A file is just a contiguous logical address space (a storage).
  The actual content may be non-contiguous on the disk.
  The OS makes the arrangement so that you view the file as a contiguous logical space/storage.

What can we store in a file?
  Data: numeric, character, binary
  Program

User’s (process) view of a file

File Structure (10 / 69)

None (no structure at all): sequence of words or bytes
  Unix, Windows

Simple record structure: sequence of records
  Lines
  Fixed length
  Variable length

Complex structures
  Formatted document: understood by the Word program, not by the OS
  Executable file: understood by the OS

The last two can be simulated with the first method by inserting appropriate control characters.

Who decides the structure?
  Operating system
  Program

File Attributes (11 / 69)

Name – the only information kept in human-readable form
Identifier – unique tag (number) that identifies the file within the file system
Type – needed for systems that support different types
Location – pointer to the file location on the device
Size – current file size
Protection – controls who can read, write, and execute
Time, date, and user identification – data for protection, security, and usage monitoring

Information about files is kept in the directory structure, which is maintained on the disk, not in memory.

File Attributes (12 / 69)

Two basic things are stored on disk as part of the area controlled by the file system:
  Files: storage content
  Directory info (can be a tree): info about files, attributes, locations

Files are organized into a directory structure for efficient access.
  One entry per file: file name + pointer to attributes

File Operations (13 / 69)

A file is an abstract data type: a class with attributes and operations.
  Create
  Write
  Read
  Reposition within file
  Delete
  Truncate

These operations are implemented by the File System component of the OS.

These operations are used by the application programmer.

Open(Fi) – search the directory structure on disk for entry Fi, and move the content of the entry to memory

Close(Fi) – move the content of entry Fi in memory to the directory structure on disk

File Operations (14 / 69)

Efficient open-file management: open-file table
Say a read operation comes for an open file. Without any caching, the OS would have to:
  Search the directory on the disk (slow disk access) to locate entry Fi
  Go to the location indicated by the entry

File Operations (15 / 69)

Efficient open-file management: open-file table
Say a read operation comes for an open file.
  Repeating the directory search for each read operation is slow.
  The OS maintains an open-file table for efficiency.

File Operations (16 / 69)

Efficient open-file management: open-file table
Only the open system call searches the directory, locates Fi, caches it in the open-file table, and returns the index of the open-file table entry to the process: the File Descriptor.

File Operations (17 / 69)

Efficient open-file management: open-file table
Subsequent operations (read, write, ..) can then use the File Descriptor handle without searching the directory on disk again.
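The descriptor-based calling pattern can be tried from user space. The following is a minimal sketch using Python's `os` module, which wraps the underlying system calls; the helper name `read_via_descriptor` and the temporary file are illustrative assumptions, not part of the slides.

```python
import os
import tempfile


def read_via_descriptor(data: bytes, skip: int, count: int) -> bytes:
    """Store `data` in a temp file, then read `count` bytes starting at offset `skip`."""
    fd, path = tempfile.mkstemp()  # mkstemp returns an already-open descriptor
    try:
        os.write(fd, data)
        os.close(fd)

        # open() is the only call that names the file; it returns a descriptor.
        fd = os.open(path, os.O_RDONLY)
        try:
            os.lseek(fd, skip, os.SEEK_SET)  # reposition within the file
            return os.read(fd, count)        # read uses only the descriptor
        finally:
            os.close(fd)
    finally:
        os.remove(path)
```

Note that only `os.open` receives the path; `os.lseek`, `os.read`, and `os.close` identify the file purely by the small integer returned from `open`.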

File Operations (18 / 69)

Efficient open-file management: information kept for an open file
  File pointer: pointer to the last read/write location, per process that has the file open
  File-open count: counter of the number of times a file is open – allows removal of data from the open-file table when the last process closes it
  Disk location of the file: cache of data access information
  Access rights: per-process access mode information

File Operations (19 / 69)

Each process has its own file-position pointer.

System-wide open-file table: the location on disk and the open count are the same for all processes.
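The split between per-process and system-wide information can be sketched as two small tables. This is a toy model under stated assumptions: the class names, the dictionary standing in for the on-disk directory, and the `directory_searches` counter are all hypothetical, and `read` returns only where the data would be fetched from instead of touching a disk.

```python
class SystemWideTable:
    """One entry per open file: disk location and open count, shared by all processes."""

    def __init__(self, directory):
        self.directory = directory      # stand-in for the on-disk directory: name -> first block
        self.entries = {}               # name -> {"location": ..., "open_count": ...}
        self.directory_searches = 0     # counts the slow directory lookups

    def open(self, name):
        if name not in self.entries:
            self.directory_searches += 1  # the directory is searched only on first open
            self.entries[name] = {"location": self.directory[name], "open_count": 0}
        self.entries[name]["open_count"] += 1

    def close(self, name):
        entry = self.entries[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:      # last process closed it: drop the cached entry
            del self.entries[name]


class Process:
    """Per-process table: the file descriptor indexes a private file-position pointer."""

    def __init__(self, system_table):
        self.system_table = system_table
        self.fds = []                     # index = file descriptor; value = [name, position]

    def open(self, name):
        self.system_table.open(name)
        self.fds.append([name, 0])
        return len(self.fds) - 1

    def read(self, fd, nbytes):
        name, position = self.fds[fd]
        self.fds[fd][1] = position + nbytes   # advance only this process's pointer
        return self.system_table.entries[name]["location"], position

    def close(self, fd):
        name, _ = self.fds[fd]
        self.fds[fd] = None
        self.system_table.close(name)
```

Two processes that open the same file share one system-wide entry (open count 2, one directory search) while each keeps its own position.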

File Types (20 / 69)

File Operations (21 / 69)

You can manually start the app that reads the file and load the file in it.

Or the OS automatically starts the app that reads the file (association).

File Access Methods (22 / 69)

Sequential Access (fscanf)
  read next
  write next
  reset
  no read after last write (rewrite)

Direct Access
  read n
  write n
  position to n
    read next
    write next
  rewrite n
  n = relative block number
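The two access styles can be contrasted on an in-memory file. This is a small sketch, assuming a tiny 4-byte block size chosen only for illustration; the function names mirror the operations listed above but are otherwise hypothetical.

```python
import io

BLOCK_SIZE = 4  # assumed tiny block size, for illustration only


def read_next(f):
    """Sequential access: return the block at the current position and advance."""
    return f.read(BLOCK_SIZE)


def read_n(f, n):
    """Direct access: position to relative block n, then read it."""
    f.seek(n * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)


def reset(f):
    """Sequential access: go back to the beginning of the file."""
    f.seek(0)
```

With `f = io.BytesIO(b"AAAABBBBCCCCDDDD")`, two calls to `read_next(f)` return the first two blocks in order, while `read_n(f, 3)` jumps straight to the last block without reading the ones in between.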

File Access Methods (23 / 69)

Sequential Access

File Sharing (24 / 69)

Sharing of files on multi-user systems is desirable

Sharing may be done through a protection scheme

On distributed systems, files may be shared across a network

Network File System (NFS) is a common distributed file-sharing method

File Sharing (25 / 69)

User IDs identify users, allowing permissions and protections to be per-user

Group IDs allow users to be in groups, permitting group access rights

Attributes for protection

File Sharing (26 / 69)

Protection is based on the use of UserIDs and GroupIDs

Each file has associated protection bits (permissions) for userID and groupID
  userID: read, write, execute?
  groupID: read, write, execute?

File Sharing Remotely (27 / 69)

Idea: open, read, write, .. file as if it is a local file

File Sharing Remotely (28 / 69)

Uses networking to allow file system access between systems
  Manually via programs like FTP
  Automatically, seamlessly using distributed file systems
  Semi-automatically via the World Wide Web

Client-server model allows clients to mount remote file systems from servers
  A server can serve multiple clients
  Client and user-on-client identification is insecure or complicated
  NFS is the standard UNIX client-server file sharing protocol
  CIFS is the standard Windows protocol
  Standard operating system file calls are translated into remote calls

Distributed Information Systems (distributed naming services) such as LDAP, DNS, NIS, Active Directory implement unified access to information needed for remote computing

File Protection (29 / 69)

File owner/creator should be able to control:
  what can be done (read, write, execute, ..)
  by whom (owner, others, group member, ..)

Types of access
  Read
  Write
  Execute
  Append
  Delete
  List

File Protection (30 / 69)

How it is done in Unix
  Mode of access: read, write, execute
  Three classes of users:

                        R W X
  a) owner access   7   1 1 1
  b) group access   6   1 1 0
  c) public access  1   0 0 1

Ask the manager to create a group (unique name), say G, and add some users to the group.

For a particular file (say game) or subdirectory, define an appropriate access.
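The octal digits 7, 6, 1 on this slide are just the three RWX bits of each class read as a binary number. A short sketch of that decoding follows; the function names `rwx` and `mode_string` are illustrative assumptions.

```python
def rwx(digit):
    """Decode one octal digit (0-7) into its read/write/execute letters."""
    return "".join(
        letter if digit & bit else "-"
        for letter, bit in (("r", 4), ("w", 2), ("x", 1))
    )


def mode_string(mode):
    """Decode a three-digit octal mode into owner, group, and public permissions."""
    owner = (mode >> 6) & 7
    group = (mode >> 3) & 7
    public = mode & 7
    return rwx(owner) + rwx(group) + rwx(public)
```

For the example on the slide, `mode_string(0o761)` gives `"rwxrw---x"`: full access for the owner, read and write for the group, execute only for everyone else.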

File Protection (31 / 69)

A sample Unix directory listing

File System Implementation (32 / 69)

File system design involves:

Defining the File System Interface (Done!)
  How the file system looks to the user
  What a file is and what its attributes are
  What the operations are
  Directory structure to organize files

How that File System can be implemented (Now!)
  Design algorithms
  Design data structures (in-memory and on-disk structures)
  Map the logical file system to the physical storage device (disk, CD, ..)
  Mapping depends on the storage device

File System Implementation (33 / 69)

File control block: storage structure consisting of info about a file (on disk)

Layered file system

[Figure: layered file system; three "file sys" boxes sit above "device drivers"]

File System Implementation (34 / 69)

Layers

File System Implementation (35 / 69)

Layers

File System Implementation (36 / 69)

Block-to-sector mapping by the Disk Driver

File System Implementation (37 / 69)

Now our problem is: how to map files onto disk blocks?
  We don’t care about the details underneath (block → sector); the driver handles them.

Block size is a multiple of sector size
  Sector size = 512 bytes; block size can be 1024 or 4096 bytes
  Unix: 4KB
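Because the block size is a multiple of the sector size, the block-to-sector step is plain arithmetic. A minimal sketch follows, assuming sectors are numbered linearly from 0 and the volume starts at sector 0 (a real driver would add a partition offset and deal with the disk geometry); the function name is hypothetical.

```python
SECTOR_SIZE = 512  # bytes, as on the slide


def block_to_sectors(block_number, block_size=4096):
    """Return the sector numbers that make up one file system block."""
    if block_size % SECTOR_SIZE != 0:
        raise ValueError("block size must be a multiple of the sector size")
    sectors_per_block = block_size // SECTOR_SIZE
    first_sector = block_number * sectors_per_block
    return list(range(first_sector, first_sector + sectors_per_block))
```

With 1024-byte blocks each block covers 2 sectors, so block 3 maps to sectors 6 and 7; with 4KB blocks each block covers 8 sectors.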

File System Implementation (38 / 69)

Now our problem is: how to map files onto disk blocks?
  2 files occupying 2 (red) and 3 (blue) blocks

File System Implementation (39 / 69)

Now our problem is: how to map files onto disk blocks?
  2 files occupying 2 (red) and 3 (blue) blocks

File System Implementation (40 / 69)

Now our problem is: how to map files onto disk blocks?
  2 files occupying 2 (red) and 3 (blue) blocks
  Eventually

File System Implementation (41 / 69)

To implement a File System, file system software maintains major on-disk structures, consisting of:
  Boot control block: contains info needed by the system to boot the OS from that volume (at power-on, the 1st block of the disk is accessed and loaded into memory, and a small program that knows where the kernel is gets run)
  Volume control block: contains volume details (how many blocks, directory location, ..)
  Directory structure: organizes files
  Per-file file control block (FCB): contains many details about the file

File System Implementation (42 / 69)

The file system gets commands from the processes and issues them to the driver.

The driver then works on the disk.

Directory Implementation (43 / 69)

When you want to open/read/write/.. a file, you search the directory, for a given file name, to find the corresponding info of the file

The directory is on disk, so a fast search is required

Linear list of file names with pointers to the data blocks
  Simple to program
  Time-consuming to execute

Hash table: linear list with a hash data structure
  Decreases directory search time
  Collisions: situations where two file names hash to the same location
  Only good if entries are fixed size, or use the chained-overflow method
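The trade-off between the two directory layouts can be sketched in a few lines. This is an illustrative in-memory model, not an on-disk format: the entry layout `(file_name, first_block)`, the toy hash (sum of the name's bytes), and the class name `HashedDirectory` are all assumptions.

```python
def linear_lookup(entries, name):
    """Linear list: scan entries in order. Returns (first_block, comparisons made)."""
    for comparisons, (entry_name, first_block) in enumerate(entries, start=1):
        if entry_name == name:
            return first_block, comparisons
    return None, len(entries)


class HashedDirectory:
    """Hash table with chained overflow: colliding names share a bucket's list."""

    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]

    def _bucket(self, name):
        # Toy hash function for illustration: sum of the bytes of the name.
        return self.buckets[sum(name.encode()) % len(self.buckets)]

    def add(self, name, first_block):
        self._bucket(name).append((name, first_block))

    def lookup(self, name):
        # Only the entries chained in one bucket are compared, not the whole directory.
        for entry_name, first_block in self._bucket(name):
            if entry_name == name:
                return first_block
        return None
```

With this toy hash, the names `"ab"` and `"ba"` collide (same byte sum), so both land in the same bucket and the chain is what keeps them distinguishable.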

Allocation Methods (44 / 69)

How can we allocate disk space to files? Important issue!
  A disk is seen (by the file system) as a sequence of blocks
  How are disk blocks allocated for files?

Contiguous allocation

Linked allocation

Indexed allocation

Contiguous Allocation (45 / 69)

Each file occupies a set of contiguous blocks on the disk
  Simple: only the starting location (block #) and length (number of blocks) are required to find the disk data blocks of a file
  Random access is fast
  Waste of space (dynamic storage-allocation problem; external fragmentation)
  Files cannot grow (can grow only if block 10 is empty)

Contiguous Allocation (46 / 69)

Given an offset (byte X of the file, or equivalently, LA = X), what is the corresponding disk location?

Assume the block size is 1024 bytes and the file starts at disk block 6 (startAddr = 6).

Which disk block contains byte 0 of the file (LA = 0)? What is the displacement in that block?
  Answer: disk block = 6 and displacement (disk block offset) = 0

Which disk block contains the byte at LA = 2500? Where is LA = 2500 mapped on disk?
  Answer: 2500 / 1024 = 2 and 2500 % 1024 = 452
  Disk block = startAddr + 2 = 6 + 2 = 8 and displacement = 452
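The calculation above is one integer division and one remainder. A minimal sketch follows; the function name and the `length_blocks` bound check are assumptions (the slide does not state the file length, so the usage below picks 4 blocks purely for illustration).

```python
def contiguous_map(logical_address, start_block, length_blocks, block_size=1024):
    """Map a byte offset in a contiguously allocated file to (disk block, displacement)."""
    index, displacement = divmod(logical_address, block_size)
    if index >= length_blocks:
        raise ValueError("logical address is beyond the end of the file")
    return start_block + index, displacement
```

For a file starting at block 6, `contiguous_map(2500, 6, 4)` returns `(8, 452)`, matching the worked answer.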

Contiguous Allocation (47 / 69)

Extent-Based Systems (48 / 69)

Problem with contiguous allocation: file growing

A modified contiguous allocation scheme (newer systems)
  Initially make a guess: the size of my file will be X
  When the file tries to grow beyond that, allocate another extent and point to it

Extent-based file systems allocate disk blocks in extents
  An extent is a contiguous run of disk blocks
  Extents are allocated for file allocation
  A file consists of one or more extents
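With extents, the mapping is the contiguous calculation applied extent by extent. The sketch below assumes a file is described by an ordered list of `(start_block, length_in_blocks)` pairs; that representation and the function name are illustrative, not something the slides specify.

```python
def extent_map(logical_address, extents, block_size=1024):
    """Map a byte offset to (disk block, displacement) for a file made of extents."""
    index, displacement = divmod(logical_address, block_size)
    for start_block, length in extents:
        if index < length:
            return start_block + index, displacement
        index -= length  # skip this whole extent and keep looking
    raise ValueError("logical address is beyond the end of the file")
```

For example, a file that started as 3 blocks at block 6 and later grew by a 2-block extent at block 20 would be `[(6, 3), (20, 2)]`; offsets in the first 3072 bytes resolve inside the first extent, and later offsets fall into the second.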

Linked Allocation (49 / 69)

Each file is a linked list of disk blocks: blocks may be noncontiguous

Block structure:

Linked Allocation (50 / 69)

The file starts at block 5

Now you can write some data in block 5
If block 5 fills up and there is more to write, find another empty block (3)

Linked Allocation (51 / 69)

Linked Allocation (52 / 69)

Simple: need only the starting address
  Info to be maintained in the directory entry (or FCB) is just 5 (slide 50)
  In contiguous allocation, we additionally maintain the number of blocks (slide 45)
No waste of space (no external fragmentation problem)
No random access (not easy)

Linked Allocation (53 / 69)

Given an offset (byte X of the file, or equivalently, LA = X), what is the corresponding disk location? Logical-to-physical mapping:
  dataSize = blockSize – pointerSize
  blockNumber = X / dataSize (position of the block in the file’s chain, counted from 0)
  displacement (disk block offset) = X % dataSize

Assume block size = 1024 bytes; pointer size = 4 bytes
Assume file size = 4000 bytes
Find the disk location corresponding to X = LA = 2900.
  blockNum = 2900 / 1020 = 2 and 2900 % 1020 = 860
  Offset within the disk block = 860 + 4 = 864 (the 4 is added to skip the pointer stored in the block)
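The same calculation can be sketched in code, including the chain walk that makes random access slow. Assumptions: the pointer occupies the first 4 bytes of each block (which is what the "+ 4" in the example implies), the `next_block` dictionary stands in for reading each block's pointer from disk, and the chain 5 → 3 → 8 → 11 beyond the first two blocks is made up for illustration.

```python
POINTER_SIZE = 4  # bytes reserved in every block for the next-block pointer


def linked_map(logical_address, start_block, next_block, block_size=1024):
    """Map a byte offset to (disk block, offset within that block) under linked allocation."""
    data_size = block_size - POINTER_SIZE
    hops, offset_in_data = divmod(logical_address, data_size)
    block = start_block
    for _ in range(hops):
        block = next_block[block]  # on a real disk, each hop costs one block read
    return block, offset_in_data + POINTER_SIZE
```

With `next_block = {5: 3, 3: 8, 8: 11, 11: None}`, `linked_map(2900, 5, next_block)` follows two pointers and returns `(8, 864)`; the number of hops grows with the offset, which is why random access is not easy here.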

File Allocation Table (FAT) (54 / 69)

Previously, a (small) space in each data block was wasted on the pointer part
  Data size in a block was no longer a power of 2

Collect all these pointers into a table: the File Allocation Table
  # entries in table = # blocks on the disk
  Used in MS-DOS
  Data size in a block is a power of 2 again

File Allocation Table (FAT) (55 / 69)

All pointers are collected into the FAT

How many blocks are allocated to this file?

File Allocation Table (FAT) (56 / 69)

You can load the FAT into memory if it is not that big
  Then random access to the file will be quite fast (unlike linked allocation on disk)
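A FAT is the same linked list with the pointers pulled out of the data blocks into one array indexed by block number. The sketch below is a simplified model, not the MS-DOS on-disk format: the end-of-chain marker `-1`, the function names, and the example chain are assumptions.

```python
END_OF_FILE = -1  # assumed end-of-chain marker for this sketch


def fat_chain(fat, start_block):
    """Follow the table entries from start_block and list the file's blocks in order."""
    chain = []
    block = start_block
    while block != END_OF_FILE:
        chain.append(block)
        block = fat[block]  # a memory lookup when the FAT is cached, not a disk read
    return chain


def fat_map(fat, start_block, logical_address, block_size=1024):
    """Map a byte offset to (disk block, displacement); the whole block holds data."""
    index, displacement = divmod(logical_address, block_size)
    return fat_chain(fat, start_block)[index], displacement
```

The length of `fat_chain(...)` answers "how many blocks are allocated to this file", and because no space in the data block is used for a pointer, the divisor in `fat_map` is the full block size.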

Indexed Allocation (57 / 69)

Brings all pointers together into the index block (not linked anymore)

Logical view:

For sequential reading of the file from beginning to end (= access all the data blocks of the file), access the index table and go from entry 0 to entry N

Index table size = # blocks in file

Indexed Allocation (58 / 69)

19: address of the disk block that contains the index table of the file

Indexed Allocation (59 / 69)

Random access is fast: just go to the 4th entry in the table to access block 3

No waste of space (no external fragmentation)

Mapping from logical file addresses to physical block numbers
  Displacement into index table = LA / blockSize
  Learn the disk block address from that entry
  Displacement into disk block = LA % blockSize

Block size is 512 bytes; what can be the max size of a file mapped by the index table approach?
  The index table sits in one block (called the index block): 512 entries in the table (this counts one entry per byte; with 4-byte entries the block would hold only 512 / 4 = 128 entries)
  Each index entry points to a data block of 512 bytes
  Using a single table, you can point to 512 blocks
  maxFileSize = 512 x 512 bytes = 256KB
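Both the mapping and the size limit can be checked with a short sketch. The function names and the index block contents `[9, 16, 1, 10, 25]` are hypothetical; `entry_size` is exposed as a parameter because the 512-entry figure corresponds to one byte per entry, and a wider entry shrinks the limit.

```python
def indexed_map(logical_address, index_block, block_size=512):
    """Map a byte offset to (disk block, displacement) using a single index block."""
    entry, displacement = divmod(logical_address, block_size)
    return index_block[entry], displacement


def max_file_size(block_size, entry_size):
    """Largest file one index block can describe, in bytes."""
    entries = block_size // entry_size
    return entries * block_size
```

`max_file_size(512, 1)` reproduces the 256KB figure from the slide, while `max_file_size(512, 4)` gives 64KB for 4-byte entries. Access to any block costs one table lookup, regardless of how far into the file it is.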

Indexed Allocation (60 / 69)

For larger files, we need other index blocks

Instead of pointing to a disk block, outer index table points to another index table, which in turn points to many blocks (instead of just 1)
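A two-level lookup splits the block index once more: the quotient picks an inner index table, the remainder picks an entry in it. This is a sketch under assumptions: `index_blocks` is a dictionary standing in for reading an index block from disk, and the parameter `entries_per_block` and both function names are illustrative.

```python
def two_level_map(logical_address, outer_index, index_blocks,
                  block_size=512, entries_per_block=512):
    """Map a byte offset to (disk block, displacement) through two index levels."""
    block_index, displacement = divmod(logical_address, block_size)
    outer_entry, inner_entry = divmod(block_index, entries_per_block)
    inner_table = index_blocks[outer_index[outer_entry]]  # second-level index block
    return inner_table[inner_entry], displacement


def two_level_max_size(block_size, entries_per_block):
    """Largest file reachable through one outer table of inner tables, in bytes."""
    return entries_per_block * entries_per_block * block_size
```

With 512 entries per table and 512-byte blocks, the limit grows from 512 x 512 bytes to 512 x 512 x 512 bytes = 128MB, at the cost of one extra index block read per access.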

Indexed Allocation (61 / 69)

Unix uses indexed allocation, and a small portion of the index table is kept in a structure called the inode

For every file in a Unix file system, we have a corresponding inode

A pointer to this inode structure is in the directory entry associated with the file name

Indirect pointers point to index blocks, which contain pointers to data blocks

Free Space Management (62 / 69)

Done: allocation of disk blocks to files
  How to locate the disk blocks allocated to a file

Now: keep track of the free blocks of the disk
  Necessary while growing a file (need free blocks)

Methods:
  Bit vector (bitmap) method
  Linked list method
  Grouping
  Counting

Free Space Management (63 / 69)

Bit vector method (used in Unix file system)
  We have a bit vector (bitmap) with one bit per block, indicating whether the block is used or free

If the block is free, the corresponding bit is 1, else 0

Free Space Management (64 / 69)

Bit vector method
  Search word by word for efficiency
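Searching word by word means a whole word of allocated blocks (value 0, since 1 marks a free block) is skipped with a single comparison. A minimal sketch, assuming 32-bit words and that the most significant bit of word 0 stands for block 0; the bit ordering and the function name are assumptions.

```python
WORD_BITS = 32  # assumed word size


def first_free_block(words):
    """Return the number of the first free block (bit = 1), or None if the disk is full."""
    for word_index, word in enumerate(words):
        if word == 0:
            continue  # all WORD_BITS blocks in this word are in use
        # Offset of the first 1 bit, counted from the most significant bit.
        offset = WORD_BITS - word.bit_length()
        return word_index * WORD_BITS + offset
    return None
```

The result is (bits per word) x (number of all-zero words skipped) + (offset of the first 1 bit), so only one word ever has to be inspected bit by bit.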

Free Space Management (65 / 69)

Bit vector method
  Cons: the bitmap requires extra space (to be stored)
    Block size 2^12 bytes
    Disk size 2^30 bytes = 1GB
    N = 2^30 / 2^12 = 2^18 blocks exist on disk
    Hence we need 2^18 bits in the bitmap, which makes 2^18 / 8 / 1024 = 32KB of space just to store the bitmap
  Pros: easy to get contiguous files
  Pros: blocks of a file can be kept close to each other (fseek)
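The space cost of the bitmap can be recomputed for other disk and block sizes with a one-line formula. The function name is a hypothetical helper; rounding up to a whole byte is an added detail that does not matter for the power-of-two sizes used on the slide.

```python
def bitmap_size_bytes(disk_size, block_size):
    """Bytes needed for a bitmap with one bit per disk block."""
    blocks = disk_size // block_size
    return (blocks + 7) // 8  # round up to a whole number of bytes
```

For the slide's numbers, `bitmap_size_bytes(2**30, 2**12)` is 32768 bytes, that is 32KB for a 1GB disk, or roughly 0.003% of the disk.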

Free Space Management (66 / 69)

Linked list method
  Each free block has a pointer to the next free block
  We keep a pointer to the 1st free block (in some special location on the disk)
  Cons: cannot get contiguous space easily
  Pros: no waste of space
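The linked free list behaves like a stack of blocks. The sketch below keeps the "next free" pointers in a dictionary, as a stand-in for the pointer stored inside each free block on disk; the class and method names are assumptions.

```python
class FreeList:
    """Free blocks chained together; only the head pointer is stored separately."""

    def __init__(self, head, next_free):
        self.head = head                  # pointer to the 1st free block
        self.next_free = dict(next_free)  # free block -> next free block (None at the end)

    def allocate(self):
        """Take the first free block off the list."""
        if self.head is None:
            raise RuntimeError("no free blocks left")
        block = self.head
        self.head = self.next_free.pop(block)
        return block

    def free(self, block):
        """Put a block back at the front of the list."""
        self.next_free[block] = self.head
        self.head = block
```

Allocation and release touch only the head, so they are cheap, but consecutive allocations return whatever blocks happen to be at the front of the chain, which is why contiguous space is hard to obtain this way.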

Free Space Management (67 / 69)

Linked list method

Free Space Management (68 / 69)

Grouping
  A free block contains multiple free block pointers/addresses

Free Space Management (69 / 69)

Counting
  Besides the free block pointer, keep a counter saying how many blocks are free contiguously after that free block
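The counting representation can be sketched by compressing a per-block free/used map into runs. Here each run is recorded as `(first free block, number of contiguous free blocks starting there)`, which is one possible reading of the counter described above; the function name and the input format are assumptions.

```python
def free_runs(is_free):
    """Turn a per-block free flag list into (start_block, run_length) pairs."""
    runs = []
    start = None
    for block, free in enumerate(is_free):
        if free and start is None:
            start = block                          # a new free run begins
        elif not free and start is not None:
            runs.append((start, block - start))    # the run ended just before this block
            start = None
    if start is not None:                          # a run that reaches the end of the disk
        runs.append((start, len(is_free) - start))
    return runs
```

A disk with free blocks 2 to 5 and 8 to 9 is then described by just two entries, `(2, 4)` and `(8, 2)`, and a request for 3 contiguous blocks can be answered by looking at the counts alone.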