CS 6560 Operating System Design Lecture 11 File Systems

Page 1: CS 6560 Operating System Design

CS 6560 Operating System Design

Lecture 11

File Systems

Page 2: CS 6560 Operating System Design

References (please read)

Web

– http://www.tldp.org/LDP/tlk/fs/filesystem.html

– http://www.tldp.org/HOWTO/Filesystems-HOWTO.html

– http://www.linuxjournal.com/article.php?sid=2108

Also available in BB

– KLEIMAN, S. R. 1986. Vnodes: An architecture for multiple file system types in Sun UNIX. In USENIX Conference Proceedings (June). USENIX, Berkeley, Calif., 238-247

Page 3: CS 6560 Operating System Design

Building Understanding

• An architecture describes the components of a system and the connections among them.

• For many systems, there are multiple ways to view a system.

• Example: file systems

Page 4: CS 6560 Operating System Design

File System Levels of Abstraction

• User Level: Files and directories that a user sees in a hierarchical naming space

• Mounted Device Level: Collection of devices, each of which holds a separate file system

• File system Level: i-node tables and associated data blocks

• Block Level: An array of data blocks.

• Physical Level: Sectors spinning on a hard disk, CD-ROM, or DVD (and even static data on a USB device).

Page 5: CS 6560 Operating System Design

Files

• Files are named units of persistent storage of information. (Here, persistent means that the names and the information do not disappear when the computer is shut down and then turned back on.)

• Files can be structured, but we will follow the lead of Thompson and Ritchie, and only consider the case where they are arrays of bytes.

• The basic file operations are the classic Unix ones: open, close, read (so many bytes), write (so many bytes), seek, and execute.

• In Unix-like systems, files exist independently of their names.

Page 6: CS 6560 Operating System Design

Thompson Model: Single Naming Space

• There is just one naming space in which files are named.

• It is structured as a rooted directed graph (digraph) whose vertices are the files and whose edges are directory links.

Page 7: CS 6560 Operating System Design

Thompson Model: Mounting

• This naming space is constructed by mounting devices together at directory links.

• Each device has its own file system.

• One file system serves as the root file system. Its root is the root of the entire digraph.

Page 8: CS 6560 Operating System Design

Thompson Model: File System

• Within each file system, file names are also structured as a rooted directed graph (digraph) whose vertices are the files and whose edges are directory links.

• Directories are files that supply and name the edges to this graph. Each directory contains a list of links that link a filename to an inode number.

• The inode number (or i-number) uniquely identifies the file within a particular device (filesystem). The pair of device

• Several filesystems can be mounted together to form the larger rooted digraph. The root of one file system is grafted to a directory in another.

Page 9: CS 6560 Operating System Design

Thompson Model: Internal Structure

• Internally a Thompson file system consists of

– Boot block: can contain a boot program.

– Superblock: a block that contains management data for the entire filesystem, including a list of free inodes.

– I-node table: an indexed table of inodes (spans several blocks, with originally 64 bytes per inode). Each inode is a reusable structure that contains the attributes for one file, but no file name. Not all elements of the inode list are in use; some are free.

– Data blocks: contain the actual data.
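The on-disk layout above makes it possible to compute where an inode lives from its number alone. A minimal sketch, assuming 512-byte blocks, an inode table that starts right after the boot block and superblock, and inode numbers that start at 1 (only the 64-byte inode size comes from the slide; the names are hypothetical):

```c
#include <stdint.h>

/* Assumed layout parameters; only INODE_SIZE is taken from the slide. */
enum {
    BLK_SIZE          = 512,                  /* assumption */
    INODE_SIZE        = 64,                   /* from the slide */
    INODES_PER_BLOCK  = BLK_SIZE / INODE_SIZE,
    FIRST_INODE_BLOCK = 2                     /* assumption: block 0 = boot block, block 1 = superblock */
};

/* Block that holds inode number ino (inode numbers are assumed to start at 1). */
uint32_t inode_block(uint32_t ino)
{
    return FIRST_INODE_BLOCK + (ino - 1) / INODES_PER_BLOCK;
}

/* Byte offset of that inode within its block. */
uint32_t inode_offset(uint32_t ino)
{
    return ((ino - 1) % INODES_PER_BLOCK) * INODE_SIZE;
}
```

Because the inode table is a plain indexed array, no search is needed: the i-number is the index.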

Page 10: CS 6560 Operating System Design

Classic Unix File System

Thompson Model of a filesystem

Page 11: CS 6560 Operating System Design

Structure of I-node

Page 12: CS 6560 Operating System Design

i-nodes for open files

Page 13: CS 6560 Operating System Design

Thompson Implementation of Directories

• A directory is a list of 16-byte directory entries.

• Each directory entry consists of

– i-node number (2 bytes)

– name (14 characters max)

• Directories were files that could be read.

[Figure: a directory entry, with the i-node number (2 bytes) followed by the file name (14 bytes)]
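The 16-byte entry can be written down directly as a C structure. This is a sketch rather than the historical source: the names dir_entry and dir_lookup are hypothetical, and treating i-node number 0 as "unused slot" is an assumption.

```c
#include <stdint.h>
#include <string.h>

#define NAME_MAX_LEN 14   /* from the slide: 14 characters max */

/* One 16-byte directory entry: 2-byte i-node number plus 14-byte name. */
struct dir_entry {
    uint16_t ino;                 /* assumption: 0 marks an unused slot */
    char     name[NAME_MAX_LEN];  /* NUL-padded; no terminator when all 14 bytes are used */
};

/* Scan the entries for name; return its i-node number, or 0 if absent. */
uint16_t dir_lookup(const struct dir_entry *dir, size_t count, const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (dir[i].ino != 0 && strncmp(dir[i].name, name, NAME_MAX_LEN) == 0)
            return dir[i].ino;
    }
    return 0;
}
```

Note that the entry holds only a name and a number; every other attribute of the file lives in the inode.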

Page 14: CS 6560 Operating System Design

Linux File systems

• Linux Virtual File System (today)

• Linux Ext2 (and Ext3) physical file systems

Page 15: CS 6560 Operating System Design

File System Case Study: Linux VFS

• Linux Virtual File System

– Serves as a common abstract interface between the system call interface and the actual file systems.

– Provides uniform access to a large number of different real file systems such as MSDOS, VFAT, NTFS, Apple, OS/2, NFS, Ext2, Ext3, …

– Supports special filesystems such as proc, pipefs, ramfs, tmpfs, sysfs.

– Works with the buffer cache.

– Supported by slab caches for inodes and directory links.

– Grew from Sun Microsystems’ Vnodes architecture (1986) (see KLEIMAN, S. R. 1986. Vnodes: An architecture for multiple file system types in Sun UNIX. In USENIX Conference Proceedings (June). USENIX, Berkeley, Calif., 238-247) (on BB)

Page 16: CS 6560 Operating System Design

Components of the FS

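[Figure: components of the FS. The VFS layer sits above the individual file systems (EXT2, EXT3, VFAT, NTFS, NFS, proc) and is shown with the directory cache and the inode cache; the buffer cache and the disk drivers lie below.]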

Page 17: CS 6560 Operating System Design

Common File System Model

• VFS uses a common file system model in which every file is accessed in the same way.

• This model follows the original concepts of Thompson and Ritchie’s Unix file system.

• It appears to users (via shells and application programs) as

– Filesystems

– Files

– Directories

– Paths

– Symbolic links

Page 18: CS 6560 Operating System Design

Filesystems

• Each filesystem

– Represents one storage device for files.

– Has a unique device id.

– Appears in two ways:

  • as a directed graph whose vertices are its files and whose edges are hard links. Internal nodes are called directories.

  • as a list of its files, indexed by inode number.

– Has a root that is one of its own directories and, when mounted, has a mount point that refers to a directory on another filesystem.

Page 19: CS 6560 Operating System Design

Files

• Files have a type: regular file, directory, pipe, character device, block device, …

• Files are essentially treated as linear arrays of bytes with open, close, read, write, seek, lock operations (although not every operation is available for each file).

• Each file has an inode that stores a set of attributes which include inode number, permissions, time stamps, ownership, and type (including the type directory and the type symbolic link).

• File attributes do not include the file name.

Page 20: CS 6560 Operating System Design

How these relate

• Files are organized in non-overlapping filesystems.

• Each file is associated with an inode number that uniquely identifies it within its filesystem.
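Since a filesystem has a unique device id and an inode number is unique within its filesystem, the pair (device id, inode number) identifies a file regardless of the name used to reach it. A small sketch using the POSIX stat call; the helper name same_file is hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/* Return 1 if both paths name the same file, 0 if not, -1 if either stat fails. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    /* st_dev identifies the filesystem, st_ino the file within it. */
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

For example, same_file("/", "/.") reports a match, because both paths lead to the root directory's inode.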

Page 21: CS 6560 Operating System Design

Directories

• Directories are special types of files. Each directory appears as an internal node of some filesystem.

• Each directory has a set of directory entries consisting of two special internal links named “.” and “..” and all outgoing hard links.

• The entry “.” links a directory to itself.

• The entry “..” specifies the parent directory. The parent directory in turn has an entry that is an outgoing link to this directory.

• Each outgoing link has a filename that is a string satisfying some rules regarding admissible characters and length. (These rules depend upon the filesystem type.)

• Each entry has an inode number that uniquely identifies a file within the same filesystem.

• Directories cannot be accessed with read and write operations; they are accessed through “opendir” and “readdir” operations. (This is a change from the original Thompson model.)
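As an illustration of the opendir and readdir interface, the sketch below scans a directory for one entry name. The helper name dir_has_entry is hypothetical; the calls themselves are the standard POSIX ones.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <string.h>

/* Return 1 if directory dir has an entry called name, 0 if not, -1 on error. */
int dir_has_entry(const char *dir, const char *name)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    int found = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        /* Each entry pairs a filename (d_name) with an inode number (d_ino). */
        if (strcmp(e->d_name, name) == 0) {
            found = 1;
            break;
        }
    }
    closedir(d);
    return found;
}
```

On typical Linux file systems the scan also returns the two special entries “.” and “..”.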

Page 22: CS 6560 Operating System Design

Paths

• Paths can be used as arguments to operations such as open, chdir, and mkdir.

• Paths consist of filenames separated by “/”s.

• They may begin with “/”, “.”, “..”, or “~”. (“~” is expanded by the shell, not by the kernel.)

• Paths are absolute (relative to the root) if they begin with “/”, or relative (relative to the current directory) if they don’t.

• Paths beginning with “/” and containing no “..” and “.” define the tree. (Any “.” inside a path is ignored, and a leading “.” anchors the path at the current directory much as a leading “/” anchors it at the root. For example, /usr/./bin names the same file as /usr/bin.)

• Paths may include symbolic links and mount points, as well as hard links.
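The path rules above can be sketched as two small string helpers. This is only an illustration of the splitting rules (it does not follow symbolic links or mount points), and both function names are hypothetical:

```c
#include <string.h>

/* Return 1 for an absolute path (begins with '/'), 0 for a relative one. */
int path_is_absolute(const char *path)
{
    return path[0] == '/';
}

/* Count the filename components of path, skipping empty ones and ".". */
int path_component_count(const char *path)
{
    int count = 0;
    const char *p = path;
    while (*p != '\0') {
        size_t len = strcspn(p, "/");      /* length of the next component */
        if (len > 0 && !(len == 1 && p[0] == '.'))
            count++;
        p += len;
        if (*p == '/')
            p++;                           /* step over the separator */
    }
    return count;
}
```

With these rules, "/usr/./bin" and "/usr/bin" both have two components, which matches the statement that “.” is ignored.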

Page 23: CS 6560 Operating System Design

Symbolic Links

• Each symbolic link is a file that connects a directory in one filesystem to a file in a possibly different filesystem.

• Symbolic links relate a directory entry to a path.

Page 24: CS 6560 Operating System Design

Directory Tree

• The entire system of files is organized like a tree (digraph) for each process. Several nodes of this tree may share the same file due to multiple hard and symbolic links.

• Each process has two access points to this tree: a root directory and a current directory. These correspond to paths that begin with “/” and “.”

• System calls such as creat, open, chdir, mkdir, unlink, rmdir, readdir operate on this tree.

Page 25: CS 6560 Operating System Design

Mount Operations

• The “mount” command (system call “mount”) mounts a filesystem specified by a block special device file at a given mount point. The “/etc/fstab” file assists with this and determines the mounts made at boot time.

• Mounting adds the filesystem to the naming space and attaches it at the mount point.

• The “umount” command (system call “umount”) unmounts a filesystem.

• Only the superuser can mount and umount filesystems.

Page 26: CS 6560 Operating System Design

VFS Internal Structure

• Linux VFS has the following internal objects that exist in the virtual memory of the kernel. These objects have operation lists that consist of function pointers to functions in the real file systems.

– superblock objects

– inode objects

– file objects

– dentry objects

• Linux VFS also has structure types that are not treated as objects.

– vfsmount

– file_system_type
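The idea of an operation list made of function pointers can be shown in a few lines of C. This is a toy sketch, not kernel code: every name in it (toy_super, toy_super_ops, diskfs, procfs, vfs_free_blocks) is hypothetical.

```c
struct toy_super;                       /* hypothetical stand-in for a superblock object */

/* Operation list: function pointers supplied by the real file system. */
struct toy_super_ops {
    long (*free_blocks)(const struct toy_super *sb);
};

struct toy_super {
    const struct toy_super_ops *ops;    /* which file system implements the calls */
    long total_blocks;
    long used_blocks;
};

/* Two made-up file systems with different implementations. */
long diskfs_free_blocks(const struct toy_super *sb)
{
    return sb->total_blocks - sb->used_blocks;
}

long procfs_free_blocks(const struct toy_super *sb)
{
    (void)sb;
    return 0;                           /* a pseudo file system has no blocks to report */
}

const struct toy_super_ops diskfs_ops = { diskfs_free_blocks };
const struct toy_super_ops procfs_ops = { procfs_free_blocks };

/* The generic layer calls through the table without knowing the type. */
long vfs_free_blocks(const struct toy_super *sb)
{
    return sb->ops->free_blocks(sb);
}
```

The generic function never names a concrete file system, which is what lets one system call interface serve many file system types.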

Page 27: CS 6560 Operating System Design

VFS File Objects and Structures

• superblock object

– Represents an entire mounted filesystem

• inode object

– Represents a particular file in a mounted filesystem

• file object

– Represents an instance of an opened file

• dentry object

– Represents a path component (a name paired with an inode)

• vfsmount

– Represents a mount point

• file_system_type

– Represents a filesystem type

Page 28: CS 6560 Operating System Design

Superblock Objects

• A superblock object stores information about, and operations on, a mounted file system.

• For disk-based file systems, this corresponds to the file system control block (FSCB) or superblock on the disk.

• Information includes: block size, maximum file size, filesystem type, disk synch status, flags, mount point, reference count.

• Operations act on the inodes (read, write, release, delete) in the filesystem and the superblock itself (release, write, get statistics). (see the textbook for details.)

• The superblocks are organized in a list headed by a global variable “super_blocks” and a list for each file system type.

• They can head lists of associated file objects and inode objects.

Page 29: CS 6560 Operating System Design

Superblock operations

The super_operations Structure

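/* As shown on the slide; this matches early Linux 2.x kernels, and later kernels differ. */
struct inode;
struct iattr;
struct super_block;
struct statfs;

struct super_operations {
    /* fill the structure */
    void (*read_inode) (struct inode *);                          /* read an inode */
    int (*notify_change) (struct inode *, struct iattr *);        /* inode attributes changed */
    void (*write_inode) (struct inode *);                         /* write an inode */
    void (*put_inode) (struct inode *);                           /* release an inode */
    void (*put_super) (struct super_block *);                     /* release the superblock */
    void (*write_super) (struct super_block *);                   /* write the superblock */
    void (*statfs) (struct super_block *, struct statfs *, int);  /* get statistics */
    int (*remount_fs) (struct super_block *, int *, char *);      /* remount the filesystem */
};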

Page 30: CS 6560 Operating System Design

Inode Objects

• An inode object stores information and operations about a specific file.

• Each inode object belongs to a mounted filesystem.

• Each inode is associated with an inode number that uniquely identifies it within its filesystem.

• Inode information includes file attributes such as time, ownership, size (but no filename).

• Operations include create new disk inode, lookup directory entry, create new inode object of various types, create a hard link, create a symbolic link, move files within filesystem, follow symbolic links, truncate files, check permissions.

• All inode objects are contained in a kernel virtual memory slab cache called the inode cache.

• Inode objects can head lists of dentries and buffers.

• Inode objects can point to block or character device drivers.

Page 31: CS 6560 Operating System Design

File Objects

• Each file object stores information and operations about an opened file and the process that opened it.

• Among the information stored in a file object is a file position for reading and writing.

• File operations include seek, read, write, read directory, ioctl, poll, memory map, open, release, flush, synch with disk, lock (see the book for details).

• Each file object is associated with a dentry object and a mounted file system.

• Each file object can be part of a doubly linked list.

Page 32: CS 6560 Operating System Design

Dentry Objects

• A dentry object stores information about a hard link.

• The dentry objects are organized in a container called the dentry cache.

• Memory for dentry objects is managed by the memory manager’s slab allocator.

• Information in a dentry object includes the filename as it appears as a component of a path.

• Dentry objects have state: used, unused, and invalid.

• Each dentry object can point to an inode object and a superblock object.

• Dentry operations include revalidate, hash (for fast lookup in the dcache), name comparison, delete dentry, release dentry. (See the textbook for details.)

Page 33: CS 6560 Operating System Design

The Dentry Cache

• The dentry cache consists of dentry objects organized in three ways:

– Active dentry objects organized like a tree, with a root and parent-child relationships maintained at each dentry object. This corresponds to the absolute paths without “..” and “.”.

– A least recently used list for memory management.

– A hash table that provides fast lookup from path to dentry.

• The dentry cache provides a front end to an inode cache.
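The hash-table view of the dentry cache can be sketched as follows. This is a toy model under stated assumptions: the table is keyed on (parent dentry, component name), so a full path would be resolved one component at a time, and all names (toy_dentry, toy_hash, toy_insert, toy_lookup) and the hash function are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define TOY_BUCKETS 64

/* Hypothetical, much-reduced dentry: a name under a parent, chained per bucket. */
struct toy_dentry {
    const struct toy_dentry *parent;
    const char *name;
    unsigned long ino;             /* stands in for the pointer to an inode object */
    struct toy_dentry *hash_next;
};

struct toy_dentry *toy_table[TOY_BUCKETS];

/* Hash the component name together with the identity of its parent. */
unsigned toy_hash(const struct toy_dentry *parent, const char *name)
{
    uintptr_t h = (uintptr_t)parent;
    while (*name != '\0')
        h = h * 31 + (unsigned char)*name++;
    return (unsigned)(h % TOY_BUCKETS);
}

void toy_insert(struct toy_dentry *d)
{
    unsigned b = toy_hash(d->parent, d->name);
    d->hash_next = toy_table[b];
    toy_table[b] = d;
}

/* Look up one path component under parent; NULL means a cache miss. */
struct toy_dentry *toy_lookup(const struct toy_dentry *parent, const char *name)
{
    struct toy_dentry *d = toy_table[toy_hash(parent, name)];
    for (; d != NULL; d = d->hash_next)
        if (d->parent == parent && strcmp(d->name, name) == 0)
            return d;
    return NULL;
}
```

On a miss, a real implementation would fall back to the file system's directory lookup; the sketch leaves that out.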

Page 34: CS 6560 Operating System Design

VFS and Processes

• Recall: Each process has an entry in the process table called a process descriptor, implemented as the “task_struct” type.

• The entire process table is implemented as a linked list and also as a hash table.

• Each process descriptor contains management (scheduling) information about the process and pointers to other structures including: tty (terminal driver), fs (virtual filesystem root and current directory entries), files (currently open files), mm (virtual memory descriptor), sig (signal handling info).

Page 35: CS 6560 Operating System Design

Process Descriptor

[Figure: the process descriptor holds scheduling info, process hierarchy info, and the pointers files, fs, mm, and sig. These lead to files_struct (open files), fs_struct, mm_struct (memory descriptor), and signal_struct.]

Page 36: CS 6560 Operating System Design

fs_struct

• This table specifies the dentry objects of the process’ root and current directory.

• It also contains the process’ umask (a bit mask used to automatically turn off permissions when creating files).
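The effect of the umask is a single bit operation: every permission bit set in the mask is cleared from the requested mode. A minimal sketch (the function name effective_mode is hypothetical):

```c
#include <sys/types.h>

/* Permissions a new file receives: requested bits with the umask bits turned off. */
mode_t effective_mode(mode_t requested, mode_t mask)
{
    return requested & ~mask;
}
```

For example, a request for mode 0666 under umask 022 yields 0644.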

Page 37: CS 6560 Operating System Design

Process Descriptor’s fs_struct

[Figure: the process descriptor’s fs pointer leads to fs_struct. Its root and current fields each lead to a dentry object, with an associated superblock object and vfsmount structure.]

Page 38: CS 6560 Operating System Design

files_struct

• This table specifies the opened files of a process.

• It contains a pointer to an array of file objects, indexed by file descriptor (the descriptor is returned when the file is opened or created, or is inherited).

• It contains bit maps to indicate which file objects are active and which are to be closed on exec.

• It also contains the current and max number of file objects and the number of processes sharing this table.
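A reduced model of such a table is sketched below. The names (toy_files, toy_file, toy_fd_install, toy_fd_close) and the fixed size of 32 descriptors are assumptions made for illustration; the lowest-free-descriptor rule follows the usual Unix behavior of open.

```c
#include <stddef.h>

#define TOY_MAX_FDS 32

/* Hypothetical stand-in for a file object. */
struct toy_file {
    long f_pos;
};

/* Much-reduced open files table: fd-indexed array plus two bit maps. */
struct toy_files {
    struct toy_file *fd[TOY_MAX_FDS];    /* file objects, indexed by descriptor */
    unsigned long open_fds;              /* bit n set: descriptor n is in use */
    unsigned long close_on_exec;         /* bit n set: close descriptor n on exec */
    int count;                           /* number of processes sharing the table */
};

/* Install f at the lowest free descriptor; return it, or -1 if the table is full. */
int toy_fd_install(struct toy_files *t, struct toy_file *f)
{
    for (int n = 0; n < TOY_MAX_FDS; n++) {
        if (!(t->open_fds & (1UL << n))) {
            t->open_fds |= 1UL << n;
            t->fd[n] = f;
            return n;
        }
    }
    return -1;
}

/* Release descriptor n and clear its flags. */
void toy_fd_close(struct toy_files *t, int n)
{
    t->open_fds &= ~(1UL << n);
    t->close_on_exec &= ~(1UL << n);
    t->fd[n] = NULL;
}
```

A file descriptor is therefore nothing more than an index into this per-process array.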

Page 39: CS 6560 Operating System Design

Open Files of a Process

[Figure: the process descriptor’s files pointer leads to files_struct (open files). Each entry of its fd array points to a file object (an open file, which has f_pos); each file object points to a dentry object (a directory link); each dentry object points to an inode object (which represents the actual file).]

Page 40: CS 6560 Operating System Design

Relationships

• The open files table may be shared by several processes. This happens when processes share their address space (threads).

• Each open files table has a list of open files (file objects), indexed by file descriptor (returned from opening the file)

• Each open file (file object) can be shared by several open files tables and hence by several processes. This happens when a process forks. Parent and child share the same open files.

• Each open file (file object) has one dentry object.

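• Each dentry object can be shared by several file objects. This happens when the same name is opened more than once. (dup, and the redirection built on it, instead makes several file descriptors share one file object.)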

• Each dentry object has one inode object.

• Each inode object can be shared by several dentry objects. This happens because of hard links. (A symbolic link is a separate file with its own inode.)

• Each inode has one superblock object, making it belong to one mounted filesystem.
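Because the file position is kept in the file object, sharing can be observed from user space. The sketch below reads through one descriptor and then checks the position seen through a dup of it (same file object) and through an independent open of the same file (separate file object). The helper name compare_offsets and the scratch file path are hypothetical; the calls are standard POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read 2 bytes through one descriptor, then report the file position seen
 * through a dup of it and through an independent open of the same file.
 * Returns 0 on success, -1 on any failure.
 */
int compare_offsets(off_t *dup_pos, off_t *reopen_pos)
{
    char path[] = "/tmp/vfs_sketch_XXXXXX";   /* hypothetical scratch file */
    char buf[2];
    int rc = -1;

    int fd = mkstemp(path);                   /* first file object */
    if (fd < 0)
        return -1;
    int dupfd = dup(fd);                      /* new descriptor, same file object */
    int fd2 = open(path, O_RDONLY);           /* second file object for the same file */

    if (dupfd >= 0 && fd2 >= 0 &&
        write(fd, "abcdef", 6) == 6 &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        read(fd, buf, sizeof buf) == 2) {
        *dup_pos = lseek(dupfd, 0, SEEK_CUR);
        *reopen_pos = lseek(fd2, 0, SEEK_CUR);
        rc = 0;
    }

    if (fd2 >= 0)
        close(fd2);
    if (dupfd >= 0)
        close(dupfd);
    close(fd);
    unlink(path);
    return rc;
}
```

The dup'ed descriptor moves along with the original, while the independently opened one keeps its own position.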

Page 41: CS 6560 Operating System Design

file_system_type objects

• The system maintains a (linked) list of valid file system types. Each file system type is represented by a file_system_type object.

• This object has a name for the file system type.

• This object has a method for creating a superblock object.

• It also points to the module that governs this file system type (if modularized).

• This object heads a list of superblock objects that belong to this file system type.

Page 42: CS 6560 Operating System Design

File system types

The file_system_type Structure

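struct file_system_type {
    /* creates the superblock object for a filesystem of this type */
    struct super_block *(*read_super) (struct super_block *, void *, int);
    const char *name;              /* name of the file system type */
    int requires_dev;              /* nonzero if the type needs a device */
    /* there's a linked list of types */
    struct file_system_type *next;
};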
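The linked list of types suggests a simple register-and-find pattern. The sketch below is a toy model, not the kernel's own registration code; the names toy_fs_type, toy_register_fs, and toy_find_fs are hypothetical, and refusing duplicate names is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced type record; only name and next mirror the slide. */
struct toy_fs_type {
    const char *name;
    struct toy_fs_type *next;
};

struct toy_fs_type *toy_fs_types = NULL;   /* head of the list of known types */

/* Add a type to the list; refuse a duplicate name. Returns 0 or -1. */
int toy_register_fs(struct toy_fs_type *fs)
{
    for (struct toy_fs_type *p = toy_fs_types; p != NULL; p = p->next)
        if (strcmp(p->name, fs->name) == 0)
            return -1;
    fs->next = toy_fs_types;
    toy_fs_types = fs;
    return 0;
}

/* Find a type by name, as a mount request would need to; NULL if unknown. */
struct toy_fs_type *toy_find_fs(const char *name)
{
    for (struct toy_fs_type *p = toy_fs_types; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    return NULL;
}
```

Once the type is found, its superblock-creating method would be called to build the superblock object for the new mount.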

Page 43: CS 6560 Operating System Design

Mounted Filesystems by Type

[Figure: each file_system_type object heads a list of superblock objects, and each superblock object points to the dentry of its mount.]