Reading
• For this lecture, you should have read Chapter 10 (Sections 1-5) and Chapter 11 (Sections 1-4).
NETW3005 (Operating Systems) Lecture 09 - File Systems Interface 2
Last Lecture – virtual memory
• Demand paging
• Page replacement algorithms
• Frame allocation
• Thrashing
This Lecture
• What’s a file?
• File access methods
• Directory structure
• File system implementation
• Disk allocation methods
Storage Management: A Recap
• The last two lectures have been concerned with moving data into and out of main memory.
• Note: primary memory is only temporary storage.
• Storage management is also concerned with the issue of storing data on non-volatile devices.
Motivation for the File Concept
• For many purposes, a programmer doesn’t care about what medium data is stored in.
• All they care about is the data itself, and how to get at it.
• The issue about how the data is stored can be left to the operating system.
Motivation for the File Concept
• The operating system provides a logical unit of storage for the user, called a file.
• The user refers to files.
• The operating system maps files onto regions of secondary storage.
• Files are really an artifact of the dialogue between the user and the O/S.
How should we define a file?
• What’s the point of having files?
– That’s what we’ve just answered.
• What does a file hold?
– A collection of related data, e.g. the sequence of lines in a program, the sequence of words in a text document.
• Where is the data stored?
– Secondary storage. (Probably more precise to say ‘not in main memory’.)
How should we define a file?
• What structure does a file have?
– Different files have different structures, e.g. text files are broken into units with line breaks.
• What can a user do with a file?
– Create, write, read, reposition, delete, truncate. (These are all system calls.)
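The operations listed above map fairly directly onto system calls. The following is a minimal sketch using Python's `os` module, which wraps the underlying calls; the file name `demo.txt` and the temporary directory are assumptions made for illustration only.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")

fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)  # create (and open)
os.write(fd, b"hello, file system")                 # write
os.lseek(fd, 7, os.SEEK_SET)                        # reposition to byte 7
chunk = os.read(fd, 4)                              # read 4 bytes from there
os.ftruncate(fd, 5)                                 # truncate to 5 bytes
size_after = os.fstat(fd).st_size
os.close(fd)
os.unlink(path)                                     # delete
os.rmdir(workdir)
```

Each call names the file through the descriptor `fd` rather than its path, which is why the file has to be opened first.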
File attributes
• What information should the OS store about each file in the file system?
– File name and type.
– Location and size.
– Protection.
– Housekeeping information.
• Where is all this information kept?
– In a directory.
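Most of these attributes can be inspected from a user program. A small sketch, assuming Python's `os.stat` as the query interface; the temporary file and the dictionary keys are illustrative choices, not part of any fixed API.

```python
import os
import stat
import tempfile
import time

# Create a throwaway file so there is something to inspect.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as f:
    f.write(b"twelve bytes")
    name = f.name

info = os.stat(name)
attributes = {
    "name": os.path.basename(name),
    "type": "regular file" if stat.S_ISREG(info.st_mode) else "other",
    "size": info.st_size,                        # in bytes
    "protection": stat.filemode(info.st_mode),   # e.g. a string like -rw-------
    "modified": time.ctime(info.st_mtime),       # housekeeping information
}
os.unlink(name)
```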
File Operations
• To carry out an operation on a file, we have to know where it is.
• To avoid the overhead of searching every time, many systems require that a file is opened before using it.
• The system maintains an open file table which records the location of the file, and how it is currently being used.
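To illustrate the idea, here is a toy model of an open file table. The class name, the fields and the block number 120 are all hypothetical; a real kernel table holds more state than this.

```python
class OpenFileTable:
    """Toy table with one entry per open file."""

    def __init__(self, directory):
        self.directory = directory   # file name -> disk location (assumed)
        self.entries = {}            # handle -> entry
        self.next_handle = 0

    def open(self, name):
        location = self.directory[name]   # the only directory search
        handle = self.next_handle
        self.next_handle += 1
        self.entries[handle] = {"location": location, "position": 0}
        return handle

    def read(self, handle, nbytes):
        # No search here: the entry already records where the file is.
        entry = self.entries[handle]
        start = entry["position"]
        entry["position"] += nbytes
        return entry["location"], start, nbytes

    def close(self, handle):
        del self.entries[handle]


table = OpenFileTable({"notes.txt": 120})
h = table.open("notes.txt")
first = table.read(h, 10)
second = table.read(h, 10)
table.close(h)
```

The directory is consulted once, in `open`; later reads only use the handle and the recorded position.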
Memory-mapped files
• Opening a memory-mapped file causes a region of a process’s virtual memory to be associated with the file.
• Reads and writes to the file are implemented as reads and writes to this memory region.
• Closing the file causes the region of memory to be written back to the disk.
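A short sketch of the same idea using Python's `mmap` module. The temporary file and its contents are assumptions for illustration.

```python
import mmap
import os
import tempfile

# Create a small file to map.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello world")
    name = f.name

with open(name, "r+b") as f:
    region = mmap.mmap(f.fileno(), 0)   # length 0 maps the whole file
    region[0:5] = b"HELLO"              # a memory write, not a write() call
    region.flush()                      # push the change back to the file
    region.close()

with open(name, "rb") as f:
    contents = f.read()
os.unlink(name)
```

After the mapping is closed, an ordinary read of the file sees the bytes that were changed in memory.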
File types
• Since files can store many different types of data, some systems require the type of data in a file to be specified explicitly.
• Some common file types:
– executable programs,
– source code and text,
– application specific documents,
– images, etc.
Advantages/disadvantages
• Advantages.
– Knowing the file type limits the choice of which applications can process that file.
– The system can avoid attempting inapplicable operations, e.g. printing out a binary file.
• Disadvantages.
– Hard to deal with new file formats, e.g. encrypted files (which are binary, but not executable).
File structure
• File types can be used to indicate the internal structure of a file.
• Any operating system has to know about at least one file format: executable programs.
• An O/S can usually support a larger set of types.
Strategies
• A minimal strategy, e.g. UNIX. A file is just a sequence of 8-bit bytes.
• An intermediate strategy, e.g. Mac-OS. A file consists of a resource fork and a data fork.
• An extreme strategy, e.g. MS Windows. Every file has an associated type embedded in its file name extension.
File access methods - Sequential
• Most common method.
• A file pointer identifies a record within the file.
• It can be moved incrementally forwards (in read or write operations) or to the beginning (in rewinding).
• The hardware metaphor is a tape.
File access methods - Direct
• A file is viewed as a numbered sequence of records.
• Operations (e.g. read, write) can be carried out on any record in any order.
• The hardware metaphor is a disk.
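The two access methods can be contrasted in a few lines. This is a sketch only: the 8-byte record size and the in-memory stand-in for a file are assumptions made for the example.

```python
import io

RECORD_SIZE = 8  # assumed fixed record length in bytes

# An in-memory stand-in for a file of four fixed-size records.
record_file = io.BytesIO(b"".join(f"rec{i:05d}".encode() for i in range(4)))

# Sequential access: each read advances the file pointer by one record.
record_file.seek(0)  # rewind to the beginning
sequential = [record_file.read(RECORD_SIZE) for _ in range(4)]


# Direct access: compute the position of record n and jump straight there.
def read_record(handle, n):
    handle.seek(n * RECORD_SIZE)
    return handle.read(RECORD_SIZE)


third = read_record(record_file, 2)
```

Sequential access only ever moves the pointer forwards or back to the start; direct access computes an arbitrary position from the record number.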
Organising groups of files
• Since the number of files in a system can be large, it makes sense to group them in various ways.
• Normally done in two levels.
– The file system is first divided into partitions. A partition can be thought of as a virtual disk.
– Each partition contains a directory of files that reside on it.
[Figure: disc storage divided into partitions (partition 1, partition 2), each with its own directory.]
A simple model of user directories
• As we saw earlier, a directory is a table, relating a file name to its attributes.
• Simplest method uses a single table.
• But there are problems:
– length and uniqueness of filenames,
– multiple users,
– searching large directories.
Multiple directories
• One directory per user.
• Have a master directory, which is a table of user directories.
• When a user process refers to a file, the operating system searches only the user’s directory.
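A two-level directory can be sketched as a table of tables. The user names, file names and attribute values below are made up for illustration.

```python
# Master directory: user name -> that user's own directory.
# Each user directory: file name -> attributes (values here are hypothetical).
master_directory = {
    "user1": {"notes.txt": {"location": 40, "size": 3}},
    "user2": {"notes.txt": {"location": 97, "size": 1}},
}


def lookup(user, filename):
    """Search only the directory belonging to the given user."""
    user_directory = master_directory[user]
    return user_directory.get(filename)


entry_user1 = lookup("user1", "notes.txt")
entry_user2 = lookup("user2", "notes.txt")
missing = lookup("user2", "report.txt")
```

Both users can own a file called `notes.txt` without clashing, because each lookup is confined to one user's table.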
Tree-structured directories
• It’s an easy extension to allow users to create new directories (aka folders in GUI-speak) in their own directories.
• To refer to an arbitrary file in the tree-structure, we now need to specify a path from the root of the tree, e.g.
/user2/directory1/directory2/file
[Figure: directory tree rooted at /, with user directories user1, user2 and user3, and nested directories directory1 and directory2 containing files.]
File access in a directory hierarchy
• It would be laborious to have to type in the path name for each file.
• Instead, most systems provide a notion of the current directory as the default one to search.
• For executable files, some systems allow a user to specify a search path – a list of directories to be searched in order.
File access in a directory hierarchy
• Relative path names can also be used – these are interpreted relative to the current directory.
• Partitions are often thought of as the first branches in the tree.
• The syntax for specifying partitions is sometimes the same as for directory names.
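How a relative path name is interpreted against the current directory can be sketched with Python's `posixpath` module. The helper name `resolve` and the example paths are assumptions for illustration.

```python
import posixpath


def resolve(current_directory, path):
    """Turn a path name into an absolute one, as a UNIX-like system would."""
    if posixpath.isabs(path):
        # Absolute path names ignore the current directory.
        return posixpath.normpath(path)
    # Relative path names are interpreted from the current directory.
    return posixpath.normpath(posixpath.join(current_directory, path))


cwd = "/user2/directory1"
a = resolve(cwd, "directory2/file")
b = resolve(cwd, "../file")
c = resolve(cwd, "/user1/file")
```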
Directories as graphs
• One type of data-sharing is to allow two users access to the same file/directory.
• If we implement this by having two directories point to the same file/directory then the resulting structure is a graph.
• Such a graph should be kept acyclic; a cycle could make a directory search loop forever.
Shared directories
• Commonly implemented using links.
• A link is a pointer to an arbitrary file in the directory structure.
• In a symbolic link, the pointer is just a pathname.
• When a directory is searched and a link found, the O/S follows the pointer and uses the file pointed to.
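A symbolic link can be created and followed from a user program. This sketch assumes a POSIX-like system where `os.symlink` is permitted; the file names are hypothetical.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "shared.txt")
link = os.path.join(workdir, "link_to_shared")

with open(target, "w") as f:
    f.write("shared data")

os.symlink(target, link)            # the link stores only a pathname
stored_pointer = os.readlink(link)  # what the link actually holds
with open(link) as f:               # the O/S follows the pointer on open
    via_link = f.read()

os.unlink(link)
os.unlink(target)
os.rmdir(workdir)
```

Reading through the link returns the contents of the target file, while `os.readlink` shows that the link itself holds nothing but the path.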
Mounting
• To make a file system available to processes, it must be mounted.
• The operating system is given the name of the device to be mounted and a directory from which the file system will be accessible (e.g. /user1).
• The files in the mounted system will then be available as if they were files in that directory (e.g. /user1/newfile).
Implementing files: a file system
• So far, we have been describing the logical structure of the file system (files, directories, partitions).
• The operating system has to map this logical structure onto a storage device (typically, a disk).
• This is done by the file organisation module.
Implementing files: a file system
• We have already seen that the smallest unit of storage on a disk is a block.
• The file organisation module has to allocate blocks for the storage of files.
• A file is broken into logical blocks, to make the mapping to disk blocks easier to manage.
Disk allocation methods
• How should we set aside disk space for the files in a system to occupy?
• There are several allocation methods.
– Contiguous allocation
– Linked allocation
– File Allocation Table
– Indexed allocation
Contiguous allocation.
• Each file occupies a set of contiguous blocks on the disk.
Advantages?
• Sequential access?
– Good, because the next character to read is very close.
• Random access?
– Good, because you can just count the number of blocks.
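The "just count the number of blocks" step is a single piece of arithmetic. A sketch, assuming a 512-byte block size and a directory entry that records the start block and length; both are illustrative values.

```python
BLOCK_SIZE = 512  # assumed block size in bytes


def contiguous_lookup(start_block, length_blocks, byte_offset):
    """Map a byte offset in a contiguous file to (disk block, offset in block)."""
    logical_block = byte_offset // BLOCK_SIZE
    if logical_block >= length_blocks:
        raise ValueError("offset is past the end of the file")
    return start_block + logical_block, byte_offset % BLOCK_SIZE


# A file that starts at disk block 14 and is 3 blocks long.
block, within = contiguous_lookup(start_block=14, length_blocks=3, byte_offset=1300)
```

Byte 1300 falls in logical block 2, so it lives in disk block 14 + 2 = 16, at offset 276 within that block. No other block has to be read to find it.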
Disadvantages?
• External fragmentation.
• Notice that compaction is an option; but it needs to be done off-line.
• Internal fragmentation.
• Files can grow and shrink.
Linked Allocation
• Each file is a linked list of disk blocks.
• The directory contains a pointer to the first (and last) blocks of the file.
• Each block of the file contains a pointer to the next block.
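A toy model of linked allocation shows why reaching a block in the middle of a file is costly. The block numbers and the dictionary standing in for the disk are assumptions for illustration.

```python
# Toy disk: block number -> (data, pointer to next block or None).
disk = {
    9: (b"A", 16),
    16: (b"B", 1),
    1: (b"C", 10),
    10: (b"D", None),
}
directory_entry = {"first": 9, "last": 10}


def read_logical_block(first_block, n):
    """Return (data, number of disk reads) for logical block n of a file."""
    block = first_block
    reads = 0
    for _ in range(n):
        _, block = disk[block]  # must read a block to learn where the next is
        reads += 1
        if block is None:
            raise IndexError("file has fewer blocks than requested")
    data, _ = disk[block]
    return data, reads + 1


data, reads = read_logical_block(directory_entry["first"], 3)
```

Reaching logical block 3 takes four disk reads, because every earlier block has to be read just to find its successor.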
Advantages/disadvantages
• Advantages?
– No external fragmentation.
– Files can be arbitrarily big; no need to pre-allocate.
• Disadvantages?
– Can only be used effectively for sequential files.
– Pointers take up some space.
– Internal fragmentation.
– Reliability.
File Allocation Table (FAT)
• A variant on the linked allocation scheme, used in MS-DOS and OS/2.
• A table is created at the beginning of each partition, with an entry for each block in the partition.
• The directory entry for a file specifies the block number for the first block of the file.
• The value of the FAT entry for the first block will identify the block number of the next block in the file.
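Following a file's chain through the table can be sketched as below. The table size, the block numbers and the end-of-chain marker are assumptions for this example and do not reflect the on-disk FAT format.

```python
END_OF_FILE = -1  # assumed end-of-chain marker for this sketch

# fat[b] holds the number of the block that follows block b in its file.
fat = [None] * 20  # None marks a block that is not part of this file
fat[9] = 16
fat[16] = 1
fat[1] = 10
fat[10] = END_OF_FILE


def file_blocks(first_block):
    """List the disk blocks of a file, starting from its directory entry."""
    blocks = []
    block = first_block
    while block != END_OF_FILE:
        blocks.append(block)
        block = fat[block]  # a table lookup, not a read of the data block
    return blocks


chain = file_blocks(9)
```

The chain is the same as in linked allocation, but the pointers live in one table, so the data blocks themselves do not need to be read to follow it.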
Advantages/disadvantages
• Advantages?
– Same as linked. In addition, direct access is better supported, because chaining through the FAT is faster than chaining through a linked list.
• Disadvantages?
– Same as linked. Even more head seeks, in fact, unless the FAT is cached.
Indexed Allocation
• Each file has an index block, containing a table specifying the physical block for each logical block.
• The directory entry for a file contains the address of its index block.
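With an index block, finding the physical block for a byte offset is a single table lookup. A sketch, assuming a 512-byte block size and made-up block numbers.

```python
BLOCK_SIZE = 512  # assumed block size in bytes

# Index block for one file: entry i is the physical block of logical block i.
index_block = [9, 16, 1, 10, 25]


def indexed_lookup(index, byte_offset):
    """Map a byte offset to (physical block, offset within that block)."""
    logical_block = byte_offset // BLOCK_SIZE
    return index[logical_block], byte_offset % BLOCK_SIZE


physical, within = indexed_lookup(index_block, 3 * BLOCK_SIZE + 20)
```

Unlike linked allocation, no chain has to be followed: logical block 3 is found directly in entry 3 of the index.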
Advantages/disadvantages
• Advantages?
– Very easy direct access.
– No external fragmentation.
• Disadvantages?
– Wasted space (internal fragmentation, really).
– We have to allocate a large array for the index, because we don’t know how big the index needs to be.
– Also lots of head seeks, like linked allocation.
Extending a full index block
• UNIX’s index block (called an inode) combines direct indexing with multilevel indexing.
[Figure: a UNIX inode holding mode, owners, timestamps, size, block count and direct block pointers, plus single, double and triple indirect pointers that lead to data blocks.]
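The combination of direct and multilevel indexing in an inode can be sketched numerically. The figures used here (12 direct pointers, 4096-byte blocks, 4-byte block pointers) are assumptions for illustration; actual values vary between UNIX file systems.

```python
BLOCK_SIZE = 4096    # assumed block size in bytes
POINTER_SIZE = 4     # assumed size of one block pointer in bytes
DIRECT = 12          # assumed number of direct pointers in the inode
PER_BLOCK = BLOCK_SIZE // POINTER_SIZE  # pointers that fit in one index block


def indirection_level(logical_block):
    """Say which part of the inode is used to reach a given logical block."""
    limits = [
        ("direct", DIRECT),
        ("single indirect", PER_BLOCK),
        ("double indirect", PER_BLOCK ** 2),
        ("triple indirect", PER_BLOCK ** 3),
    ]
    for name, capacity in limits:
        if logical_block < capacity:
            return name
        logical_block -= capacity
    raise ValueError("block number exceeds what the inode can address")


# Total number of data blocks one inode can address under these assumptions.
max_blocks = DIRECT + PER_BLOCK + PER_BLOCK ** 2 + PER_BLOCK ** 3
```

Small files are reached through the direct pointers alone, with no extra index blocks; only large files pay for the extra levels of indirection.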