Disk Scheduling Algorithms in OS


    CHAPTER 1

    INTRODUCTION

    1.1 Scheduling

An operating system is a program that manages the hardware and software resources of a computer. It is the first thing that is loaded into memory when we turn on the computer. Without the operating system, each programmer would have to create a way in which a program displays text and graphics on the monitor, a way to send data to a printer, a way to read a disk file, and a way to deal with other programs. In the beginning, programmers needed a way to handle complex input/output operations. The evolution of computer programs and their complexity created new requirements. Because machines became more powerful, the time a program needed to run decreased; however, the time needed for handing off the equipment between different programs became significant, and this led to programs like DOS. The acronym DOS stands for Disk Operating System, which confirms that operating systems were originally made to handle complex input/output operations such as communicating with a variety of disk drives. Earlier computers were not as powerful as they are today. On early computer systems you could only run one program at a time; for instance, you could not be writing a paper and browsing the internet at the same time. Today's operating systems, however, are quite capable of handling not just two but many applications at the same time. In fact, a computer that cannot do this is considered useless by most computer users.

In order for a computer to handle multiple applications simultaneously, there must be an effective way of sharing the CPU. Several processes may be running at the same time, so there has to be some kind of order that allows each process to get its share of CPU time. An operating system must allocate computer resources among the potentially competing requirements of multiple processes. In the case of the processor, the resource to be allocated is execution time on the processor, and the means of allocation is scheduling. The scheduling function must be designed to satisfy a number of objectives, including fairness, lack of starvation of any particular process, efficient use of processor time, and low overhead. In addition, the scheduling function may need to take into account different levels of priority or real-time deadlines for the start or completion of certain processes. Over the years, scheduling has been the focus of intensive research, and many different algorithms have been implemented. Today, the emphasis in scheduling


research is on exploiting multiprocessor systems, particularly for multithreaded applications, and on real-time scheduling. In a multiprogramming system, multiple processes exist concurrently in main memory. Each process alternates between using a processor and waiting for some event to occur, such as the completion of an I/O operation. The processor or processors are kept busy by executing one process while the others wait; hence the key to multiprogramming is scheduling.

    1.2 Disk Scheduling

One of the responsibilities of the operating system is to use the hardware efficiently. For the disk drives, meeting this responsibility entails having a fast access time and high disk bandwidth. The access time has two major components:

1. The seek time
2. The rotational latency

The seek time is the time for the disk arm to move the heads to the cylinder containing the desired sector. The rotational latency is the additional time spent waiting for the disk to rotate the desired sector under the disk head.

The disk bandwidth is the total number of bytes transferred, divided by the total time between the first request for service and the completion of the last transfer. We can improve both the access time and the bandwidth by scheduling the servicing of disk I/O requests in a good order.

Whenever a process needs I/O to or from the disk, it issues a system call to the operating system. The request specifies several pieces of information:

• Whether this operation is input or output.
• What the disk address for the transfer is.
• What the memory address for the transfer is.
• What the number of bytes to be transferred is.

If the desired disk drive and controller are available, the request can be serviced immediately. If the drive or controller is busy, any new requests for service will be placed on the queue of pending requests for that drive. For a multiprogramming system with


many processes, the disk queue may often have several pending requests. Thus, when one request is completed, the operating system chooses which pending request to service next.

    CHAPTER 2

DISK SCHEDULING ALGORITHMS

2.1 Types of Scheduling Algorithms

Although there are other algorithms that reduce the seek time of all requests, we will concentrate on the following disk scheduling algorithms:

First Come First Serve (FCFS)
Shortest Seek Time First (SSTF)
Elevator (SCAN)
Circular SCAN (C-SCAN)
LOOK

These algorithms are not hard to understand, but they can confuse someone because they are so similar. What we are striving for by using these algorithms is to keep head movements (measured in tracks) as small as possible: the less the head has to move, the shorter the seek time will be.

    2.1.1 First Come First Serve (FCFS)

The simplest form of disk scheduling is, of course, the first-come, first-served (FCFS) algorithm. This algorithm is intrinsically fair, but it generally does not provide the fastest service. Consider, for example, a disk queue with requests for I/O to blocks on cylinders

98, 183, 37, 122, 14, 124, 65, 67

in that order. If the disk head is initially at cylinder 53, it will first move from 53 to 98, then to 183, 37, 122, 14, 124, and 65, and finally to 67, for a total head movement of 640 cylinders. This schedule is diagrammed in the figure below.


queue: 98, 183, 37, 122, 14, 124, 65, 67
head starts at 53

(head-movement diagram over cylinders 0-199)

2.1.1 First Come First Serve Scheduling
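The arithmetic behind the 640-cylinder figure is easy to check programmatically. The sketch below is our own illustration, not part of the original report (the function name fcfs_total_movement is ours); it simply sums the seek distances in arrival order:

    def fcfs_total_movement(head, requests):
        """Total head movement when requests are served in arrival order."""
        total = 0
        for cylinder in requests:
            total += abs(cylinder - head)  # distance for this seek
            head = cylinder                # the head is now at that cylinder
        return total

    # The example from the text: head at 53, queue 98, 183, 37, 122, 14, 124, 65, 67
    print(fcfs_total_movement(53, [98, 183, 37, 122, 14, 124, 65, 67]))  # -> 640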

2.1.2 Shortest Seek Time First Scheduling (SSTF)

It seems reasonable to service all the requests close to the current head position before moving the head far away to service other requests. This assumption is the basis for the shortest-seek-time-first (SSTF) algorithm. The SSTF algorithm selects the request with the minimum seek time from the current head position. Since seek time increases with the number of cylinders traversed, SSTF chooses the pending request closest to the current head position.

For our example request queue, the closest request to the initial head position (53) is at cylinder 65. Once we are at cylinder 65, the next closest request is at cylinder 67. From there, the request at cylinder 37 is closer than the one at 98, so 37 is served next, followed by 14, 98, 122, 124, and finally 183, for a total head movement of only 236 cylinders.


queue: 98, 183, 37, 122, 14, 124, 65, 67
head starts at 53

(head-movement diagram over cylinders 0-199)

2.1.2 Shortest Seek Time First Scheduling
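A minimal sketch of the greedy selection SSTF performs, assuming the whole queue is known up front (on a real drive requests arrive continuously, which is what makes SSTF prone to starving far-away requests). The function name is ours:

    def sstf_order(head, requests):
        """Service order chosen greedily: always the pending request nearest the head."""
        pending = list(requests)
        order = []
        while pending:
            nearest = min(pending, key=lambda c: abs(c - head))
            pending.remove(nearest)
            order.append(nearest)
            head = nearest
        return order

    # head at 53 -> [65, 67, 37, 14, 98, 122, 124, 183], total movement 236
    print(sstf_order(53, [98, 183, 37, 122, 14, 124, 65, 67]))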

    2.1.3 SCAN Scheduling

In the SCAN algorithm, the disk arm starts at one end of the disk and moves toward the other end, servicing requests as it reaches each cylinder, until it gets to the other end of the disk. At the other end, the direction of head movement is reversed, and servicing continues. The head continuously scans back and forth across the disk. We again use our example.

Before applying SCAN to schedule the requests on cylinders 98, 183, 37, 122, 14, 124, 65, and 67, we need to know the direction of head movement in addition to the head's current position. If the head is at 53 and moving toward 0, it services the requests at 37 and 14 on the way down; at cylinder 0 the arm reverses and moves toward the other end of the disk, servicing the requests at 65, 67, 98, 122, 124, and 183.


queue: 98, 183, 37, 122, 14, 124, 65, 67
head starts at 53

(head-movement diagram over cylinders 0-199)

2.1.3 SCAN Scheduling
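A sketch of the SCAN service order under the stated assumptions (head at 53, initially moving toward cylinder 0, on a 0-199 disk); the names are ours:

    def scan(head, requests, low=0):
        """SCAN with the head initially moving toward cylinder `low`.
        Returns the service order and the total head movement."""
        below = sorted(c for c in requests if c <= head)
        above = sorted(c for c in requests if c > head)
        order = below[::-1] + above          # sweep down first, then back up
        # The arm travels all the way to `low` before reversing direction.
        movement = (head - low) + ((max(above) - low) if above else 0)
        return order, movement

    # head at 53: order [37, 14, 65, 67, 98, 122, 124, 183], movement 53 + 183 = 236
    print(scan(53, [98, 183, 37, 122, 14, 124, 65, 67]))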

2.1.4 C-SCAN Scheduling

Circular SCAN (C-SCAN) scheduling is a variant of SCAN designed to provide a more uniform wait time. Like SCAN, C-SCAN moves the head from one end of the disk to the other, servicing requests along the way. When the head reaches the other end, however, it immediately returns to the beginning of the disk without servicing any requests on the return trip. The C-SCAN scheduling algorithm essentially treats the cylinders as a circular list that wraps around from the final cylinder to the first one.

queue: 98, 183, 37, 122, 14, 124, 65, 67
head starts at 53

(head-movement diagram over cylinders 0-199)

2.1.4 C-SCAN Scheduling
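The same example as a sketch, assuming the head sweeps toward the high end and then wraps around to the low end (names ours):

    def c_scan_order(head, requests):
        """C-SCAN: sweep toward the high end servicing requests, then jump back
        to the low end without servicing anything and sweep up again."""
        above = sorted(c for c in requests if c >= head)  # served on the way up
        below = sorted(c for c in requests if c < head)   # served after the wrap
        return above + below

    # head at 53: [65, 67, 98, 122, 124, 183, 14, 37]
    print(c_scan_order(53, [98, 183, 37, 122, 14, 124, 65, 67]))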

2.1.5 LOOK Scheduling

As we described them, both SCAN and C-SCAN move the disk arm across the full width of the disk. In practice, neither algorithm is implemented this way. More commonly, the arm goes only as far as the final request in each direction. Then, it reverses direction immediately, without going all the way to the end of the disk. These versions of SCAN and C-SCAN are called LOOK and C-LOOK scheduling, because they look for a request before continuing to move in a given direction.

queue: 98, 183, 37, 122, 14, 124, 65, 67
head starts at 53

(head-movement diagram over cylinders 0-199)

2.1.5 LOOK Scheduling
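A sketch of LOOK's service order, assuming the head is moving toward higher cylinder numbers; only the reversal point differs from SCAN (names ours):

    def look_order(head, requests, direction="up"):
        """LOOK: like SCAN, but reverse at the last request in each direction
        instead of travelling to the physical end of the disk."""
        below = sorted(c for c in requests if c <= head)
        above = sorted(c for c in requests if c > head)
        if direction == "up":
            return above + below[::-1]   # up to the last high request, then back
        return below[::-1] + above

    # head at 53 moving up: [65, 67, 98, 122, 124, 183, 37, 14]
    print(look_order(53, [98, 183, 37, 122, 14, 124, 65, 67]))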

2.2 Scheduling Criteria

Different CPU scheduling algorithms have different properties, and the choice of a particular algorithm may favor one class of processes over another. In choosing which algorithm to use in a particular situation, we must consider the properties of the various algorithms. Many criteria have been suggested for comparing CPU scheduling algorithms. Which characteristics are used for comparison can make a substantial difference in which algorithm is judged to be best. The criteria include the following:


• CPU utilization: We want to keep the CPU as busy as possible. Conceptually, CPU utilization can range from 0 to 100 percent. In a real system, it should range from 40 percent (for a lightly loaded system) to 90 percent (for a heavily used system).

• Throughput: If the CPU is busy executing processes, then work is being done. One measure of work is the number of processes that are completed per time unit, called throughput. For long processes, this rate may be one process per hour; for short transactions, it may be 10 processes per second.

• Turnaround time: From the point of view of a particular process, the important criterion is how long it takes to execute that process. The interval from the time of submission of a process to the time of completion is the turnaround time. Turnaround time is the sum of the periods spent waiting to get into memory, waiting in the ready queue, executing on the CPU, and doing I/O.

• Waiting time: The CPU scheduling algorithm does not affect the amount of time during which a process executes or does I/O; it affects only the amount of time that a process spends waiting in the ready queue. Waiting time is the sum of the periods spent waiting in the ready queue.

• Response time: In an interactive system, turnaround time may not be the best criterion. Often, a process can produce some output fairly early and can continue computing new results while previous results are being output to the user. Thus, another measure is the time from the submission of a request until the first response is produced. This measure, called response time, is the time it takes to start responding, not the time it takes to output the response. The turnaround time is generally limited by the speed of the output device.

It is desirable to maximize CPU utilization and throughput and to minimize turnaround time, waiting time, and response time. The commonly used criteria can also be categorized along two dimensions. First, we can make a distinction between user-oriented and system-oriented criteria. User-oriented criteria relate to the behavior of the system as perceived by the individual user or process. An example is response time, i.e., the elapsed time between the submission of a request and the moment the response begins to appear as output. This quantity is visible to the user and is naturally of interest to the user. Other criteria are system-oriented, i.e., the focus is on effective and efficient utilization of the processor. An example is throughput, which is the rate at which processes are completed. This is certainly a worthwhile measure of system performance and one that we would like to maximize. However, it focuses on system performance rather than the service provided to the user. Thus, throughput is of concern to a system administrator but not to the user population. Another dimension along which criteria can be classified is those that are performance related and those that are not. Performance-related criteria are quantitative and generally can be readily measured. Examples include response time and throughput. Criteria that are not performance related are either qualitative in nature or do not lend themselves readily to measurement and analysis. An example of such a criterion is predictability. These criteria are interdependent, and it is impossible to optimize all of them simultaneously; e.g., providing good response time may require a scheduling algorithm that switches between processes frequently. This increases the overhead of the system,


reducing throughput. Thus, the design of a scheduling policy involves compromising among competing requirements.

2.3 Performance Comparison

Clearly, the performance of various scheduling policies is a critical factor in the choice of a scheduling policy. However, it is impossible to make definitive comparisons, because relative performance will depend on a variety of factors, including the probability distribution of service times of the various processes, the efficiency of the scheduling and context-switching mechanisms, and the nature of the I/O demand and the performance of the I/O subsystem.

2.4 Selection of a Disk-Scheduling Algorithm

SSTF is common and has a natural appeal. SCAN and C-SCAN perform better for systems that place a heavy load on the disk. Performance depends on the number and types of requests. Requests for disk service can be influenced by the file-allocation method. The disk-scheduling algorithm should be written as a separate module of the operating system, allowing it to be replaced with a different algorithm if necessary. Either SSTF or LOOK is a reasonable choice for the default algorithm.
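Because the best choice depends on the workload, one practical approach is to replay a request trace through each policy and compare the total head movement. A toy harness along those lines, reusing the sketches from the previous sections (all names ours):

    def total_movement(head, order):
        """Sum of seek distances when cylinders are visited in the given order."""
        total = 0
        for cylinder in order:
            total += abs(cylinder - head)
            head = cylinder
        return total

    queue, head = [98, 183, 37, 122, 14, 124, 65, 67], 53
    for name, order in [("FCFS", queue),                  # arrival order
                        ("SSTF", sstf_order(head, queue)),
                        ("LOOK", look_order(head, queue))]:
        print(name, total_movement(head, order))          # 640, 236, 299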


CHAPTER 3

MASS STORAGE SYSTEMS

3.1 Overview of Mass Storage Structure

Magnetic disks provide the bulk of secondary storage in modern computers. Drives rotate at 60 to 200 times per second. The transfer rate is the rate at which data flow between the drive and the computer. Positioning time (random-access time) is the time to move the disk arm to the desired cylinder (seek time) plus the time for the desired sector to rotate under the disk head (rotational latency). Disks can be removable. A drive is attached to the computer via an I/O bus. Buses vary, including EIDE, ATA, SATA, USB, Fibre Channel, and SCSI. The host controller in the computer uses the bus to talk to the disk controller built into the drive or storage array.

3.1 Moving-head Disk Mechanism


3.1.1 Magnetic Disks

Traditional magnetic disks have the following basic structure: one or more platters in the form of disks covered with magnetic media. Hard disk platters are made of rigid metal, while "floppy" disks are made of more flexible plastic. Each platter has two working surfaces. Older hard disk drives would sometimes not use the very top or bottom surface of a stack of platters, as these surfaces were more susceptible to potential damage. Each working surface is divided into a number of concentric rings called tracks. The collection of all tracks that are the same distance from the edge of the platter (i.e. all tracks immediately above one another in the following diagram) is called a cylinder. Each track is further divided into sectors, traditionally containing 512 bytes of data each, although some modern disks occasionally use larger sector sizes. (Sectors also include a header and a trailer, including checksum information among other things. Larger sector sizes reduce the fraction of the disk consumed by headers and trailers, but increase internal fragmentation and the amount of disk that must be marked bad in the case of errors.) The data on a hard drive is read by read-write heads. The standard configuration uses one head per surface, each on a separate arm, controlled by a common arm assembly which moves all heads simultaneously from one cylinder to another. (Other configurations, including independent read-write heads, may speed up disk access, but involve serious technical difficulties.) The storage capacity of a traditional disk drive is equal to the number of heads (i.e. the number of working surfaces), times the number of tracks per surface, times the number of sectors per track, times the number of bytes per sector. A particular physical block of data is specified by providing the head-sector-cylinder number at which it is located. In operation the disk rotates at high speed, and the effective speed of the drive is measured by its overall transfer rate, including seek time and rotational latency as well as the electronic data transfer rate.
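The capacity formula above can be made concrete with a worked example. The geometry below is purely illustrative, not a real drive:

    # Purely illustrative geometry (not a real product):
    heads = 16                 # working surfaces, one head each
    tracks_per_surface = 10_000
    sectors_per_track = 63
    bytes_per_sector = 512

    capacity = heads * tracks_per_surface * sectors_per_track * bytes_per_sector
    print(capacity / 10**9, "GB")   # about 5.16 GB for this toy geometry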

Disk heads "fly" over the surface on a very thin cushion of air. If they should accidentally contact the disk, then a head crash occurs, which may or may not permanently damage the disk or even destroy it completely. For this reason it is normal to park the disk heads when turning a computer off, which means to move the heads off the disk or to an area of the disk where there is no data stored.

Floppy disks are normally removable. Hard drives can also be removable, and some are even hot-swappable, meaning they can be removed while the computer is running, and a new hard drive inserted in their place.

Disk drives are connected to the computer via a cable known as the I/O bus. Some of the common interface formats include Enhanced Integrated Drive Electronics (EIDE); Advanced Technology Attachment (ATA); Serial ATA (SATA); Universal Serial Bus (USB); Fibre Channel (FC); and Small Computer Systems Interface (SCSI).

The host controller is at the computer end of the I/O bus, and the disk controller is built into the disk itself. The CPU issues commands to the host controller via I/O ports. Data is transferred between the magnetic surface and the on-board cache by the disk controller, and then the data is transferred from that cache to the host controller and the motherboard memory at electronic speeds.

3.1.2 Solid-State Disks

As technologies improve and economics change, old technologies are often used in different ways. One example of this is the increasing use of solid state disks, or SSDs. SSDs use memory technology as a small fast hard disk. Specific implementations may use either flash memory or DRAM chips protected by a battery to sustain the information through power cycles. Because SSDs have no moving parts they are much faster than traditional hard drives, and certain problems such as the scheduling of disk accesses simply do not apply. However, SSDs also have their weaknesses: they are more expensive than hard drives, generally not as large, and may have shorter life spans. SSDs are especially useful as a high-speed cache of hard-disk information that must be accessed quickly. One example is to store filesystem metadata, e.g. directory and inode information, that must be accessed quickly and often. Another variation is a boot disk containing the OS and some application executables, but no vital user data. SSDs are also used in laptops to make them smaller, faster, and lighter. Because SSDs are so much faster than traditional hard disks, the throughput of the bus can become a limiting factor, causing some SSDs to be connected directly to the system PCI bus, for example.

3.1.3 Magnetic Tapes


Magnetic tapes were once used for common secondary storage before the days of hard disk drives, but today are used primarily for backups. Accessing a particular spot on a magnetic tape can be slow, but once reading or writing commences, access speeds are comparable to disk drives. Capacities of tape drives can range from 20 to 200 GB, and compression can double that capacity.

3.2 Disk Structure

Disk drives are addressed as large one-dimensional arrays of logical blocks, where the logical block is the smallest unit of transfer. The one-dimensional array of logical blocks is mapped onto the sectors of the disk sequentially. Sector 0 is the first sector of the first track on the outermost cylinder. Mapping proceeds in order through that track, then the rest of the tracks in that cylinder, and then through the rest of the cylinders from outermost to innermost. The traditional head-sector-cylinder (HSC) numbers are mapped to linear block addresses by numbering the first sector on the first head on the outermost track as sector 0. Numbering proceeds with the rest of the sectors on that same track, and then the rest of the tracks on the same cylinder, before proceeding through the rest of the cylinders to the center of the disk. In modern practice these linear block addresses are used in place of the HSC numbers for a variety of reasons. The linear length of tracks near the outer edge of the disk is much longer than for those tracks located near the center, and therefore it is possible to squeeze many more sectors onto outer tracks than onto inner ones. All disks have some bad sectors, and therefore disks maintain a few spare sectors that can be used in place of the bad ones. The mapping of spare sectors to bad sectors is managed internally by the disk controller. Modern hard drives can have thousands of cylinders, and hundreds of sectors per track on their outermost tracks. These numbers exceed the range of HSC numbers for many (older) operating systems, and therefore disks can be configured for any convenient combination of HSC values that falls within the total number of sectors physically on the drive. There is a limit to how closely packed individual bits can be placed on physical media, but that packing grows ever denser as technological advances are made.
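The HSC-to-linear-block mapping just described can be sketched as follows, assuming an idealized un-zoned geometry (the same number of sectors on every track) and the usual convention that sectors are numbered from 1 within a track. The function name is ours:

    def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
        """Map a head-sector-cylinder address to a linear block address,
        assuming an un-zoned disk (same sector count on every track)."""
        return ((cylinder * heads_per_cylinder + head) * sectors_per_track
                + (sector - 1))          # sectors number from 1 within a track

    print(chs_to_lba(0, 0, 1, 16, 63))   # -> 0    (first sector on the disk)
    print(chs_to_lba(1, 0, 1, 16, 63))   # -> 1008 (first sector of cylinder 1)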

• Modern disks pack many more sectors into outer cylinders than inner ones, using one of two approaches:

  o With Constant Linear Velocity (CLV), the density of bits is uniform from cylinder to cylinder. Because there are more sectors in outer cylinders, the disk spins slower when reading those cylinders, causing the rate of bits passing under the read-write head to remain constant. This is the approach used by modern CDs and DVDs.

  o With Constant Angular Velocity (CAV), the disk rotates at a constant angular speed, with the bit density decreasing on outer cylinders. (These disks would have a constant number of sectors per track on all cylinders.)


3.3 Host-Attached Storage

Local disks are accessed through I/O ports as described earlier. The most common interfaces are IDE or ATA, each of which allows up to two drives per host controller. SATA is similar, with simpler cabling. High-end workstations or other systems in need of a larger number of disks typically use SCSI disks: the SCSI standard supports up to 16 targets on each SCSI bus, one of which is generally the host adapter and the other 15 of which can be disk or tape drives. A SCSI target is usually a single drive, but the standard also supports up to 8 logical units within each target. These would generally be used for accessing individual disks within a RAID array. The SCSI standard also supports multiple host adapters in a single computer, i.e. multiple SCSI buses. Modern advancements in SCSI include "fast" and "wide" versions, as well as newer revisions of the standard. SCSI cables may have either 50 or 68 conductors. SCSI devices may be external as well as internal. FC is a high-speed serial architecture that can operate over optical fiber or four-conductor copper wires, and has two variants. A large switched fabric has a 24-bit address space; this variant allows multiple devices and multiple hosts to interconnect, forming the basis for the storage-area networks (SANs) discussed in a later section. The arbitrated loop, FC-AL, can address up to 126 devices (drives and controllers).

3.4 Network-Attached Storage

Network-attached storage connects storage devices to computers using a remote procedure call (RPC) interface, typically with something like NFS filesystem mounts. This is convenient for allowing several computers in a group common access and naming conventions for shared storage. NAS can be implemented using SCSI cabling, or with iSCSI, which uses Internet protocols and standard network connections, allowing long-distance remote access to shared files. NAS allows computers to easily share data storage, but it tends to be less efficient than standard host-attached storage.

3.4 Network-attached storage


    -.4 S#!")ge A"e) Ne#3!"k 

    A %torageArea 4etwork, %A4, connects computers and storage devices in a

    network, using storage protocols instead of network protocols. $ne advantage of this is

    that storage access does not tie up regular networking bandwidth. %A4 is very flexibleand dynamic, allowing hosts and devices to attach and detach on the fly. %A4 is also

    controllable, allowing restricted access to certain hosts and devices.

    ;.? %toragearea network 

3.6 Disk Formatting

Before a disk can be used, it has to be low-level formatted, which means laying down all of the headers and trailers marking the beginning and end of each sector. Included in the header and trailer are the linear sector numbers and error-correcting codes (ECC), which allow damaged sectors not only to be detected but, in many cases, the damaged data to be recovered (depending on the extent of the damage). Sector sizes are traditionally 512 bytes, but may be larger, particularly in larger drives. ECC calculation is performed with every disk read or write, and if damage is detected but the data is recoverable, then a soft error has occurred. Soft errors are generally handled by the on-board disk controller, and never seen by the OS. Once the disk is low-level formatted, the next step is to partition the drive into one or more separate partitions. This step must be completed even if the disk is to be used as a single large partition, so that the partition table can be written to the beginning of the disk. After partitioning, the filesystems must be logically formatted, which involves laying down the master directory information (FAT table or inode structure), initializing free lists, and creating at least the root directory of the filesystem. (Disk partitions which are to be used as raw devices are not logically formatted. This saves the overhead and disk space of the filesystem structure, but requires that the application program manage its own disk storage requirements.)

3.7 Boot Block

Computer ROM contains a bootstrap program (OS independent) with just enough code to find the first sector on the first hard drive on the first controller, load that sector into memory, and transfer control to it. (The ROM bootstrap program may look in floppy and/or CD drives before accessing the hard drive, and is smart enough to recognize whether it has found valid boot code or not.) The first sector on the hard drive is known as the Master Boot Record (MBR), and contains a very small amount of code in addition to the partition table. The partition table documents how the disk is partitioned into logical disks, and indicates specifically which partition is the active or boot partition. The boot program then looks to the active partition to find an operating system, possibly loading up a slightly larger / more advanced boot program along the way. In a dual-boot (or larger multi-boot) system, the user may be given a choice of which operating system to boot, with a default action to be taken in the event of no response within some time frame. Once the kernel is found by the boot program, it is loaded into memory and then control is transferred over to the OS. The kernel will normally continue the boot process by initializing all important kernel data structures, launching important system services, and finally providing one or more login prompts. Boot options at this stage may include single-user (a.k.a. maintenance or safe) modes, in which very few system services are started. These modes are designed for system administrators to repair problems or otherwise maintain the system.

3.7 Boot Block


3.8 Bad Blocks

No disk can be manufactured to 100% perfection, and all physical objects wear out over time. For these reasons all disks are shipped with a few bad blocks, and additional blocks can be expected to go bad slowly over time. If a large number of blocks go bad then the entire disk will need to be replaced, but a few here and there can be handled through other means. In the old days, bad blocks had to be checked for manually. Formatting of the disk or running certain disk-analysis tools would identify bad blocks, and attempt to read the data off of them one last time through repeated tries. Then the bad blocks would be mapped out and taken out of future service. Sometimes the data could be recovered, and sometimes it was lost forever. (Disk analysis tools could be either destructive or non-destructive.) Modern disk controllers make much better use of the error-correcting codes, so that bad blocks can be detected earlier and the data usually recovered. (Recall that blocks are tested with every write as well as with every read, so often errors can be detected before the write operation is complete, and the data simply written to a different sector instead.) Note that remapping of sectors from their normal linear progression can throw off the disk scheduling optimization of the OS, especially if the replacement sector is physically far away from the sector it is replacing. For this reason most disks normally keep a few spare sectors on each cylinder, as well as at least one spare cylinder. Whenever possible a bad sector will be mapped to another sector on the same cylinder, or at least a cylinder as close as possible. Sector slipping may also be performed, in which all sectors between the bad sector and the replacement sector are moved down by one, so that the linear progression of sector numbers can be maintained. If the data on a bad block cannot be recovered, then a hard error has occurred, which requires replacing the affected file(s) from backups, or rebuilding them from scratch.


CHAPTER 4

RAID

4.1 RAID Structure

The general idea behind RAID is to employ a group of hard drives together with some form of duplication, either to increase reliability or to speed up operations (or sometimes both). RAID originally stood for Redundant Array of Inexpensive Disks, and was designed to use a bunch of cheap small disks in place of one or two larger, more expensive ones. Today RAID systems employ large, possibly expensive disks as their components, switching the definition to Independent disks.

4.2 Improvement of Reliability via Redundancy

The more disks a system has, the greater the likelihood that one of them will go bad at any given time. Hence increasing the number of disks on a system actually decreases the Mean Time To Failure (MTTF) of the system. If, however, the same data were copied onto multiple disks, then the data would not be lost unless both copies of the data were damaged simultaneously, which is a MUCH lower probability than for a single disk going bad. More specifically, the second disk would have to go bad before the first disk was repaired, which brings the Mean Time To Repair (MTTR) into play. For example, if two disks were involved, each with a MTTF of 100,000 hours and a MTTR of 10 hours, then the Mean Time to Data Loss would be 100,000^2 / (2 × 10) = 500 × 10^6 hours, or about 57,000 years.
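As a check on that arithmetic, the mirrored-pair mean time to data loss (MTTF^2 / (2 × MTTR), assuming independent failures and a constant failure rate) can be computed directly:

    mttf = 100_000            # mean time to failure of one drive, in hours
    mttr = 10                 # mean time to repair, in hours

    # Data is lost only if the second drive fails during the first one's repair.
    mttdl = mttf ** 2 / (2 * mttr)
    print(mttdl)              # 500,000,000 hours = 500 * 10^6
    print(mttdl / (24 * 365)) # roughly 57,000 years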


4.3 Improvement in Performance via Parallelism

There is also a performance benefit to mirroring, particularly with respect to reads. Since every block of data is duplicated on multiple disks, read operations can be satisfied from any available copy, and multiple disks can be reading different data blocks simultaneously in parallel. (Writes could possibly be sped up as well through careful scheduling algorithms, but it would be complicated in practice.) Another way of improving disk access time is with striping, which basically means spreading data out across multiple disks that can be accessed simultaneously. With bit-level striping the bits of each byte are striped across multiple disks. For example, if 8 disks were involved, then each 8-bit byte would be read in parallel by 8 heads on separate disks. A single disk read would access 8 × 512 bytes = 4K worth of data in the time normally required to read 512 bytes. Similarly, if 4 disks were involved, then two bits of each byte could be stored on each disk, for 2K worth of disk access per read or write operation. Block-level striping spreads a filesystem across multiple disks on a block-by-block basis, so if block N were located on disk 0, then block N + 1 would be on disk 1, and so on. This is particularly useful when filesystems are accessed in clusters of physical blocks. Other striping possibilities exist, with block-level striping being the most common.
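Block-level striping's placement rule is just modular arithmetic; a short sketch (names ours):

    def block_location(block, ndisks):
        """Block-level striping: logical block N lives on disk N mod d,
        at offset N div d within that disk."""
        return block % ndisks, block // ndisks

    # With 4 disks, blocks 0..7 land on disks 0, 1, 2, 3, 0, 1, 2, 3.
    print([block_location(b, 4) for b in range(8)])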

4.4 RAID Levels

Mirroring provides reliability but is expensive; striping improves performance, but does not improve reliability. Accordingly, there are a number of different schemes that combine the principles of mirroring and striping in different ways, in order to balance reliability versus performance versus cost. These are described by different RAID levels, as follows:

Raid Level 0 - This level includes striping only, with no mirroring.

Raid Level 1 - This level includes mirroring only, no striping.

Raid Level 2 - This level stores error-correcting codes on additional disks, allowing for any damaged data to be reconstructed by subtraction from the remaining undamaged data. Note that this scheme requires only three extra disks to protect 4 disks worth of data, as opposed to full mirroring. (The number of disks required is a function of the error-correcting algorithms, and the means by which the particular bad bit(s) is (are) identified.)


Raid Level 3 - This level is similar to level 2, except that it takes advantage of the fact that each disk is still doing its own error detection, so that when an error occurs there is no question about which disk in the array has the bad data. As a result a single parity bit is all that is needed to recover the lost data from an array of disks. Level 3 also includes striping, which improves performance. The downside of the parity approach is that every disk must take part in every disk access, and the parity bits must be constantly calculated and checked, reducing performance. Hardware-level parity calculations and NVRAM cache can help with both of those issues. In practice level 3 is greatly preferred over level 2.
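The parity trick can be illustrated with bytewise XOR: because the drive's own error detection tells us which disk failed, XOR-ing the surviving data blocks with the parity block reconstructs the lost contents. A sketch, not the report's code (names ours):

    def xor_blocks(blocks):
        """Bytewise XOR of equally sized blocks (the parity computation)."""
        out = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                out[i] ^= byte
        return bytes(out)

    data = [b"AAAA", b"BBBB", b"CCCC"]          # contents of three data disks
    parity = xor_blocks(data)                   # stored on the parity disk

    # Disk 1 fails. XOR of the survivors and the parity recovers its data.
    recovered = xor_blocks([data[0], data[2], parity])
    print(recovered == data[1])                 # True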

Raid Level 4 - This level is similar to level 3, employing block-level striping instead of bit-level striping. The benefits are that multiple blocks can be read independently, and changes to a block only require writing two blocks (data and parity) rather than involving all disks. Note that new disks can be added seamlessly to the system, provided they are initialized to all zeros, as this does not affect the parity results.

Raid Level 5 - This level is similar to level 4, except the parity blocks are distributed over all disks, thereby more evenly balancing the load on the system. For any given block on the disk(s), one of the disks will hold the parity information for that block and the other N-1 disks will hold the data. Note that the same disk cannot hold both data and parity for the same block, as both would be lost in the event of a disk crash.

Raid Level 6 - This level extends raid level 5 by storing multiple bits of error-recovery codes (such as the Reed-Solomon codes) for each bit position of data, rather than a single parity bit. In the example shown below, 2 bits of ECC are stored for every 4 bits of data, allowing data recovery in the face of up to two simultaneous disk failures. Note that this still involves only a 50% increase in storage needs, as opposed to 100% for simple mirroring, which could only tolerate a single disk failure.


4.4 (a) RAID levels

There are also two RAID levels which combine RAID levels 0 and 1 (striping and mirroring) in different combinations, designed to provide both performance and reliability at the expense of increased cost.

RAID level 0 + 1 - Disks are first striped, and then the striped sets are mirrored to another set. This level generally provides better performance than RAID level 5.

RAID level 1 + 0 - Disks are mirrored in pairs, and then the mirrored pairs are striped. The storage capacity, performance, etc. are all the same, but there is an advantage to this approach in the event of multiple disk failures, as illustrated below.


In diagram (a) below, the 8 disks have been divided into two sets of four, each of which is striped, and then one stripe set is used to mirror the other set. If a single disk fails, it wipes out the entire stripe set, but the system can keep on functioning using the remaining set. However, if a second disk from the other stripe set now fails, then the entire system is lost, as a result of two disk failures. In diagram (b), the same 8 disks are divided into four sets of two, each of which is mirrored, and then the file system is striped across the four sets of mirrored disks. If a single disk fails, then that mirror set is reduced to a single disk, but the system rolls on, and the other three mirror sets continue mirroring. Now if a second disk fails (one that is not the mirror of the already failed disk), then another one of the mirror sets is reduced to a single disk, but the system can continue without data loss. In fact the second arrangement could handle as many as four simultaneously failed disks, as long as no two of them are from the same mirror pair.

4.4 (b) RAID 0 + 1 and 1 + 0


4.5 Problems with RAID

RAID protects against physical errors, but not against any number of bugs or other errors that could write erroneous data. ZFS adds an extra level of protection by including data block checksums in all inodes along with the pointers to the data blocks. If data are mirrored and one copy has the correct checksum and the other does not, then the data with the bad checksum will be replaced with a copy of the data with the good checksum. This increases reliability greatly over RAID alone, at the cost of a performance hit that is acceptable because ZFS is so fast to begin with. Another problem with traditional filesystems is that their sizes are fixed, and relatively difficult to change. Where RAID sets are involved it becomes even harder to adjust filesystem sizes, because a filesystem cannot span across multiple filesystems. ZFS solves these problems by pooling RAID sets, and by dynamically allocating space to filesystems as needed. Filesystem sizes can be limited by quotas, and space can also be reserved to guarantee that a filesystem will be able to grow later, but these parameters can be changed at any time by the filesystem's owner. Otherwise filesystems grow and shrink dynamically as needed.


CHAPTER 5

TERTIARY STORAGE

5.1 Tertiary Storage Structure

Primary storage refers to computer memory chips; secondary storage refers to fixed-disk storage systems (hard drives); and tertiary storage refers to removable media, such as tape drives, CDs, DVDs, and to a lesser extent floppies, thumb drives, and other detachable devices. Tertiary storage is typically characterized by large capacity, low cost per MB, and slow access times, although there are exceptions in any of these categories. Tertiary storage is typically used for backups and for long-term archival storage of completed work. Another common use for tertiary storage is to swap large, little-used files (or groups of files) off of the hard drive, and then swap them back in as needed, in a fashion similar to secondary storage providing swap space for primary storage.

5.2 Removable Disks

Removable magnetic disks (e.g. floppies) can be nearly as fast as hard drives, but are at greater risk for damage due to scratches. Variations of removable magnetic disks up to a GB or more in capacity have been developed (hot-swappable hard drives). A magneto-optical disk uses a magnetic disk covered in a clear plastic coating that protects the surface. The heads sit a considerable distance away from the magnetic surface, and as a result do not have enough magnetic strength to switch bits at normal room temperature. For writing, a laser is used to heat up a specific spot on the disk, to a temperature at which the weak magnetic field of the write head is able to flip the bits. For reading, a laser is shined at the disk, and the Kerr effect causes the polarization of the light to become rotated either clockwise or counterclockwise, depending on the orientation of the magnetic field. Optical disks do not use magnetism at all, but instead use special materials that can be altered (by lasers) to have relatively light or dark spots. For example, the phase-change disk has a material that can be frozen into either a crystalline or an amorphous state, the latter of which is less transparent and reflects less light when a laser is bounced off a reflective surface under the material. Three laser powers are used with phase-change disks: (1) a low-power laser reads the disk without affecting the material; (2) a medium power erases the disk, by melting and refreezing the medium into the crystalline state; and (3) a high power writes to the disk by melting the medium and refreezing it into the amorphous state. The most common examples of these disks are rewritable CD-RWs and DVD-RWs. An alternative to the disks described above are Write-Once Read-Many (WORM) drives. The original version of WORM drives involved a thin layer of aluminum sandwiched between two protective layers of glass or plastic. Holes were burned in the aluminum to write bits. Because the holes could not be filled back in, there was no way to rewrite to the disk. (Although data could be erased by burning more holes.) WORM drives have important legal ramifications for data that must


be stored for a very long time and must be provable in court as unaltered since it was originally written (such as long-term storage of medical records). Modern CD-R and DVD-R disks are examples of WORM drives that use organic polymer inks instead of an aluminum layer. Read-only disks are similar to WORM disks, except the bits are pressed onto the disk at the factory, rather than being burned on one by one.

5.3 Tapes

Tape drives typically cost more than disk drives, but the cost per MB of the tapes themselves is lower. Tapes are typically used today for backups, and for enormous volumes of data stored by certain scientific establishments (e.g. NASA's archive of space probe and satellite imagery, which is currently being downloaded from numerous sources faster than anyone can actually look at it). Robotic tape changers move tapes from drives to archival tape libraries upon demand.

    4./ O+e")#ing0S:s#e$ Su++!"#

    The $% must provide support for tertiary storage as removable media, including

    the support to transfer data between different systems

5.4.1 Application Interface

File systems are typically not stored on tapes. (It might be technically possible, but it is impractical.) Tapes are also not low-level formatted, and do not use fixed-length blocks. Rather, data is written to tapes in variable-length blocks as needed. Tapes are normally accessed as raw devices, requiring each application to determine how the data is to be stored and read back. Issues such as header contents and ASCII versus binary encoding (and byte ordering) are generally application specific. Basic operations supported for tapes include locate(), read(), write(), and read_position(). (Because of variable-length writes), writing to a tape erases all data that follows that point on the tape. Writing to a tape places the End of Tape (EOT) marker at the end of the data written. It is not possible to locate() to any spot past the EOT marker.
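These append-and-truncate semantics can be modeled in a few lines. The class below is a hypothetical toy, not a real tape driver API:

    class Tape:
        """Toy model of the raw-tape semantics described above (hypothetical)."""

        def __init__(self):
            self.blocks = []          # variable-length blocks; EOT follows the last
            self.pos = 0

        def locate(self, n):
            if n > len(self.blocks):  # cannot seek past the EOT marker
                raise ValueError("cannot locate past EOT")
            self.pos = n

        def read(self):
            block = self.blocks[self.pos]
            self.pos += 1
            return block

        def write(self, data):
            # A write erases everything after this point and moves EOT
            # to just past the block written.
            del self.blocks[self.pos:]
            self.blocks.append(data)
            self.pos += 1

        def read_position(self):
            return self.pos

    t = Tape()
    t.write(b"block 0"); t.write(b"block 1"); t.write(b"block 2")
    t.locate(1)
    t.write(b"replacement")                    # blocks 1 and 2 are gone
    print(t.read_position(), len(t.blocks))    # 2 2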

5.4.2 File Naming

File naming conventions for removable media are not entirely unique or specific, nor are they necessarily consistent between different systems. (Two removable disks may contain files with the same name, and there is no clear way for the naming system to distinguish between them.) Fortunately, music CDs have a common format, readable by all systems. Data CDs and DVDs have only a few format choices, making it easy for a system to support all known formats.


5.4.3 Hierarchical Storage Management

Hierarchical storage involves extending file systems out onto tertiary storage, swapping files from hard drives to tapes in much the same manner as data blocks are swapped from memory to hard drives. A placeholder is generally left on the hard drive, storing information about the particular tape (or other removable medium) to which the file has been swapped out. A robotic system transfers data to and from tertiary storage as needed, generally automatically upon demand of the file(s) involved.

5.5 Performance Issues

There are certain performance issues to be considered, namely speed, reliability, and cost.

5.5.1 Speed

Sustained bandwidth is the rate of data transfer during a large file transfer, once the proper tape is loaded and the file located. Effective bandwidth is the effective overall rate of data transfer, including any overhead necessary to load the proper tape and find the file on the tape. Access latency is all of the accumulated waiting time before a file can actually be read from tape. This includes the time it takes to find the file on the tape, the time to load the tape from the tape library, and the time spent waiting in the queue for the tape drive to become available. Clearly tertiary storage access is much slower than secondary access, although removable disks (e.g. a CD jukebox) have somewhat faster access than a tape library.

5.5.2 Reliability

Fixed hard drives are generally more reliable than removable drives, because they are less susceptible to the environment. Optical disks are generally more reliable than magnetic media. A fixed hard drive crash can destroy all data, whereas an optical drive or tape drive failure will often not harm the data media (and certainly can't damage any media not in the drive at the time of the failure). Tape drives are mechanical devices, and can wear out tapes over time (as the tape head is generally in much closer physical contact with the tape than disk heads are with platters). Some drives may only be able to read tapes a few times, whereas other drives may be able to reuse the same tapes millions of times. Backup tapes should be read after writing, to verify that the backup tape is readable. Long-term tape storage can cause degradation, as magnetic fields "drift" from one layer of tape to the adjacent layers. Periodic fast-forwarding and rewinding of tapes can help, by changing which section of tape lies against which other layers.

5.5.3 Cost

The cost per megabyte for removable media is its strongest selling feature, particularly as the amount of storage involved (i.e. the number of tapes, CDs, etc.) increases. However, the cost per megabyte for hard drives has dropped more rapidly over the years than the cost of removable media, such that the most cost-effective backup solution for many systems today is simply an additional (external) hard drive.


CHAPTER 6

CONCLUSION

Disk scheduling algorithms play an important role in operating systems. By reordering pending requests, they reduce the seek distance and hence the seek time, providing fast access to the disk. Performance depends on the number and types of requests; SCAN and C-SCAN are suited to systems that place a heavy load on the disk, while SSTF and LOOK are common choices for the default algorithm. Disk scheduling algorithms are used to allocate service to the I/O requests on the disk and to improve its performance.
