

SMD149 - Operating Systems - Disk Scheduling

Roland Parviainen

November 18, 2005

Outline

Introduction
Disk scheduling
Other methods
RAID

Introduction

Processor and memory speeds increase faster than secondary storage
Hard drive speeds improve more slowly than their capacity
Long service delays for I/O-bound processes
Need optimizations

Hard drive history

Important milestones
Punch cards and paper tape
1951 - Magnetic tape - sequential access
1956 - First commercial hard disk, the IBM 350 RAMAC disk drive, 5 megabytes
1973 - IBM introduced the 3340 "Winchester" disk system (two 30 MB spindles)
  First to use a sealed head/disk assembly (HDA)
  "Winchester" was used to describe hard disks until the 90s
1980 - First 5.25-inch Winchester drive, the Seagate ST-506, 5 megabytes
1991 - 100 megabyte hard drive
1995 - 2 gigabyte hard drive
1997 - 10 gigabyte hard drive
2005 - 500 gigabyte hard drive

Characteristics of Moving-head Disk Storage

Variable access speed
  Depends on the location of the data and the position of the read-write head
Magnetic disks, platters
  Rotating spindle (thousands of RPM)
Read-write head
  Attached to an actuator (boom, moving arm assembly)
  Head is above a circular track
Vertical set of tracks: cylinder
Seeking: moving the arm to a cylinder


Accessing data

To access a particular record:
  Seek operation: move the arm to the correct cylinder (seek time)
  Rotate the disk so the data is under the head (rotational latency time)
  Data spins by the head (transmission time)
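The three delay components above add up to the service time for a request. A minimal sketch; the drive parameters in the example are hypothetical:

```python
def access_time_ms(seek_ms, rpm, transfer_mb_s, request_kb):
    # average rotational latency: half a revolution
    rotational_ms = 0.5 * 60_000 / rpm
    # transmission time for the requested data
    transfer_ms = request_kb / 1024 / transfer_mb_s * 1000
    return seek_ms + rotational_ms + transfer_ms

# e.g. a 7200 rpm drive, 8 ms average seek, 80 MB/s transfer, 4 KB request
print(round(access_time_ms(8, 7200, 80, 4), 1))  # about 12.2 ms
```

Note that seek and rotational latency dominate: the actual transmission of 4 KB takes well under a tenth of a millisecond.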


Data

Rotational speed: 4200-15000 rpm
Number of I/O operations per second: around 50 random or 100 sequential OPS
Transfer rate
  Inner zone: from 44.2 MB/sec to 74.5 MB/sec
  Outer zone: from 74.0 MB/sec to 111.4 MB/sec
Random access time: from 5 ms to 15 ms

Why disk scheduling?

Processes generate requests simultaneously
Early systems: FCFS, First Come, First Served
  Fair
  High request rate - long waiting times
  Random seek pattern
  Arm might move from one end to the other
  Better to reorder requests?
Disk scheduling
  Looks at physical position of requested records
  Avoids mechanical motion
  Seek optimization, rotational optimization

Strategies

Criteria
  Throughput
  Mean response time
  Variance of response time


FCFS
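FCFS is pure head-movement accounting: service requests in arrival order and pay whatever seeks result. A sketch, using a hypothetical textbook-style queue and start cylinder:

```python
def fcfs(requests, head):
    """Service requests in arrival order; return total head movement."""
    total = 0
    for cyl in requests:
        total += abs(cyl - head)
        head = cyl
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs(queue, head=53))  # 640 cylinders of movement
```

The arm crosses the disk repeatedly because arrival order ignores position entirely.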

SSTF

Shortest Seek Time First
Service request closest to read-write head
Advantages
  Higher throughput and lower response times than FCFS
  Reasonable solution for batch processing systems
Disadvantages
  Does not ensure fairness
  Possibility of indefinite postponement
  High variance of response times
  Response time generally unacceptable for interactive systems
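SSTF's greedy choice can be sketched directly; the queue and start cylinder below are hypothetical textbook-style values:

```python
def sstf(requests, head):
    """Repeatedly service the pending request closest to the head."""
    pending = list(requests)
    order, total = [], 0
    while pending:
        nxt = min(pending, key=lambda c: abs(c - head))
        pending.remove(nxt)
        total += abs(nxt - head)
        head = nxt
        order.append(nxt)
    return order, total

order, total = sstf([98, 183, 37, 122, 14, 124, 65, 67], head=53)
print(order, total)  # nearby cylinders serviced first; 236 total movement
```

The greedy choice also shows why starvation happens: a distant request keeps losing to a steady stream of nearby arrivals.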


SCAN

Shortest seek time in preferred direction
  Aims to reduce unfairness and variance of SSTF response times
  Does not change direction until edge of disk reached
  Similar characteristics to SSTF
  Indefinite postponement still possible
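One sweep of SCAN can be sketched as follows, here assuming the arm initially moves toward cylinder 0 and that the disk has 200 cylinders; the queue is hypothetical:

```python
def scan(requests, head):
    """SCAN, initially moving toward cylinder 0, then sweeping back up.
    The arm travels all the way to the edge before reversing."""
    down = sorted((c for c in requests if c <= head), reverse=True)
    up = sorted(c for c in requests if c > head)
    total = head + (up[-1] if up else 0)  # head -> 0 -> highest request
    return down + up, total

print(scan([98, 183, 37, 122, 14, 124, 65, 67], head=53))
```

Requests below the head are serviced on the way down, the rest on the return sweep.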


C-SCAN

Similar to SCAN, but at the end of an inward sweep the disk arm jumps (without servicing requests) to the outermost cylinder
Further reduces variance of response times at the expense of throughput and mean response times
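A sketch of C-SCAN under the same assumptions (200-cylinder disk, hypothetical queue); note that whether the return jump counts toward head movement is a convention that varies between texts — here it is counted as a full-length seek:

```python
def c_scan(requests, head, max_cyl=199):
    """C-SCAN: sweep upward servicing requests, jump back to cylinder 0
    without servicing anything, then sweep upward again."""
    up = sorted(c for c in requests if c >= head)
    low = sorted(c for c in requests if c < head)
    total = (max_cyl - head) + max_cyl + (low[-1] if low else 0)
    return up + low, total

print(c_scan([98, 183, 37, 122, 14, 124, 65, 67], head=53))
```

Every request now sees the arm approach from the same direction, which is what flattens the response-time variance.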


FSCAN and N-Step SCAN

Group requests into batches
FSCAN: "freeze" the disk request queue periodically, service only those requests in the queue at that time
N-Step SCAN: service only the first N requests in the queue at a time
  N = 1: FCFS, N = infinite: SCAN
Advantages
  Both strategies prevent indefinite postponement
  Both reduce variance of response times compared to SCAN


LOOK and C-LOOK

LOOK
  Improvement on SCAN scheduling
  Only performs sweeps large enough to service all requests
  Does not move the disk arm to the outer edges of the disk if no requests for those regions are pending
  Improves efficiency by avoiding unnecessary seek operations
  High throughput
C-LOOK
  Improvement on C-SCAN scheduling
  Combination of LOOK and C-SCAN
  Lower variance of response times than LOOK, at the expense of throughput
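C-LOOK combines both ideas: sweep in one direction like C-SCAN, but only as far as the last pending request like LOOK. A sketch with a hypothetical queue:

```python
def c_look(requests, head):
    """C-LOOK: sweep upward only as far as the highest request, then
    jump directly to the lowest pending request and sweep up again."""
    up = sorted(c for c in requests if c >= head)
    low = sorted(c for c in requests if c < head)
    order, total, pos = up + low, 0, head
    for c in order:
        total += abs(c - pos)
        pos = c
    return order, total

print(c_look([98, 183, 37, 122, 14, 124, 65, 67], head=53))
```

The saving over C-SCAN comes entirely from never visiting cylinders beyond the outermost pending requests.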


Summary

Rotational Optimization

Seek time formerly dominated performance concerns
Seek times and rotational latency are now the same order of magnitude
Optimization by reducing rotational latency?
  Important when accessing small pieces of data distributed throughout the disk surfaces

SLTF

Shortest Latency Time First
On a given cylinder, service the request with the shortest rotational latency first
Easy to implement
Achieves near-optimal performance for rotational latency


SPTF and SATF

Shortest Position Time First
  Positioning time: sum of seek time and rotational latency
  SPTF first services the request with the shortest positioning time
  Yields good performance
  Can indefinitely postpone requests
Shortest Access Time First
  Access time: positioning time plus transmission time
  High throughput
  Again, possible to indefinitely postpone requests
Both SPTF and SATF can implement LOOK to improve performance
Weakness
  Both SPTF and SATF require knowledge of disk performance characteristics, which might not be readily available
Increase rotational speed?

Hard drive geometry

Hard drives sometimes report wrong geometry
  Error correction, spare sectors, etc.
Sometimes the true geometry is available
Problem for disk scheduling algorithms


Examples

Linux
  Default: elevator algorithm (LOOK variation of SCAN)
  Can suffer from indefinite postponement
  Deadline and anticipatory scheduling (LOOK)
Deadline
  Two FIFO queues (read and write requests)
  References to requests, with deadlines
  Heads of queues close to deadline
  Reads: 500 ms, writes: 5 s
  Service requests that have expired, together with requests that almost expire
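The deadline idea can be sketched as follows. This is a toy model, not the actual Linux implementation: the queue layout, tuple format, and fallback policy are assumptions for illustration.

```python
from collections import deque

# Two FIFO queues of (arrival_ms, cylinder); each class has a deadline.
READ_DEADLINE_MS, WRITE_DEADLINE_MS = 500, 5000

def pick_next(read_q, write_q, now_ms):
    """Serve the head of a queue once its deadline has expired;
    otherwise prefer the oldest read, then the oldest write."""
    for q, limit in ((read_q, READ_DEADLINE_MS), (write_q, WRITE_DEADLINE_MS)):
        if q and now_ms - q[0][0] >= limit:
            return q.popleft()  # expired: service immediately
    # nothing expired: a real scheduler would pick by disk position;
    # here we simply fall back to FIFO order
    for q in (read_q, write_q):
        if q:
            return q.popleft()
    return None

reads = deque([(0, 10), (100, 90)])
writes = deque([(0, 50)])
print(pick_next(reads, writes, now_ms=600))  # (0, 10): read past its deadline
```

Because the queues are FIFO and deadlines are checked at the head, no request can be postponed indefinitely.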

Caching and Buffering

Disk cache buffer
  Cache for disk data
  Buffer for data
    To delay writing of data
Need replacement strategies
  Memory usage?
System failure with modified buffer?
  Write-back caching
  Write-through caching
Hard drives have their own buffer cache

Other disk performance techniques

Fragmentation/defragmentation
Place files that will be modified near free space
SCAN visits midrange cylinders more often - place often-referenced data there
Compression
When the disk is idle, position the head correctly

RAID

Redundant Array of Independent Disks
Use several disks to improve capacity, reliability, speed, or a combination
David A. Patterson, Garth A. Gibson and Randy H. Katz, "A Case for Redundant Arrays of Inexpensive Disks (RAID)", SIGMOD Conference 1988
Combine multiple drives into one logical unit
Hardware and/or software
Different RAID levels

Data striping and strips

Data is divided into strips
Strips on different disks form stripes
Strips: fixed-size blocks (bits, bytes, or blocks)


RAID 0 - Striped set

Not one of the original levels - no redundancy
Splits data evenly across two or more disks
Reliability decreases fast
Block size typically a multiple of the disk sector size
Each drive can seek independently
  Fast seek times
  Transfer speed: sum of drive speeds
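The even split is just modular arithmetic on block numbers. A sketch of the mapping from a logical block to a physical location (the strip-size parameter and round-robin layout are the usual conventions, assumed here):

```python
def raid0_map(block, n_disks, strip_blocks=1):
    """Map a logical block number to (disk, block-on-disk) under striping.
    strip_blocks is the strip size measured in blocks."""
    strip, offset = divmod(block, strip_blocks)
    disk = strip % n_disks                       # round-robin across drives
    return disk, (strip // n_disks) * strip_blocks + offset

# logical blocks 0..5 on a 3-disk array alternate across the drives
print([raid0_map(b, n_disks=3) for b in range(6)])
```

Consecutive blocks land on different drives, which is why sequential transfers approach the sum of the drive speeds.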


RAID 1 - Mirroring

Disk mirroring for redundancy
Each disk is duplicated
Reads can be served simultaneously by either disk of a pair
Writes one at a time (writes go to both disks)
Half the capacity
Multiple disk failures possible (as long as no pair loses both disks)
Data regeneration
  Can be done online
  Hot swapping


RAID 2 (Bit-level Hamming ECC Parity)

Striped at the bit level
Hamming error correcting codes (Hamming ECCs)
  Detect up to two errors, correct one
Parity disks: log(data disks) (10 data disks need 4, 25 need 5)
Writes require writing the parity - and thus reading the complete stripe
  Read-modify-write cycle
Read requests read the full stripe (compute parity)
  Sometimes ignored
Not used in practice


XOR Parity

One parity block for the set of blocks
Parity calculation: A1 XOR A2 XOR A3 = Ap
Recovery: A1 XOR A2 XOR Ap = A3
One data block can be recovered
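Both equations above can be checked with a few bytes standing in for blocks (the block contents are arbitrary example values):

```python
from functools import reduce

def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, t) for t in zip(*blocks))

a1, a2, a3 = b"\x0f\x55", b"\xf0\xaa", b"\x33\x33"
ap = xor_blocks(a1, a2, a3)           # parity: A1 XOR A2 XOR A3
assert xor_blocks(a1, a2, ap) == a3   # recovery: A1 XOR A2 XOR Ap = A3
```

Recovery works because XOR is its own inverse: XORing the parity with all surviving blocks cancels them out, leaving the lost block.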

RAID 3 (Bit/byte-level XOR ECC parity)

Bit- or byte-level striping
XOR ECC
One disk for parity
Can recover from a single disk failure
  Recovery expensive
Most reads access the full array
One write at a time
High transfer rate for a single file


RAID 4 (Block-level XOR ECC Parity)

Blocks instead of bits/bytes
Higher concurrency than level 3
Parity calculation easier
  Ap' = (Ad XOR Ad') XOR Ap
Single write at a time - parity must always be written
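The identity Ap' = (Ad XOR Ad') XOR Ap is what makes small writes cheaper: only the old data block and old parity need to be read, not the whole stripe. A sketch with byte strings standing in for blocks (example values are arbitrary):

```python
def update_parity(old_data, new_data, old_parity):
    """Ap' = (Ad XOR Ad') XOR Ap: recompute parity from the changed
    block alone, without reading the rest of the stripe."""
    return bytes(d ^ n ^ p for d, n, p in zip(old_data, new_data, old_parity))

# stripe of two data blocks: parity is their XOR
d1, d2 = b"\x0f", b"\xf0"
parity = bytes(a ^ b for a, b in zip(d1, d2))
new_d1 = b"\x55"
new_parity = update_parity(d1, new_d1, parity)
# matches a full recompute over the new stripe
assert new_parity == bytes(a ^ b for a, b in zip(new_d1, d2))
```

This is also the read-modify-write cycle mentioned for RAID 5: read old data, read old parity, write new data, write new parity — four operations per small write.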


RAID 5 (Block-level Distributed XOR ECC Parity)

Removes the parity-disk bottleneck of level 4
Parity blocks distributed among the array
Still requires a read-modify-write cycle for write requests (4 ops)
  Parity logging: parity difference stored in memory
  AFRAID: parity calculation is done when load is light
Maximum number of drives unlimited
  Common practice: 14 or fewer
Two drive failures (no recovery) have a high probability
  High probability that a second drive fails before the first failure is detected, replaced and rebuilt


RAID 6

Extends RAID 5 with one more parity block
  One XOR block and one Reed-Solomon block
Can handle two disk failures
Inefficient with a small number of disks

Traditional RAID 5      Typical RAID 6
A1  A2  A3  Ap          A1  A2  A3  Ap  Aq
B1  B2  Bp  B3          B1  B2  Bp  Bq  B3
C1  Cp  C2  C3          C1  Cp  Cq  C2  C3
Dp  D1  D2  D3          Dp  Dq  D1  D2  D3

RAID 7

Storage Computer Corporation
Adds caching to RAID 3, 4

RAID 0+1, 01

A mirror of stripes
RAID 1 above several RAID 0 arrays

                  RAID 1
     /--------------------------\
     |                          |
   RAID 0                     RAID 0
 /-----------------\      /-----------------\
 |      |      |          |      |      |
120GB  120GB  120GB      120GB  120GB  120GB
 A1     A2     A3         A1     A2     A3
 A4     A5     A6         A4     A5     A6
 B1     B2     B3         B1     B2     B3
 B4     B5     B6         B4     B5     B6

RAID 10

A stripe of mirrors
RAID 0 above several RAID 1 arrays
One drive from each RAID 1 set can fail
Fast write speeds

                    RAID 0
     /-----------------------------------\
     |                |                  |
   RAID 1           RAID 1             RAID 1
 /--------\       /--------\         /--------\
 |        |       |        |         |        |
120GB   120GB   120GB    120GB     120GB   120GB
 A1      A1      A2       A2        A3      A3
 A4      A4      A5       A5        A6      A6
 B1      B1      B2       B2        B3      B3
 B4      B4      B5       B5        B6      B6

RAID 50, 5+0

RAID 0 above several RAID 5 arrays
Higher performance than RAID 5

                          RAID 0
 /-------------------------------------------------\
 |                      |                          |
RAID 5                RAID 5                     RAID 5
120GB 120GB 120GB    120GB 120GB 120GB    120GB 120GB 120GB
 A1    A2    Ap       A3    A4    Ap       A5    A6    Ap
 B1    Bp    B2       B3    Bp    B4       B5    Bp    B6
 Cp    C1    C2       Cp    C3    C4       Cp    C5    C6
 D1    D2    Dp       D3    D4    Dp       D5    D6    Dp

RAID summary

Summary

Next: File systems

Sources

Course book
en.wikipedia.org