RAID Systems Ver.2.0 Jan 09, 2005 Syam


RAID Systems Ver.2.0

Jan 09, 2005 Syam


RAID Primer

Redundant Array of Inexpensive Disks
  random, real-time, redundant, array, assembly, interconnected, independent, inter-relation, devices, drives, etc.

Physical Drive: actual hard disks
Physical Array: one or more physical drives
Logical Array: formed by splitting or combining physical arrays
Logical Drive: one or more logical arrays


Drive Layout

[Figure: drive layout showing Physical Drive 0, Physical Array 0 and Physical Array 1, Logical Array 0, Logical Array 1 and Logical Array 2, and the Logical Drive]


Why RAID

CPU and Memory
Disk I/O
Disk Reliability
  MTBF (Mean Time Between Failures) of the array = MTBF of a single drive / # drives
Fault Tolerance
Combine multiple small, inexpensive disks into an array that yields performance exceeding that of a Single Large Expensive Disk


RAID Benefits

Higher Data Security
Fault Tolerance
Improved Availability
Increased, Integrated Capacity
Improved Performance


RAID Costs

Planning and Design
Hardware
Software
Setup and Training
Maintenance


RAID Tradeoffs


RAID Misconceptions

Blanket Statements
“Invulnerability Complex”
AID 0, not RAID 0


Hardware vs. Software RAID

Hardware RAID
  Hardware manages the RAID independently of the host and presents the host with a single drive per array
  Controller operates simultaneously with the system
  Highly fault tolerant

Software RAID
  Software manages the RAID
  Lives in host memory and consumes CPU cycles
  Array is functional only when the array software is loaded (if the array is down, how does the array software load?)


Hardware vs. Software RAID

                     Hardware RAID         Software RAID
Implementation       Dedicated hardware    Software kernel
Automatic Failover   Yes                   Yes
Hot Swap             Yes                   No
Cost                 Hundreds of dollars   None
CPU Impact           Negligible            Typically 5-15%


RAID Concepts

Mirroring – one method of data redundancy
  Data is written simultaneously to two hard disks
  100% redundancy protects against failure of any one of the disks

[Figure: the same block, "My DATA", stored on both disks]


RAID Concepts

Striping – disks used in parallel
  Each drive is partitioned into stripes, from one sector (512 bytes) to several MB
  The partition size is referred to as the Striping Unit
  Pieces of files are stored on multiple disks
  Files can be broken up into bytes or blocks

[Figure: Drive 0, Drive 1 and Drive 2, with one Striping Unit marked]
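
The striping idea above can be sketched as a simple address calculation. This is a minimal illustration, assuming a plain round-robin layout across the drives; the function name `locate_block` is hypothetical and not taken from any RAID product.

```python
def locate_block(logical_block: int, num_drives: int) -> tuple[int, int]:
    """Map a logical striping-unit index to (drive, row on that drive).

    Assumes striping units are dealt out round-robin:
    unit 0 -> drive 0, unit 1 -> drive 1, ..., then wrap to the next row.
    """
    if num_drives < 1 or logical_block < 0:
        raise ValueError("need at least one drive and a non-negative block index")
    drive = logical_block % num_drives   # which disk holds the unit
    row = logical_block // num_drives    # how far down that disk it sits
    return drive, row


# Six consecutive striping units spread over three drives.
layout = [locate_block(b, 3) for b in range(6)]
```

Consecutive units land on different drives, which is why a large sequential transfer can keep all drives busy at once.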


RAID Concepts

Parity – another method of data redundancy
  Take N pieces of data, calculate one more piece, and store the N+1 pieces on separate drives
  If any one of the N+1 pieces of data is lost, it can be recreated from the other N


RAID Concepts

Parity Example

D1 = 10100101
D2 = 11110000
D3 = 00111100
D4 = 10111001

Parity = D1 XOR D2 XOR D3 XOR D4
       = (((10100101 XOR 11110000) XOR 00111100) XOR 10111001)
       = 11010000

Now five pieces of data are stored on five separate disks.

Assume D3 becomes corrupt; it can be restored by:

D3 = D1 XOR D2 XOR D4 XOR Parity
   = (((10100101 XOR 11110000) XOR 10111001) XOR 11010000)
   = 00111100
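
The parity example can be checked with a few lines of Python. This is a small sketch using the slide's four byte values; the helper name `parity` is an illustrative choice.

```python
from functools import reduce
from operator import xor


def parity(blocks):
    """XOR all blocks together; with XOR parity this is also how a lost block is rebuilt."""
    return reduce(xor, blocks, 0)


# D1..D4 from the example above.
data = [0b10100101, 0b11110000, 0b00111100, 0b10111001]
p = parity(data)

# Pretend D3 is lost: XOR the three surviving data blocks with the parity block.
survivors = data[:2] + data[3:] + [p]
recovered_d3 = parity(survivors)

print(format(p, "08b"), format(recovered_d3, "08b"))
```

The same function serves for both computing parity and recovering a block because XOR is its own inverse.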


RAID Degraded Operation and Rebuilding

RAID operation in degraded state
  Two-drive mirror => performance equals that of a single drive
  Striped array with parity => lost information must be regenerated
Rebuilding
  Two-drive mirror => copy the entire good drive to the replacement drive
  Striped array with parity => must regenerate the lost data and parity information onto the replacement drive


RAID Degraded Operation and Rebuilding

RAID can continue to operate during a rebuild
Hardware RAID rebuilds faster than software RAID
Automatic rebuild
  Controller detects the failed drive
  Automatically rebuilds on replacement
Manual rebuild
  Administrator initiates the rebuild
  Can be run in off-peak time


RAID Reliability

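Component Reliability
System Reliability
  Function of the reliability of the components

\[ \mathrm{MTBF} = \frac{1}{\frac{1}{\mathrm{MTBF}_1} + \frac{1}{\mathrm{MTBF}_2} + \cdots + \frac{1}{\mathrm{MTBF}_N}} \]

When \(\mathrm{MTBF}_1 = \mathrm{MTBF}_2 = \cdots = \mathrm{MTBF}_N\):

\[ \mathrm{MTBF} = \frac{\mathrm{MTBF}_{\mathrm{Component}}}{N} \]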


RAID Reliability

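Example
  RAID with 4 drives, each with an MTBF of 500,000 hours, plus one further component with an MTBF of 300,000 hours

\[ \mathrm{MTBF} = \frac{1}{\frac{1}{\mathrm{MTBF}_1} + \frac{1}{\mathrm{MTBF}_2} + \frac{1}{\mathrm{MTBF}_3} + \frac{1}{\mathrm{MTBF}_4} + \frac{1}{300{,}000}} = \frac{1}{\frac{1}{500{,}000} + \frac{1}{500{,}000} + \frac{1}{500{,}000} + \frac{1}{500{,}000} + \frac{1}{300{,}000}} \]

\[ \mathrm{MTBF} \approx 88{,}235 \text{ hours} \]

Reliability decreased from 500,000 to 88,235 hours => a decrease of about 82%
“RAID reliability” refers to RAID with fault tolerance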
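
The reliability arithmetic can be reproduced with a short calculation. This is a sketch of the reciprocal-sum formula only; the function name `system_mtbf` is hypothetical, and the fifth value (300,000 hours) is the extra component that appears in the slide's equation.

```python
def system_mtbf(component_mtbfs):
    """MTBF of a system that fails when any one component fails.

    Failure rates (1 / MTBF) add up, so the system MTBF is the
    reciprocal of the summed reciprocals.
    """
    return 1.0 / sum(1.0 / m for m in component_mtbfs)


# Four drives at 500,000 hours plus one component at 300,000 hours.
example = system_mtbf([500_000] * 4 + [300_000])
decrease = 1 - example / 500_000

print(round(example), f"{decrease:.0%}")
```

With identical components the function reduces to MTBF / N, for example 125,000 hours for the four drives alone.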


RAID Fault Tolerance

Ability of a RAID system to withstand the loss of some hardware without loss of data or availability
When a fault occurs, the array enters a degraded state
  Drive must be replaced
  Array must be rebuilt


RAID Availability

Ability for users to access data
Depends on:
  Hardware Reliability
  Fault Tolerance
  Hot Swapping
  Automatic Rebuilding
  Service


RAID Backups

Most RAID levels use striping
Possible threats to data integrity
  Unexpected Hard Disk Failure
  Failures of Support Hardware
  Physical Damage
  Software Problems
  Viruses
  Human Error


RAID Levels

1988 paper defined RAID levels 1-5
Now single RAID levels 0-7
Multiple RAID levels


JBOD – Just a Bunch Of Disks

Spanning multiple physical drives into one logical drive
No Fault Tolerance
Not a RAID
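
Spanning can be pictured as simple concatenation: the logical drive fills the first disk, then the next. The sketch below assumes exactly that layout; `locate_in_span` and the drive sizes are made up for illustration.

```python
def locate_in_span(logical_block: int, drive_sizes: list[int]) -> tuple[int, int]:
    """Map a logical block to (drive index, block on that drive) in a spanned set.

    Drives are assumed to be concatenated in order, each contributing
    its full size (in blocks) to the logical drive.
    """
    if logical_block < 0:
        raise ValueError("block index must be non-negative")
    for drive, size in enumerate(drive_sizes):
        if logical_block < size:
            return drive, logical_block
        logical_block -= size  # skip past this drive
    raise ValueError("block index beyond the end of the logical drive")


sizes = [100, 50, 200]  # hypothetical drive sizes in blocks
```

Unlike striping, neighbouring blocks usually sit on the same disk, so spanning adds capacity but no parallelism, and losing any disk loses the blocks stored on it.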


JBOD – Just a Bunch Of Disks


RAID 0 - Striping

Disk Striping
No parity
Example: write an essay with 3 hands instead of 1
Increased I/O
Not a valid RAID implementation due to lack of fault tolerance


RAID 0 - Striping


RAID 0 - Striping

Supported by all hardware controllers
Supported by most software
Minimum of two hard disks
Array Capacity = Smallest Drive Size * # of Drives
Storage Efficiency = 100% of drives
Fault Tolerance: None
Availability: Lowest of all RAID levels
  Failure results in array down until rebuild and restore
Degradation and Rebuilding: Not Applicable


RAID 0 - Striping

Random Read Performance: Very Good, increases with larger stripe size
Random Write Performance: Very Good, increases with larger stripe size
Sequential Read Performance: Very Good to Great
Sequential Write Performance: Very Good
Cost: Lowest of all RAID levels
Special Considerations: Daily Backups
Uses: Non-critical data, hobbyist, high-end gaming


RAID 1 - Mirroring

100% Data Redundancy
No I/O Speed Increase
When a drive fails, the other operates until the drive is replaced


RAID 1 - Mirroring


RAID 1 - Mirroring

Supported by all hardware controllers
Supported by most software
Exactly two hard disks
Array Capacity = Smallest Drive Size
Storage Efficiency = 50% of drives
Fault Tolerance: Very Good
Availability: Very Good
  Most allow hot spare and automatic rebuilding
Degradation and Rebuilding: Slight degradation of read, write improves
  Rebuilding is relatively fast


RAID 1 - Mirroring

Random Read Performance: Good
Random Write Performance: Good
Sequential Read Performance: Fair
Sequential Write Performance: Good
Cost: Relatively High
Special Considerations: Size Limitation
Uses: High fault tolerance without high capacity, small databases, accounting and financial data


RAID 2 – Memory-Style ECC

Introduces Parity
Same principle as ECC memory
Bit-level with Hamming Code
Not used today
  Cost, Complexity


RAID 2 - Memory-Style ECC


RAID 2 - Memory-Style ECC

Special Hardware Controller Required
Typically 10 Data Disks and 4 ECC Disks
Array Capacity = 10 * Data Disk Size
Storage Efficiency = 71% of drives
Fault Tolerance: Fair
Availability: Very Good
  “On the fly” error correction
Degradation and Rebuilding: In theory little degradation


RAID 2 - Memory-Style ECC

Random Read Performance: Fair
Random Write Performance: Poor
Sequential Read Performance: Very Good
Sequential Write Performance: Fair to Good
Cost: Very Expensive
Special Considerations: Not in modern systems
Uses: Not used in modern systems


RAID 3 – Bit-Interleaved Parity

Byte-level striping with dedicated parity disk
Read requests hit all data disks
Write requests hit all data disks and the parity disk
Great for high bandwidth but not high I/O rates
Parity disk can be a bottleneck


RAID 3 - Bit-Interleaved Parity


RAID 3 - Bit-Interleaved Parity

Medium to high-end hardware controller required
Minimum of three hard disks
Array Capacity = Smallest Drive Size * (# Drives - 1)
Storage Efficiency = (# Drives - 1) / # Drives
Fault Tolerance: Good
Availability: Very Good
  Hot swapping and automatic rebuild
Degradation and Rebuilding: Relatively little degradation; rebuilds can take many hours


RAID 3 - Bit-Interleaved Parity

Random Read Performance: Good
Random Write Performance: Poor
Sequential Read Performance: Very Good
Sequential Write Performance: Fair to Good
Cost: Moderate
Special Considerations: Not as popular as other levels
Uses: Large files with high transfer performance, multimedia, publishing


RAID 4 – Block-Interleaved Parity

Block-level striping with dedicated parity disk

Write requests use read-modify-write
  e.g. with four disks (3 data, 1 parity), a small write request costs 4 disk I/Os:
    write the new data to disk 0 (1 I/O)
    read the old data from disk 1 and disk 2 (2 I/Os)
    write the parity information (1 I/O)
Parity disk can become a bottleneck
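
The small-write sequence above can be sketched with XOR on toy values. The block values here are invented for illustration. The second calculation, which updates parity from the old data and old parity instead of reading the other data disks, is a commonly described alternative and is included as an assumption, not as something the slide states. On a 3+1 array both routes cost four I/Os.

```python
# Toy 4-bit blocks on a 3 data + 1 parity array (values are made up).
d0, d1, d2 = 0b1010, 0b0110, 0b0011
old_parity = d0 ^ d1 ^ d2

new_d0 = 0b1111  # the small write replaces the block on disk 0

# Route described on the slide: read disk 1 and disk 2, recompute parity.
recomputed_parity = new_d0 ^ d1 ^ d2


def parity_by_delta(old_parity: int, old_data: int, new_data: int) -> int:
    """Alternative (assumed) route: read old data and old parity, XOR in the change."""
    return old_parity ^ old_data ^ new_data


delta_parity = parity_by_delta(old_parity, d0, new_d0)
```

Either way the single parity disk is written on every small write, which is the bottleneck the slide points out.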


RAID 4 – Block-Interleaved Parity


RAID 4 – Block-Interleaved Parity

Medium to high-end hardware controller required
Minimum of three hard disks
Array Capacity = Smallest Drive Size * (# Drives - 1)
Storage Efficiency = (# Drives - 1) / # Drives
Fault Tolerance: Good
Availability: Very Good
  Hot swapping and automatic rebuild
Degradation and Rebuilding: Moderate if a drive fails, and potentially lengthy rebuild


RAID 4 – Block-Interleaved Parity

Random Read Performance: Very Good
Random Write Performance: Poor to Fair
Sequential Read Performance: Good to Very Good
Sequential Write Performance: Fair to Good
Cost: Moderate
Special Considerations: Performance depends on stripe size
Uses: Not as common as level 3 or 5
  Large files with high transfer performance


RAID 5 – Block-Interleaved Distributed Parity

One of the most popular levels
Eliminates the parity disk bottleneck by distributing parity information across the array
More efficient with small read and large write requests
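
Distributing parity can be pictured as rotating the parity block to a different drive on each stripe. The sketch below assumes one simple rotation (parity starts on the last drive and moves left); real controllers use several different layouts, and the function names are hypothetical.

```python
def parity_drive(stripe: int, num_drives: int) -> int:
    """Drive holding the parity block for a given stripe (assumed rotation)."""
    return (num_drives - 1 - stripe) % num_drives


def stripe_layout(stripe: int, num_drives: int) -> list[str]:
    """Label each drive's block in a stripe as 'P' or a running data index 'Dn'."""
    p = parity_drive(stripe, num_drives)
    row, next_data = [], stripe * (num_drives - 1)
    for drive in range(num_drives):
        if drive == p:
            row.append("P")
        else:
            row.append(f"D{next_data}")
            next_data += 1
    return row


for s in range(4):
    print(stripe_layout(s, 4))
```

Over four stripes on four drives each drive holds parity exactly once, so parity writes no longer queue up on a single disk.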


RAID 5 – Block-Interleaved Distributed Parity


RAID 5 – Block-Interleaved Distributed Parity

Moderately high-end hardware controller required
Supported by some software solutions
Minimum of three hard disks
Array Capacity = Smallest Drive Size * (# Drives - 1)
Storage Efficiency = (# Drives - 1) / # Drives
Fault Tolerance: Good
Availability: Good to Very Good
  Hot swapping and automatic rebuild
Degradation and Rebuilding: Can be substantial due to distributed parity


RAID 5 – Block-Interleaved Distributed Parity

Random Read Performance: Very Good to Great
Random Write Performance: Fair
Sequential Read Performance: Good to Very Good
Sequential Write Performance: Fair to Good
Cost: Moderate
Special Considerations: Software RAID can greatly affect performance due to parity calculations
Uses: Seen as the middle of the RAID tradeoff triangle


RAID 6 – P+Q Redundancy

Block-level striping with dual distributed parity
Adds 2D parity information
Can handle multiple disk failures
High data fault tolerance for mission critical applications


RAID 6 – P+Q Redundancy


RAID 6 – P+Q Redundancy

Special Hardware Controller Required
Typically Minimum of 4 Hard Disks
Array Capacity = Smallest Drive Size * (# Drives - 2)
Storage Efficiency = (# Drives - 2) / # Drives
Fault Tolerance: Very Good to Great
Availability: Great
Degradation and Rebuilding: Can be substantial due to dual distributed parity


RAID 6 – P+Q Redundancy

Random Read Performance: Very Good to Great
Random Write Performance: Poor
Sequential Read Performance: Good to Very Good
Sequential Write Performance: Fair
Cost: High
Special Considerations: Tends to be used in proprietary systems
Uses: Where RAID 5 would fit but more fault tolerance is needed


Multiple RAID Levels

Combine two single levels to obtain improved performance
Most common levels: 01 and 10
RAID level X+Y ≠ Y+X
  Usually not much impact on capacity
  More impact on fault tolerance


Multiple RAID Levels

RAID 01 vs 10

RAID 01
  Stripe Drives 1,2: RAID 0 for Stripe A
  Stripe Drives 3,4: RAID 0 for Stripe B
  Mirror the two sets; if Drive 2 fails, Stripe A is lost

RAID 10
  Mirror Drives 1,2: RAID 1 for Mirror A
  Mirror Drives 3,4: RAID 1 for Mirror B
  Stripe across A and B; if Drive 2 fails, Drive 1 still maintains the stripe
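
The difference in fault tolerance between RAID 01 and RAID 10 can be checked by enumerating drive failures. This is a small model of the four-drive example above, assuming drives 1,2 form the first group and drives 3,4 the second; the function names are illustrative.

```python
from itertools import combinations

DRIVES = (1, 2, 3, 4)
GROUPS = ((1, 2), (3, 4))  # Stripe A / Stripe B for RAID 01, Mirror A / Mirror B for RAID 10


def raid01_survives(failed: set[int]) -> bool:
    """Mirrored stripe sets: at least one stripe set must be fully intact."""
    return any(all(d not in failed for d in group) for group in GROUPS)


def raid10_survives(failed: set[int]) -> bool:
    """Striped mirrors: every mirror pair needs at least one working drive."""
    return all(any(d not in failed for d in group) for group in GROUPS)


pairs = list(combinations(DRIVES, 2))
raid01_ok = sum(raid01_survives(set(p)) for p in pairs)
raid10_ok = sum(raid10_survives(set(p)) for p in pairs)

print(f"two-drive failures survived: RAID 01 {raid01_ok}/6, RAID 10 {raid10_ok}/6")
```

Both layouts survive any single failure, but in this model RAID 10 survives four of the six possible two-drive failures while RAID 01 survives only two.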


RAID 10 – Mirrored Stripe

Mirroring and striping without parity
Most common of the multiple levels
Large arrays with high performance and high fault tolerance


RAID 10 - Mirrored Stripe


RAID 10 - Mirrored Stripe

Most Hardware Controllers Support
Even Number with Minimum of 4 Hard Disks
Array Capacity = Smallest Drive Size * # Drives / 2
Storage Efficiency = 50%
Fault Tolerance: Very Good to Great
Availability: Great
Degradation and Rebuilding: Relatively Little


RAID 10 - Mirrored Stripe

Random Read Performance: Very Good to Great
Random Write Performance: Good to Very Good
Sequential Read Performance: Very Good to Great
Sequential Write Performance: Good to Very Good
Cost: High
Special Considerations: Low storage efficiency
Uses: High performance and reliability, enterprise servers


RAID Comparison

S = smallest drive size, N = number of drives

RAID Level  Number of Disks  Capacity       Storage Efficiency  Cost
0           2,3,4,...        S*N            100%                $
1           2                S*N/2          50%                 $$
2           many             varies, large  ~70-80%             $$$$$
3           3,4,5,...        S*(N-1)        (N-1)/N             $$
4           3,4,5,...        S*(N-1)        (N-1)/N             $$
5           3,4,5,...        S*(N-1)        (N-1)/N             $$
6           4,5,6,...        S*(N-2)        (N-2)/N             $$$
01/10       4,6,8,...        S*N/2          50%                 $$$
05-50       6,8,9,10,...     S*N0*(N5-1)    (N5-1)/N5           $$$$

[Remaining columns: Fault Tolerance (RAID 0: none), Availability, Random Read Perf, Random Write Perf, Sequential Read Perf, Sequential Write Perf]
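
The capacity and efficiency formulas in the comparison can be turned into a small calculator. This sketch covers only levels 0, 1, 3, 4, 5, 6 and 01/10 with the drive counts given on the slides; the function names and the string level codes are illustrative choices.

```python
def usable_capacity(level: str, drive_sizes: list[int]) -> int:
    """Usable capacity from the smallest drive size S and drive count N."""
    s, n = min(drive_sizes), len(drive_sizes)
    if level == "0" and n >= 2:
        return s * n                  # striping, no redundancy
    if level == "1" and n == 2:
        return s                      # S*N/2 with exactly two drives
    if level in ("3", "4", "5") and n >= 3:
        return s * (n - 1)            # one drive's worth of parity
    if level == "6" and n >= 4:
        return s * (n - 2)            # two drives' worth of parity
    if level in ("01", "10") and n >= 4 and n % 2 == 0:
        return s * n // 2             # half the drives hold mirror copies
    raise ValueError(f"unsupported level {level!r} for {n} drives")


def storage_efficiency(level: str, drive_sizes: list[int]) -> float:
    """Usable capacity as a fraction of S*N."""
    return usable_capacity(level, drive_sizes) / (min(drive_sizes) * len(drive_sizes))


# Mixed drive sizes: every drive only contributes the smallest size.
print(usable_capacity("5", [500, 500, 400, 500]))
```

Using the smallest drive size for every member is why mixing drive sizes wastes space on the larger drives.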