24
Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid- state Drives Junghee Lee, Youngjae Kim, Sarp Oral, Galen M. Shipman, David A. Dillow, and Feiyi Wang Presented by Junghee Lee

Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

  • Upload
    trula

  • View
    57

  • Download
    0

Embed Size (px)

DESCRIPTION

Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives. Junghee Lee, Youngjae Kim, Sarp Oral, Galen M. Shipman, David A. Dillow , and Feiyi Wang Presented by Junghee Lee. High Performance Storage Systems. Server centric services - PowerPoint PPT Presentation

Citation preview

Page 1: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

Comparing Coordinated Garbage Collection

Algorithms for Arrays of Solid-state Drives

Junghee Lee, Youngjae Kim, Sarp Oral, Galen M. Shipman, David A. Dillow, and Feiyi Wang

Presented by Junghee Lee

Page 2: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

2

High Performance Storage Systems

• Server centric services– File, web & media servers, transaction processing servers

• Enterprise-scale Storage Systems– Information technology focusing on storage, protection, retrieval of

data in large-scale environments

Google's massive server farms High PerformanceStorage Systems

Storage UnitHard Disk Drive

Page 3: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

3

Emergence of NAND Flash based SSD• NAND Flash vs. Hard Disk Drives

– Pros:• Semi-conductor technology, no

mechanical parts• Offer lower access latencies

– μs for SSDs vs. ms for HDDs• Lower power consumption• Higher robustness to vibrations and

temperature– Cons:

• Limited lifetime– 10K - 1M erases per block

• High cost– About 8X more expensive than current

hard disks• Performance variability

MacBook Air

Failure

Read

Small

Write

Page 4: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

4

Intel X25-E 64GB SSD

Fusion io 640GBMLC PCIe DUO ioDrive

SSD-based Object Storage Target (OST)

• SSD-based OSTs– PCI Express SSDs

• Fusion IO ioDrive, Virident tachIOn, OCZ RevoDrive, etc

– SATA SSDs• Intel, SuperTalent, Samsung, etc

• PCIe SSDs versus RAID of SATA SSDs

PCIE SSD (Fusion IO)

Array of SATA SSDs

Performance Cost

High

High Relatively Cheap

SSD Type

Expensive

$13,990/640GB

$799/64GB

~1.3GB/s

~280MB/s

Page 5: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

5

Efficiency Analysis of SSD RAID

• RAID of SSDs– Configured 6 SSDs in RAID-0 using Mega RAID controller– Mega RAID controller is only able to connect up to 6 SSDs.

• Cost efficiency analysis– Metric (GB per $ and MB/s per $)– Compared RAID-0 of 6 x SATA SSDs versus 1 x PCIE SSD

• SSDs used

MLC SSDSLC SSD

Specification Size (GB)

Super-Talent MLC SATA II SSDIntel SLC SATA II SSD 64

SSD Type120

PCIe SSD Fusion-io ioDrive Duo MLC PCIe x8 SSD 640

Price ($)

799415

13,990

Page 6: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

6

Capacity Efficiency Analysis

• Total cost – N (RAID controller) x $ (RAID controller) + N (SSD) x $ (SSD)– We used $579 for PCIE LSI Mega RAID controller card.

Series10

1

2

3

4

5

6

GB per dollar

6 MLC SSDs (RAID-0) 6 SLC SSDs (RAID-0) 1 PCIe SSD

Device Type

Norm

aliz

ed E

ffien

cy

Page 7: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

7

Performance Efficiency Analysis

• Total cost – N (RAID controller) x $ (RAID controller) + N (SSD) x $ (SSD)– We used $579 for PCIE LSI Mega RAID controller card.

0 20 40 60 80 1000

1

2

3

4

5

6

MB/s per dollar

6 MLC SSDs (RAID-0) 6 SLC SSDs (RAID-0) 1 PCIe SSD

Read (%)

Nor

mal

ized

Effi

ency

Page 8: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

8

Lessons Learned

• From the cost-efficiency analysis, we learned:– RAID of SSDs is more cost-efficient than PCIE SSD in terms of capacity

per dollar and bandwidth per dollar. – In particular, MLC based SSDs in RAID is more cost-efficient than SLC

based SSDs.

• Then what are problems and challenges in SSD RAID?– Does SSD RAID offer sustainable bandwidth?– If not, why not? Any solution?

Page 9: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

9

RAID of SSDs

• Problems – Overall bandwidth of RAID of SSDs is dependent

on the slowest SSD.– During garbage collector (GC) is working,

incoming requests cannot be serviced.– GC process of each SSD in RAID of SSDs is

is not globally coordinated.• Challenges

– There is no functional support for coordinating individual GC processes at the conventional RAID controller.

– We need to develop a mechanism for RAID controller to be able to coordinate individual GC processes in RAID of SSDs.

• Idea and Solution– Harmonia! – A Coordinated Garbage Collector for RAID of SSDs

Page 10: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

10

Uncoordinated Garbage Collectors

Time

Local GC process Aggregate degraded performance

Page 11: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

11

Bandwidth Drop for Write-Dominant Workloads• Experiments

– Measured bandwidth for 1.87 MB by varying read-write ratio (qd=64)

RAID-0 of MLC SSDs RAID-0 of SLC SSDs

Performance variability increases as we increase write-percentage of workloads.

Page 12: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

12

Performance Variability

• Per-Drive Bandwidth (MB/s per drive)

1 SSD 4 SSDs 6 SSDs0

50

100

150

200

250

300

RAID of SSDs (M) Linear (RAID of SSDs (M))RAID of SSDs (S) Linear (RAID of SSDs (S))

Per-

Dri

ve B

andw

idth

(MB

/s)

Page 13: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

13

A Globally Coordinated Garbage Collector

Time

Local GC process Aggregate degraded performance

Page 14: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

14

Inclusive Coordination

Threshold

Free blocks Free blocks Free blocks Free blocks

Request

Force GC

Page 15: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

15

Selective Coordination

Soft

Free blocks Free blocks Free blocks Free blocks

RegisterForce GC

Hard

Request

Register

Page 16: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

16

Proactive Method

Soft

Free blocks Free blocks Free blocks Free blocks

Force GC

Hard

Idle Detected

Page 17: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

17

Design of Harmonia

• SSD optimized RAID controller (O-RAID)– A RAID controller designed to enable global coordination of

garbage collection when used with SSDs supporting that capability.

• Global GC optimized SSD (O-SSD)– An SSD designed for participating in a globally coordinated

garbage collection process in an O-RAID. • Extension of storage protocols

– Extension of storage protocols such as SATA and SCSI for controlling the additional capabilities of O-SSD device

Page 18: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

18

Experimental Setup

• Simulator– MSR’s SSD simulator based on DiskSim– Configured RAID-0 of 8 32GB SSDs using 4KB Stripe unit size

• Workloads– HPC-like Synthetic workloads

• Used the synthetic workload generator in DiskSim• HPC (W): 80% Writes, HPC (R): 80% Reads

– Enterprise-scale Realistic workloads

Workloads Average request size (KB)

Read ratio (%)

Arrival rate

(IOP/s)Financial 7.09 18.92 47.19

Cello 7.06 19.63 74.24TPC-H 31.62 91.80 172.73

OpenMail 9.49 63.30 846.62

Write dominant

Read dominant

Page 19: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

19

Results for HPC-like Synthetic Workloads

HPC (W) HPC (R)0

0.2

0.4

0.6

0.8

1

1.2

BaselineGGC

Norm

aliz

ed A

vera

ge

Resp

onse

Tim

e

HPC (W) HPC (R)0

0.20.40.60.8

11.21.41.6

BaselineGGC

Stan

dard

Dev

iatio

nResponse time improvements are 69% and 55% for HPC(W) and HPC(R) workloads respectively.

Significant improvement on standard deviations by GGC

69%55%

Page 20: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

20

Results for Realistic Workloads

Openm

ail

TPC-H

Financ

ialCell

o0

0.2

0.4

0.6

0.8

1

1.2

BaselineGGC

Norm

aliz

ed A

vera

ge R

espo

nse

Tim

e

Openm

ail

TPC-H

Financ

ialCell

o0

0.050.1

0.150.2

0.250.3

0.350.4

0.45

BaselineGGC

Stan

dard

Dev

iatio

nPerformance improvement is about 10%.

Standard deviation significantly improves by GGC.

Page 21: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

21

Comparison of Coordination Algorithms

Evenly dis-trib-uted

Skewed Bursty0

0.2

0.4

0.6

0.8

1

1.2

Number of Erase

InclusiveSelectiveProactive

Evenly dis-trib-uted

Skewed Bursty0

0.2

0.4

0.6

0.8

1

1.2

Average Response Time

InclusiveSelectiveProactive

Workload characteristics Workload characteristics

Page 22: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

22

Conclusions

• Empirical experiments using real SSDs– We showed that RAIDs of SSDs exhibit high performance variability due to

uncoordinated GC processes. • Harmonia: A coordinated garbage collector

– We proposed Harmonia, a global garbage collector, that coordinates the local GC process of the individual SSDs.

• Comparison of coordination algorithms– Selective method reduces the number of erase compared to the inclusive

method when the workload is skewed– Proactive method enhances the response time when the workload is bursty

Page 23: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

23

Thank you!

Page 24: Comparing Coordinated Garbage Collection Algorithms for Arrays of Solid-state Drives

24

Questions?

Contact info

Junghee [email protected] and Computer EngineeringGeorgia Institute of Technology